Setting Up the IML Program |
After you formulate the GA optimization problem as described in the previous section, executing the genetic algorithm in IML is simple and straightforward. Remember that the current GA implementation in IML is experimental, and will be furthur developed and tested in later SAS releases. The following table summarizes the IML GA modules used to set each of the optimization parameters. IML will use reasonable default values for some of the parameters if they are not specified by the GA calls, and these default values are also listed. Parameters shown in italics are not required in all cases.
Type |
Set By |
Parameter |
Value |
---|---|---|---|
encoding |
GASETUP |
encoding |
general |
fixed-length real |
|||
fixed-length integer |
|||
fixed-length sequence |
|||
size |
fixed-length size |
||
seed |
initial random seed |
||
objective |
GASETOBJ |
id |
returned from GASETUP |
objtype |
minimize user module |
||
maximize user module |
|||
traveling salesman problem |
|||
parm |
if objtype=0 or 1, user module |
||
if objtype=2, cost coefficients |
|||
selection |
GASETSEL |
id |
returned from GASETUP |
elite |
integer in [0, population size] |
||
type |
conventional tournament |
||
dual tournament with BPW prob |
|||
parm |
if type = 0, tournament size |
||
if type = 1, real number in [0.5,1] |
|||
default if |
elite |
1 |
|
not set |
type |
conventional tournament |
|
parm |
2 |
||
crossover |
GASETCRO |
id |
returned from GASETUP |
crossprob |
crossover probability |
||
type |
user module |
||
simple |
|||
two-point |
|||
arithmetic |
|||
heuristic |
|||
pmatch |
|||
cycle |
|||
order |
|||
parm |
module name for type = 0 |
||
val if encoding=1, 0<type<3 |
|||
default if |
crossprob |
1.0 |
|
not set |
type |
heuristic if encoding=1 |
|
simple if encoding=2 |
|||
pmatch if encoding=3, objtype 0 |
|||
order if objtype=2 (TSP) |
|||
mutation |
GASETMUT |
id |
returned from GASETUP |
mutprob |
mutation probability |
||
type |
user module |
||
uniform |
|||
delta |
|||
swap |
|||
invert |
|||
parm |
delta value if type=2 |
||
number of swaps if type=3 |
|||
default if |
mutprob |
0.05 |
|
not set |
type |
uniform if encoding=1 or 2, bounded |
|
delta if encoding=1 or 2, no bounds |
|||
swap if encoding=3, not TSP |
|||
invert if objtype=1 (TSP) |
After setting the optimization parameters, you are ready to execute the GA. First, an initial solution population must be generated with a GAINIT call. GAINIT implements the initialization phase of the GA, generating an initial population of solutions and evaluating the objective value of each member solution. In the GAINIT call you specify the population size and any constant bounds on the solution domain. Next comes an IML loop that contains a GAREGEN call. GAREGEN implements the regeneration phase of the GA, which generates a new solution population based on selection, crossover, and mutation of the current solution population, then replaces the current population with the new population and computes the new objective function values.
After the GAREGEN call, you can monitor the convergence of the GA by retrieving the objective function values for the current population with the GAGETVAL call. You might check the average value of the objective population, or check only the best value. If the elite parameter is 1 or more, then it is easy to check the best member of the population, since it will always be the first member retrieved.
After your stopping criteria have been reached, you can retrieve the members of the solution population with the GAGETMEM call. To end the optimization, you should always use the GAEND call to free up memory resources allocated to the GA.
Below are some example programs to illustrate setting up and executing a genetic algorithm. The first example illustrates a simple program, a 10-city TSP using all IML defaults. The cost coefficients correspond to the cities being laid out on a two-by-five grid. The optimal route has a total distance of 10.
proc iml; /* cost coefficients for TSP problem */ coeffs = { 0 1 2 3 4 5 4 3 2 1, 1 0 1 2 3 4 5 4 3 2, 2 1 0 1 2 3 4 5 4 3, 3 2 1 0 1 2 3 4 5 4, 4 3 2 1 0 1 2 3 4 5, 5 4 3 2 1 0 1 2 3 4, 4 5 4 3 2 1 0 1 2 3, 3 4 5 4 3 2 1 0 1 2, 2 3 4 5 4 3 2 1 0 1, 1 2 3 4 5 4 3 2 1 0 }; /* problem setup */ id = gasetup(3, /* 3 -> integer sequence encoding */ 10, /* number of locations */ 1234 /* initial seed */ ); /* set objective function */ call gasetobj(id, 2, /* 2 -> Traveling Salesman Problem */ coeffs /* cost coefficient matrix */ ); /* initialization phase */ call gainit(id, 100 /* initial population size */ ); /* execute regeneration loop */ niter = 20; /* number of iterations */ bestValue = j(niter,1); /* to store results */ call gagetval(value, id, 1); /* gets first value */ bestValue[1] = value; do i = 2 to niter; call garegen(id); call gagetval(value, id, 1); bestValue[i] = value; end; /* print solution history */ print (t(1:niter))[l = "iteration"] bestValue; /* print final solution */ call gagetmem(bestMember, value, id, 1); print "best member " bestMember [f = 3.0 l = ""],, "final best value " value [l = ""]; call gaend(id);
For this test case, there is no call to GASETSEL. Therefore IML will use default selection parameters: an elite value of 1 and a conventional tournament of size 2. Also, since there is no GASETCRO or GASETMUT call, IML will use default genetic operators: the order operator for crossover and the invert operator for mutation, and a default mutation probability of 0.05. The output results are
iteration BESTVALUE 1 18 2 18 3 16 4 14 5 14 6 14 7 12 8 12 9 12 10 12 11 12 12 12 13 12 14 12 15 12 16 12 17 12 18 12 19 10 20 10 best member 10 1 2 3 4 5 6 7 8 9 final best value 10
The optimal value was reached after 19 iterations. Because the elite value was 1, the best solution was retained and passed on to each successive generation, and therefore never lost. Note that out of 3,628,800 possible solutions (representing 362,800 unique paths), the GA found the optimum after only 1,900 function evaluations, without using any problem-specific information to assist the optimization. You could also do some experimentation, and specify different genetic operators with a GASETCRO and GASETMUT call, and different selection parameters with a GASETSEL call:
/* alternate problem setup */ id = gasetup(3, /* 3 -> integer sequence encoding */ 10, /* number of locations */ 1234 /* initial seed */ ); /* set objective function */ call gasetobj(id, 2, /* 2 -> Traveling Salesman Problem */ coeffs /* cost coefficient matrix */ ); call gasetcro(id, 1.0, /* crossover probabilty 1 */ 5 /* 5 -> pmatch operator */ ); call gasetmut(id, 0.05, /* mutation probability */ 3 /* 3 -> swap operator */ ); call gasetsel(id, 3, /* set elite to 3 */ 1, /* dual tournament */ 0.95 /* best-player-wins probability 0.95 */ ); /* initialization phase */ call gainit(id, 100 /* initial population size */ ); /* execute regeneration loop */ niter = 15; /* number of iterations */ bestValue = j(niter,1); /* to store results */ call gagetval(value, id, 1); /* gets first value */ bestValue[1] = value; do i = 2 to niter; call garegen(id); call gagetval(value, id, 1); bestValue[i] = value; end; /* print solution history */ print (t(1:niter))[l = "iteration"] bestValue; /* print final solution */ call gagetmem(bestMember, value, id, 1); print "best member " bestMember [f = 3.0 l = ""],, "final best value " value [l = ""]; call gaend(id);
The output of this test case is
iteration BESTVALUE 1 24 2 18 3 18 4 16 5 16 6 14 7 14 8 14 9 14 10 12 11 12 12 12 13 10 14 10 15 10 best member 3 4 5 6 7 8 9 10 1 2 final best value 10
Note that convergence was faster than for the previous case, reaching an optimum after 13 iterations. This illustrates that the convergence of a GA may be very sensitive to the choice of genetic operators and selection parameters, and for practical problems some experimental fine-tuning of optimization parameters may be required to obtain acceptable convergence.