The %MktEx autocall macro creates efficient factorial designs. It can handle simple problems like main-effects designs and more complicated problems, including designs that have interactions and restrictions on which levels can appear together. For most simple problems, you need to specify only the levels of all the factors and the number of runs. For more complicated problems, you might also need to specify the interactions that you want to estimate or any restrictions that you want to impose on the design.
The macro uses a variety of methods and works iteratively as it attempts to optimize the D-efficiency of the design. As D-efficiency increases, the standard errors of the parameter estimates in the linear model decrease. A perfect design is orthogonal and balanced and has 100% D-efficiency. A design is orthogonal when all the parameter estimates are uncorrelated. A design is balanced when all the levels within each of the factors occur equally often. A design is orthogonal and balanced when the variance matrix, which is proportional to $(\mathbf{X}'\mathbf{X})^{-1}$, is diagonal, where $\mathbf{X}$ is a suitable orthogonal coding of the design matrix.
By default, the %MktEx macro creates the following output data sets to contain the final design:
OUT=Design, the experimental design, sorted by the factor levels
OUTR=Randomized, the randomized experimental design
The two designs are equivalent and have the same D-efficiency. The OUT=Design data set is sorted and is usually easier to inspect visually; however, the OUTR=Randomized design is usually the better one to use. The randomized design has rows that are sorted in random order and factor levels that are randomly reassigned.
%MktEx(list, N=n <, optional arguments>)
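As a minimal sketch of a simple call, the following step requests a main-effects design for four three-level factors in nine runs and then prints the randomized design. The factor list, run count, and seed value are illustrative assumptions; the data set name Randomized is the default OUTR= name described above.

```sas
/* Illustrative call: four 3-level factors in 9 runs.           */
/* The seed value is arbitrary and is used only so that the     */
/* run can be repeated.                                         */
%mktex(3 ** 4,      /* four three-level factors */
       n=9,         /* nine runs                */
       seed=17)     /* random number seed       */

/* Display the randomized version of the design. */
proc print data=randomized;
run;
```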
The following are the most commonly used arguments.
You can use the following two arguments (RESLIST= and RESMAC=) jointly to set up some constant matrices that the restrictions macro can use for certain complicated restrictions. Because the restrictions macro is called often, anything that you can do only once speeds up the algorithm. You can also use these arguments to access a %MktEx macro matrix in your restrictions macro that you normally could not access. This requires knowledge of the internal workings of the %MktEx macro, so it is not a capability that you usually need.
The following arguments specify the names of the input and output data sets.
The following group of arguments controls some of the details of the %MktEx macro’s iterations. The macro can perform three sets of iterations: the algorithm search, the design search, and the design refinements.
The algorithm search iterations look for efficient designs by using three different approaches. The macro then determines which approach appears to be working best and uses that approach exclusively in the design search iterations. The design refinement iterations try to refine the best design that has been found by using level exchanges together with random mutations and simulated annealing. Some of these iteration arguments accept up to three parameters, one for each set of iterations.
The following are miscellaneous arguments that you might occasionally find useful.
The following are all the other miscellaneous arguments. You will rarely specify arguments from this list.
You can specify either of the following to display the argument names and simple examples of the macro syntax:
%mktex(help)
%mktex(?)
The %MktEx macro displays notes in the SAS log to show you what it is doing while it is running. Most of the notes that would usually come out of the macro’s procedure and DATA steps are suppressed by default, because the macro specifies options nonotes throughout most of its execution. If you want to see all the notes, submit the statement %let mktopts = notes; before running the macro. To see the macro version, submit the statement %let mktopts = version; before running the macro. This section describes the notes that are usually not suppressed.
The macro usually starts by displaying one of the following notes (filling in a value after N=):
NOTE: Generating the Hadamard design, n=.
NOTE: Generating the full-factorial design, n=.
NOTE: Generating the fractional-factorial design, n=.
NOTE: Generating the orthogonal array design, n=.
These messages tell you which type of orthogonal design the macro is constructing. The design might be the final design, or it might provide an initialization for the coordinate exchange algorithm. In some cases, it might not have the same number of runs, n, as the final design. This step is usually fast, but constructing some fractional-factorial designs can be time-consuming.
The macro displays the following note when it is going to use PROC OPTEX to search a candidate set:
NOTE: Generating the candidate set.
This step is usually fast. Next, when a candidate set is searched, the macro displays the following note, substituting values for the ellipses:
NOTE: Performing ... searches of ... candidates.
This step can take some time, depending on the size of the candidate set and the size of the design. When there are a lot of restrictions and a fractional-factorial candidate set is being used, the candidate set might be so restricted that it does not contain enough information to make the design. In that case, you will get the following message:
NOTE: The candidate-set initialization failed, but the MKTEX macro is continuing.
Even though part of the macro’s algorithm failed, it is not a problem. The macro just goes on to the coordinate-exchange algorithm, which will almost certainly work better than searching any severely restricted candidate set.
For large designs, you will usually want to skip the PROC OPTEX iterations. The macro might display the following note:
NOTE: With a design this large, you may get faster results with OPTITER=0.
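A minimal sketch of acting on this note follows. It reuses the factor list and run count from the restrictions example later in this section and adds OPTITER=0, the setting that the note suggests; the seed value is an arbitrary assumption.

```sas
/* Skip the PROC OPTEX candidate-set iterations for a large design. */
%mktex(3 ** 50,     /* 50 three-level factors                 */
       n=135,       /* 135 runs                               */
       optiter=0,   /* no PROC OPTEX iterations, per the note */
       seed=289)    /* random number seed                     */
```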
Sometimes you will get the following note:
NOTE: Stopping since it appears that no improvement is possible.
When the macro repeatedly finds the same maximum D-efficiency in different designs, it might stop early. This might mean that the macro has found the optimal design, or it might mean that the macro keeps finding a very attractive local optimum. Either way, it is unlikely to improve. You can control this action by using the STOPEARLY= argument.
The macro has arguments that control the amount of time that it spends trying different techniques. When time expires, the macro might switch to other techniques before it completes the usual maximum number of iterations. When this happens, the macro displays the following notes:
NOTE: Switching to a random initialization after ... minutes and ... designs.
NOTE: Quitting the algorithm search after ... minutes and ... designs.
NOTE: Quitting the design search after ... minutes and ... designs.
NOTE: Quitting the refinement step after ... minutes and ... designs.
When there are restrictions, or when you specify that you do not want duplicate runs, you can also specify OPTIONS=ACCEPT. This means that you are willing to accept designs that violate the restrictions. When you specify OPTIONS=ACCEPT, the macro displays the following notes to tell you if the restrictions are not met:
NOTE: The restrictions were not met.
NOTE: The design has duplicate runs.
The %MktEx macro optimizes a ridged efficiency criterion; that is, a small number is added to the diagonal of $\mathbf{X}'\mathbf{X}$. Usually, the ridged criterion is virtually the same as the unridged criterion. When the %MktEx macro detects that this is not true, it displays the following notes:
NOTE: The final ridged D-efficiency criterion is ....
NOTE: The final unridged D-efficiency criterion is ....
The macro ends with one of the following two messages:
NOTE: The MKTEX macro used ... seconds.
NOTE: The MKTEX macro used ... minutes.
To impose restrictions on the design, you write a macro that creates a variable called Bad that contains a numerical summary of how bad the row of the design is. When everything is fine, you set Bad to 0. Otherwise, you set Bad to a larger value that is a function of the number of restriction violations. The Bad variable must not be binary (0 for OK, 1 for bad) unless there is only one simple restriction. You must set Bad so that the %MktEx macro knows whether the changes that it is considering are moving the design in the right direction. The macro must consist of PROC IML statements and possibly some SAS macro language statements.
When you have restrictions, you can specify OPTIONS=RESREP to get a report on the restriction violations in the iteration history. This can be a great help when you debug your restrictions macro. Also, be sure to check the log when you specify the RESTRICTIONS= argument. The %MktEx macro cannot always ensure that your statements are free of syntax errors and stop if they are not. There are a number of macro arguments that you can use to impose restrictions, including RESTRICTIONS=, OPTIONS=NODUPS, BALANCE=, PARTIAL=, and INIT=. If you specify more than one of these arguments, be sure that the combination makes sense and that it is possible to simultaneously satisfy all the restrictions.
The %MktEx macro makes a number of scalars, a row vector, and a matrix available that you can use in your restrictions macro to quantify badness. You refer to these quantities in your restrictions macro by using the following names:
I – a scalar that contains the number of the row that is currently being changed or evaluated. If you are writing restrictions that use the variable I, you should almost certainly specify OPTIONS=NOSORT.
Try – a scalar similar to I, which contains the number of the row that is currently being changed. However, Try starts at 0 and is incremented for each row; it is set back to 0 when a new design starts, not when the %MktEx macro reaches the last row. Use I as a matrix index and Try to evaluate how far the %MktEx macro is into the process of constructing the design.
X – a row vector of factor levels for row I that always contains integer values that begin with 1 and continue on to the number of levels for each factor. These values are always one-based, even if the LEVELS= argument is specified.
X1 – the same as X[1]; X2 is the same as X[2]; and so on.
J1 – a scalar that contains the number of the column that is currently being changed. In the steps where the restrictions macro is called once per row, J1 = 1.
J2 – a scalar that contains the number of the other column that is currently being changed (along with J1) when EXCHANGE=2. Both J1 and J2 are defined when the EXCHANGE= argument value is greater than or equal to 2. This scalar does not exist when EXCHANGE=1. In the steps where the restrictions macro is called once per row, J1 = J2 = 1.
J3 – a scalar that contains the number of the third column that is currently being changed (along with J1 and J2) when the EXCHANGE= argument value is greater than or equal to 3. This scalar does not exist when EXCHANGE=1 or EXCHANGE=2. There will be a J4, J5, and so on, if and only if the EXCHANGE= argument value is greater than 3. In the steps where the restrictions macro is called once per row, J1 = J2 = J3 = 1.
Xmat – the entire x matrix. The ith row of Xmat is often different from X because X contains information about the exchanges being considered, whereas Xmat contains the current design.
Bad – the variable that contains the number of violations of restrictions. You can make the value of this variable large or small, and you can use integers or real numbers. However, the values should always be nonnegative. When there are multiple sources of design badness, it is good to put the different sources on different scales so that they do not trade off against each other. For example, for the first source, you might multiply the number of violations by 1,000, by 100 for another source, by 10 for another source, by 1 for another source, and even sometimes by 0.1 or 0.01 for other sources. The final badness is the sum of Bad, __pbad (when it exists), and __bbad (when it exists).
__pbad – the badness from the PARTIAL= argument. This scalar does not exist when the PARTIAL= argument is not specified. You can weight this value in your restrictions macro (usually by multiplying it by a constant) to differentially weight the contributors to badness; for example: __pbad = __pbad * 10.
__bbad – the badness from the BALANCE= argument. This scalar does not exist when the BALANCE= argument is not specified. You can weight this value in your restrictions macro (usually by multiplying it by a constant) to differentially weight the contributors to badness; for example: __bbad = __bbad * 100.
You cannot use these names (other than Bad) for intermediate values! Otherwise, you can create intermediate variables without worrying about conflicts with the names in the macro. The levels of the factors for one row of the experimental design are stored in a vector X; the first level is always 1, the second level is always 2, and so on. All restrictions must be defined in terms of X[j] (or alternatively, X1, X2, ..., and perhaps the other matrices). For example, suppose there are five 3-level factors. If you want to define badness to mean that the level of a factor equals the level of the following factor, you can write the following macro, named RESTRICT, and specify RESTRICTIONS=RESTRICT when you invoke the %MktEx macro:
%macro restrict;
   bad = (x1 = x2) + (x2 = x3) + (x3 = x4) + (x4 = x5);
%mend;
When you specify the RESTRICTIONS= argument, you specify just the macro name with no percent sign.
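The following sketch shows one way the RESTRICT macro might be used and then checked. The run count (N=27) and the seed are assumptions chosen only for illustration, and the Check data set and Viol variable are hypothetical names; the Design data set is the default OUT= data set, and OPTIONS=RESREP is the restrictions report described earlier.

```sas
%macro restrict;
   /* One unit of badness for each pair of adjacent factors */
   /* that share a level in the current row.                */
   bad = (x1 = x2) + (x2 = x3) + (x3 = x4) + (x4 = x5);
%mend;

%mktex(3 ** 5,                 /* five three-level factors           */
       n=27,                   /* assumed run count for this sketch  */
       restrictions=restrict,  /* macro name, no percent sign        */
       seed=101,               /* arbitrary random number seed       */
       options=resrep)         /* report violations in the history   */

/* Recount the violations in the finished design with a DATA step. */
data check;
   set design;
   viol = (x1 = x2) + (x2 = x3) + (x3 = x4) + (x4 = x5);
run;

proc freq data=check;
   tables viol;
run;
```

If the restrictions were met, every row of the Check data set should have Viol equal to 0.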
When you write a restrictions macro, there are a few facts about PROC IML that you should keep in mind. PROC IML does not have the same full set of Boolean operators that the DATA step and other parts of SAS have. For example, the following operators are not available in PROC IML: OR, AND, NOT, GT, LT, GE, LE, EQ, and NE. Also, although the expression a <= b <= c is perfectly valid in PROC IML, its meaning is different from and less reasonable than its meaning in a DATA step. The DATA step expression checks to see whether b is in the range of a to c. In contrast, the PROC IML expression a <= b <= c is exactly the same as (a <= b) <= c, which evaluates (a <= b) and sets the result to 0 (false) or 1 (true). PROC IML then compares the resulting 0 or 1 to see whether it is less than or equal to c.
The operators that you can use, along with their meanings, are as follows:
Specify            For                         Do Not Specify
=                  Equal                       EQ
^= or ¬=           Not equal                   NE
<                  Less than                   LT
<=                 Less than or equal to       LE
>                  Greater than                GT
>=                 Greater than or equal to    GE
&                  And                         AND
|                  Or                          OR
^ or ¬             Not                         NOT
a <= b & b <= c    Range check                 a <= b <= c
When you impose restrictions, the Current D-Efficiency column of the iteration history table can contain values that are larger than those in the Best D-Efficiency column. This is because the design that corresponds to the current D-efficiency might have restriction violations. Values are reported in the best D-efficiency column only after all the restriction violations have been removed. You can specify OPTIONS=ACCEPT together with the RESTRICTIONS= argument when it is acceptable if the restrictions are not met.
It is extremely important when imposing restrictions that you appropriately quantify the badness of the run. When the %MktEx macro considers an exchange, it has to know whether it is doing one of the following:
eliminating restriction violations, thus making the design better
causing more restriction violations, thus making the design worse
making a change that neither increases nor decreases the number of violations
Your restrictions macro must inform the %MktEx macro when it is making progress in the right direction. If it does not provide this information, the %MktEx macro will probably not find an acceptable design.
Consider designing a choice experiment with two alternatives, each composed of 25 attributes, the first 22 of which have restrictions on them. Attribute 1 in the choice design is made from X1 and X23, attribute 2 in the choice design is made from X2 and X24, and so on. The attributes greater than 22 are made from X45 – X50. Each of the 25 choice-design attributes is made from the following pairs of factors:
Attribute       1    2    3    4    5    6    7    8    9    10   …   21   22   23   24   25
Alternative 1   x1   x2   x3   x4   x5   x6   x7   x8   x9   x10  …   x21  x22  x45  x46  x47
Alternative 2   x23  x24  x25  x26  x27  x28  x29  x30  x31  x32  …   x43  x44  x48  x49  x50
The restrictions are as follows: within each choice set, between five and nine of the 22 restricted attributes must contain two 1’s, between five and nine must contain exactly one 1 (paired with a 2), and between five and nine must contain two 2’s. When there are multicomponent restrictions such as these, it is easy to make mistakes when you are writing the restrictions macro. Even when your macro correctly differentiates rows that conform to the restrictions and rows that do not conform, the restrictions macro might not give the %MktEx macro enough guidance. A common mistake is failing to quantify the degree of badness in a row of the design. This section illustrates this and other problems that you might encounter when writing restrictions macros, and then shows one possible solution.
The following steps show an example of how not to program the restrictions:
%macro sumres;
   allone = 0; oneone = 0; alltwo = 0;
   do k = 1 to 22;
      if (x[k] = 1 & x[k+22] = 1) then allone = allone + 1;
      else if (x[k] = 1 & x[k+22] = 2) |
              (x[k] = 2 & x[k+22] = 1) then oneone = oneone + 1;
      else if (x[k] = 2 & x[k+22] = 2) then alltwo = alltwo + 1;
   end;

   * Bad example. Need to quantify badness.;
   bad = (^((5 <= allone & allone <= 9) &
            (5 <= oneone & oneone <= 9) &
            (5 <= alltwo & alltwo <= 9)));
%mend;

%mktex(3 ** 50,             /* 50 three-level factors             */
       n=135,               /* 135 runs                           */
       restrictions=sumres, /* name of restrictions macro         */
       seed=289,            /* random number seed                 */
       options=resrep       /* restrictions report                */
               quickr       /* very quick run with random init    */
               nox)         /* suppresses x1, x2, x3 ... creation */
The macro begins by initializing three counts to 0. The scalar Allone is the count of the number of choice-design attributes that are all 1’s, Oneone is the count of the number of choice-design attributes that have one 1 and one 2, and Alltwo is the count of the number of choice-design attributes that have two 2’s. The DO loop runs through the 22 attributes and increments the appropriate count every time one of the desired patterns is found. Everything in and above the DO loop is fine. The problem occurs next. The assignment statement after the DO loop creates the scalar Bad and sets it to 1 when the counts are not in the required range and 0 otherwise. This assignment statement stores the results of a Boolean expression. When the scalars Allone, Oneone, and Alltwo are all in the right range, each of the inner expressions is true (1), as is the result of the three expressions and the two AND operations. The caret (NOT operator) converts true to false and false to true so that the scalar Bad is set to 0 when nothing bad happened and 1 when something bad happened.
In the %MktEx macro, you specify OPTIONS=RESREP to produce a report in the iteration history on the process of meeting the restrictions. When you run the %MktEx macro and it is having trouble making a design that conforms to restrictions, this report can be extremely helpful. Some of the results of this step are as follows:
                  Algorithm Search History

                         Current          Best
  Design   Row,Col  D-Efficiency  D-Efficiency  Notes
----------------------------------------------------------
       1     Start       59.7632                Ran,Mut,Ann
       1         1       60.4198                1 Violations
       1         2       61.0591                1 Violations
       1         3       61.6890                1 Violations
       1         4       62.4063                1 Violations
       1         5       62.8670                1 Violations
       1         6       63.6578                1 Violations
       1         7       64.1437                1 Violations
       1         8       64.4543                1 Violations
       1         9       64.8995                1 Violations
       1        10       65.6750                1 Violations
       . . . .
       1       131       86.4159                1 Violations
       1       132       86.4921                1 Violations
       1       133       86.4709                1 Violations
       1       134       86.5461                1 Violations
       1       135       86.5328                1 Violations
       1         1       86.5825                1 Violations
WARNING: It might be impossible to meet all restrictions.
       . . .
       1       133       89.4985                0 Violations
       . . .
The macro succeeds in eliminating restriction violations only in occasional rows. The problem with the preceding approach is that there are complicated restrictions but badness is binary. If all the counts are in the right range, badness is 0; otherwise it is 1. You need to write a macro that lets the %MktEx macro know when it is going in the right direction or it will probably never find a suitable design. You need to let it know, in an attribute with a 2 and a 1, that if it turns a 2 into a 1, it has taken a step toward increasing the Allone count and toward decreasing the Oneone count, and then appropriately update the badness criterion. This does not happen when badness is binary.
One thing that is correct about the preceding code is the compound Boolean range expressions like (5 <= allone & allone <= 9). Abbreviated expressions like (5 <= allone <= 9) that work correctly in the DATA step work incorrectly and without warning in PROC IML. Another thing that is correct is the way the %SumRes macro creates new variables, K, Allone, Oneone, and Alltwo. Care was taken to avoid using names like I and X that conflict with the matrices that you can examine in quantifying badness. The full list of names that you must avoid is I, Try, X, X1, X2 through Xn for n factors, J1, J2, J3, and Xmat. The following steps show a slightly better but still misguided example of the macro:
%macro sumres;
   allone = 0; oneone = 0; alltwo = 0;
   do k = 1 to 22;
      if (x[k] = 1 & x[k+22] = 1) then allone = allone + 1;
      else if (x[k] = 1 & x[k+22] = 2) |
              (x[k] = 2 & x[k+22] = 1) then oneone = oneone + 1;
      else if (x[k] = 2 & x[k+22] = 2) then alltwo = alltwo + 1;
   end;

   * Better, badness is quantified, and almost correctly too!;
   bad = (^((5 <= allone & allone <= 9) &
            (5 <= oneone & oneone <= 9) &
            (5 <= alltwo & alltwo <= 9))) #
         (abs(allone - 7) + abs(oneone - 7) + abs(alltwo - 7));
%mend;

%mktex(3 ** 50,             /* 50 three-level factors             */
       n=135,               /* 135 runs                           */
       restrictions=sumres, /* name of restrictions macro         */
       seed=289,            /* random number seed                 */
       options=resrep       /* restrictions report                */
               quickr       /* very quick run with random init    */
               nox)         /* suppresses x1, x2, x3 ... creation */
At first glance, this restrictions macro seems to do everything right, because it quantifies badness, but a closer examination is warranted. The variables Allone, Oneone, and Alltwo count the number of times that choice attributes are all 1’s, have exactly one 1, or are all 2’s, respectively. Everything is fine when the all-one count is in the range 5 to 9: (5 <= allone & allone <= 9); and the exactly-one-1 count is in the range 5 to 9: (5 <= oneone & oneone <= 9); and the all-two count is in the range 5 to 9: (5 <= alltwo & alltwo <= 9). It is bad when this is not true: (^((5 <= allone & allone <= 9) & (5 <= oneone & oneone <= 9) & (5 <= alltwo & alltwo <= 9))); the Boolean NOT operator "^" performs the logical negation. This Boolean expression is 1 for bad and 0 for OK. It is multiplied by a sum of how far these counts are from 7, the middle of the 5-to-9 range: (abs(allone - 7) + abs(oneone - 7) + abs(alltwo - 7)). When the row meets all the restrictions, this sum of absolute differences is multiplied by 0. Otherwise badness gets larger as the counts get farther away from the middle of the 5-to-9 interval.
Some of the output from running the preceding macros follows:
                  Algorithm Search History

                         Current          Best
  Design   Row,Col  D-Efficiency  D-Efficiency  Notes
----------------------------------------------------------
       1     Start       59.7632                Ran,Mut,Ann
       1         1       60.3423                6 Violations
       1         2       60.7620                0 Violations
       1         3       61.1314                4 Violations
       1         4       61.6805                5 Violations
       1         5       62.1363                0 Violations
       1         6       62.5948                0 Violations
       1         7       62.9039                4 Violations
       1         8       63.1492                0 Violations
       1         9       63.4927                4 Violations
       1        10       63.8624                0 Violations
       1        11       64.3188                5 Violations
       . . . .
       1        12       64.7603                5 Violations
       1       133       64.7106                3 Violations
       1       134       64.3932                5 Violations
       1       135       64.2997                0 Violations
These results are from the first pass through the design. When OPTIONS=RESREP, the %MktEx macro displays one line per row along with the number of violations when it is finished with the row. The macro is succeeding in eliminating violations in some but not all rows. This is the first thing that you should look for. If the macro is not succeeding in any rows, you might have written a set of restrictions that is impossible to satisfy. Some of the output from the second pass through the design is as follows:
       1         1       64.3119                0 Violations
       1         2       64.3272                0 Violations
       1         3       64.4162                0 Violations
       1         4       64.6035                0 Violations
       1         5       64.6395                0 Violations
       1         6       64.7770                0 Violations
       1         7       64.9521                0 Violations
       . . .
       1        28       66.8037                4 Violations
       1        28       66.6859                0 Violations
       . . .
       1        69       68.7250                5 Violations
       1        69       68.6878                5 Violations
       1        69       68.7250                5 Violations
       1        69       68.6789                5 Violations
       1        69       68.7250                5 Violations
       . . .
       1        69       68.7114                5 Violations
       1        69       68.7114                5 Violations
       . . .
       1       133       71.9063                0 Violations
       1       134       71.9149                0 Violations
       1       135       71.9333                0 Violations
In the second pass, the %MktEx macro tries extra hard to impose restrictions in situations where it had some reasonable success in the first pass. You can see that it is trying over and over again without success to impose the restrictions in the 69th row. You can also see that it has no trouble removing all violations in the 28th row that were still there after the first pass. The %MktEx macro produces volumes of output like this. For several iterations, it will devote extra attention to rows that have some violations, but in this case without complete success. When you see this pattern—some success but also some stubborn rows that the %MktEx macro cannot fix—there might be something wrong with your restrictions macro. Are you really telling the %MktEx macro when it is doing a better job? The preceding steps illustrate some of the things that can go wrong with restrictions macros. It is important to carefully evaluate the results—look at the design, look at the iteration history, specify OPTIONS=RESREP, and so on—to ensure that your restrictions are doing what you want. In this example, the problem is the quantification of badness in the following statement:
bad = (^((5 <= allone & allone <= 9) & (5 <= oneone & oneone <= 9) & (5 <= alltwo & alltwo <= 9))) # (abs(allone - 7) + abs(oneone - 7) + abs(alltwo - 7));
There are three nonindependent contributors to the badness function: the three counts (Allone, Oneone, and Alltwo). As a factor level changes, one count can increase while another decreases. There is a larger problem too. Suppose that Allone and Oneone are in the correct range but Alltwo is not. Then the function fragments abs(allone - 7) and abs(oneone - 7) incorrectly contribute to the badness function. The solution is to clearly differentiate the three sources of badness and to weight the pieces so that one part never trades off against the others. For example, consider the following modified restrictions macro:
%macro sumres;
   allone = 0; oneone = 0; alltwo = 0;
   do k = 1 to 22;
      if (x[k] = 1 & x[k+22] = 1) then allone = allone + 1;
      else if (x[k] = 1 & x[k+22] = 2) |
              (x[k] = 2 & x[k+22] = 1) then oneone = oneone + 1;
      else if (x[k] = 2 & x[k+22] = 2) then alltwo = alltwo + 1;
   end;

   bad = 100 # (^(5 <= allone & allone <= 9)) # abs(allone - 7) +
          10 # (^(5 <= oneone & oneone <= 9)) # abs(oneone - 7) +
               (^(5 <= alltwo & alltwo <= 9)) # abs(alltwo - 7);
%mend;

%mktex(3 ** 50,             /* 50 three-level factors             */
       n=135,               /* 135 runs                           */
       restrictions=sumres, /* name of restrictions macro         */
       seed=289,            /* random number seed                 */
       options=resrep       /* restrictions report                */
               quickr       /* very quick run with random init    */
               nox)         /* suppresses x1, x2, x3 ... creation */
In this version of the restrictions macro, a component of badness contributes to the function only when it is really part of the problem. The first part has a weight of 100, and the second part has a weight of 10. Now a change to Oneone or Alltwo never looks like an improvement if it causes a problem for Allone, and a change to Alltwo never looks like an improvement if it causes a problem for Oneone. Previously, the %MktEx macro was getting stuck in some rows because it could never figure out how to fix one component of badness without making another component worse. For some problems, figuring out how to differentially weight the components of badness so that they never trade off against each other is the key to writing a successful restrictions macro. It often does not matter which component gets the most weight. What is important is that each component gets a different weight so that the %MktEx macro does not get caught cycling back and forth, making A better and B worse and then making B better and A worse. The following is some of the output from the first pass through the design while using the modified restrictions macro:
                  Algorithm Search History

                         Current          Best
  Design   Row,Col  D-Efficiency  D-Efficiency  Notes
----------------------------------------------------------
       1     Start       59.7632                Ran,Mut,Ann
       1         1       60.2334                4 Violations
       1         2       60.7867                3 Violations
       1         3       61.1763                0 Violations
       1         4       61.4382                0 Violations
       1         5       61.6494                0 Violations
       1         6       62.2171                0 Violations
       1         7       62.1774                0 Violations
       1         8       62.6273                0 Violations
       1         9       63.1095                3 Violations
       1        10       63.4078                0 Violations
       . . .
       1       131       67.4221                0 Violations
       1       132       66.9651                6 Violations
       1       133       66.7982                0 Violations
       1       134       66.4926                0 Violations
       1       135       66.4555                0 Violations
In the first pass, the %MktEx macro is imposing all restrictions for most, but not all, of the rows. Some of the output from the second pass through the design is as follows:
       1         1       66.5957                3 Violations
       1         1       66.5711                0 Violations
       1         2       66.6860                0 Violations
       1         3       66.7663                0 Violations
       1         4       66.8973                0 Violations
       1         5       67.0152                0 Violations
       1         6       67.0788                0 Violations
       1         7       67.2155                0 Violations
       1         8       67.2721                0 Violations
       1         9       67.4274                0 Violations
       1        10       67.5745                0 Violations
       . . .
       1        65       71.3709                3 Violations
       1        65       71.3636                3 Violations
       1        65       71.3709                3 Violations
       1        65       71.3709                3 Violations
       1        65       71.3236                0 Violations
       . . .
       1       131       72.9631                0 Violations
       1       132       72.8559                4 Violations
       1       132       72.8463                3 Violations
       1       132       72.7370                0 Violations
       1       133       72.7681                0 Violations
       1       134       72.7280                0 Violations
       1       135       72.7453                0 Violations
In the second pass, the %MktEx macro imposes all the restrictions in all rows that still had violations after the first pass. The third pass ends with the following output:
       1       130       74.6889                0 Violations
       1       131       74.7175                0 Violations
       1       132       74.7384                0 Violations
       1     133 1       74.7384       74.7384  Conforms
       1    133 44       74.7571       74.7571
The %MktEx macro completes a full pass through row 132, the place of the last violation. It does not find any new violations, so in row 133 it states that the design conforms to the restrictions, and the iteration history proceeds in the normal fashion from then on. The note Conforms is displayed at the place where the %MktEx macro decides that the design conforms. The design continues to conform throughout more iterations, even though the note Conforms is not displayed on every line. The final efficiency of the design is as follows:
                                                          Average
                                                        Prediction
  Design                                                 Standard
  Number   D-Efficiency   A-Efficiency   G-Efficiency       Error
------------------------------------------------------------------------
       1        80.2237        61.0557        93.7160      0.8650
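To make the difference between the three badness formulas in this example concrete, the following PROC IML sketch evaluates all three for two hypothetical sets of counts. The module name Badness, the column labels, and the two sets of counts are assumptions made for illustration; the formulas themselves are copied from the three versions of the %SumRes macro. In both cases only Alltwo (4) is out of range.

```sas
proc iml;
   /* Return the binary, summed, and weighted badness values */
   /* for one set of counts.                                 */
   start badness(allone, oneone, alltwo);
      in1 = (5 <= allone & allone <= 9);
      in2 = (5 <= oneone & oneone <= 9);
      in3 = (5 <= alltwo & alltwo <= 9);
      binary   = ^(in1 & in2 & in3);
      summed   = binary # (abs(allone - 7) + abs(oneone - 7) + abs(alltwo - 7));
      weighted = 100 # (^in1) # abs(allone - 7) +
                  10 # (^in2) # abs(oneone - 7) +
                       (^in3) # abs(alltwo - 7);
      return (binary || summed || weighted);
   finish;

   /* Counts (7,7,4) give 1, 3, 3.  Counts (5,9,4) give 1, 7, 3. */
   r  = badness(7, 7, 4) // badness(5, 9, 4);
   cn = {"Binary" "Summed" "Weighted"};
   rn = {"7,7,4" "5,9,4"};
   print r[rowname=rn colname=cn];
quit;
```

The summed formula reports more badness for the second set of counts (7 versus 3) even though Allone and Oneone are within range in both cases, whereas the weighted formula reports 3 for both because only the out-of-range count contributes.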
The following statements create the choice design and display a subset of the design:
%mktkey(x1-x50)

data key;
   input (x1-x25) ($);
   datalines;
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21 x22 x45 x46 x47
x23 x24 x25 x26 x27 x28 x29 x30 x31 x32 x33 x34 x35 x36 x37 x38 x39 x40 x41 x42 x43 x44 x48 x49 x50
;

%mktroll(design=design, key=key, out=chdes)

proc print data=chdes;
   by set;
   id set;
   where set le 2 or set ge 134;
run;
Notice the slightly unusual arrangement of the Key data set: the first 22 attributes are made from the first 44 factors of the linear arrangement, and the last 3 attributes are made from the last 6 factors of the linear arrangement.
A sample of four choice sets is shown in Figure 2.
Figure 2: First Two and Last Two Choice Sets
The restrictions macro is defined and called in four distinct places by the %MktEx macro. First, the %MktEx macro calls the restrictions macro in a preliminary PROC IML step to check for errors in the syntax. Next, it calls the restrictions macro between calling PROC PLAN or PROC FACTEX and calling PROC OPTEX; here, the restrictions macro is used to impose restrictions on the candidate set. Next, the restrictions macro is used during design creation and by the coordinate-exchange algorithm. Finally, when you specify OPTIONS=ACCEPT, which means that restriction violations are acceptable, the restrictions macro is called after all the iterations have completed and after the %MktEx macro reports any restriction violations in the final design.
For some advanced restrictions, you might not want exactly the same code to run in all four places. When the restrictions are written purely in terms of X, which is the ith row of the design matrix, there is no problem: the same restrictions macro works for all uses. However, when Xmat (the full X matrix), I, or J1 (the row or column number) is used, the same code often cannot be used for all applications. The following are notes for each of the four phases.
In the first phase, the macro is defined and called just to check for syntax errors. This step enables the %MktEx macro to end more gracefully when there are errors and to give you better information about the nature of the error than it otherwise would. Your restrictions macro can detect when it is in this phase, because the macro variable &main is set to 0 and the macro variable &pass is set to null. The &pass variable is null before the iterations begin, 1 for the algorithm search phase, 2 for the design search phase, 3 for the design refinement phase, and 4 after the iterations end. You can conditionally execute code in this step by using one of the following macro statements:
%if &main eq 0 and &pass eq %then %do;       /* execute in syntax check */
%if not (&main eq 0 and &pass eq) %then %do; /* not execute in syntax check */
You usually do not need to worry about this step. The %MktEx macro calls the restrictions macro once to check for syntax errors and ignores the results. For this step, Xmat is a matrix of ones, X is a vector of ones (because the design does not exist yet), and J1 = J2 = J3 = I = 1. If you have complicated restrictions that involve the row or column exchange indices (I, J1, J2, or J3), you might need to worry about this step. You might need either to skip your restrictions in this step or to conditionally execute some assignment statements (just for this step) that set up J1, J2, and J3 more appropriately. Sometimes you can set things up appropriately by using the RESMAC= argument. However, be aware that this step checks I, Try, J1, J2, J3, X, and Xmat after your restrictions macro is called to ensure that you are not changing them; changing them is usually a sign of an error. If you get the following warning, make sure you are not incorrectly changing one of the matrices that you should not change:
WARNING: Restrictions macro is changing i, try, j1, j2, j3, x, or xmat. This might be a serious problem. Check your macro.
If the %MktEx macro detects a syntax error during this step, it will try to tell you where it is and what the problem is. If you have syntax errors in your restrictions macro and you cannot figure out what they are, sometimes the best option is to directly submit the statements in your restrictions macro to PROC IML to see the syntax errors. But first you need to submit the following statements:
%let n = 27; /* substitute number of runs    */
%let m = 10; /* substitute number of factors */
proc iml;
   xmat = j(&n, &m, 1);
   i = 1;
   j1 = 1;
   j2 = 1;
   j3 = 1;
   bad = 0;
   x = xmat[i,];
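As an illustration of this debugging approach, the following sketch submits a small restrictions macro directly to PROC IML. It assumes six factors and 16 runs, and it uses the two-statement %Res macro from the restricted-design example later in this section. The assignments to x1 through x6 are an assumption added for this stand-alone session, because the %MktEx macro is not running to supply those scalars.

```sas
/* Restrictions macro to debug (same statements as the %Res example) */
%macro res;
   if x1 = x3 & x1 = x5 then bad = 1;
   if x2 = x4 & x2 = x6 then bad = bad + 1;
%mend;

%let n = 16;   /* number of runs    */
%let m = 6;    /* number of factors */

proc iml;
   /* Setup statements that mimic the syntax-check phase */
   xmat = j(&n, &m, 1);
   i = 1; j1 = 1; j2 = 1; j3 = 1; bad = 0;
   x = xmat[i,];

   /* Assumption: define the scalars by hand for this session only */
   x1 = x[1]; x2 = x[2]; x3 = x[3];
   x4 = x[4]; x5 = x[5]; x6 = x[6];

   %res          /* any IML syntax errors are reported at this point */
   print bad;
quit;
```

Because every element of the row is 1 in this setup, both attributes are constant and Bad should be 2. Any other outcome, or an error message in the log, points to a problem in the restrictions statements themselves.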
In the second phase, the restrictions macro is used to impose restrictions on the candidate set after it is created by PROC PLAN or PROC FACTEX but before it is searched by PROC OPTEX. The restrictions macro is called once for each row for which the column index J1 is set to 1. For some problems, such as most partial-profile problems, the restrictions are so severe that virtually none of the candidates can conform. Also, restrictions that are based on row number and column number do not make sense in the context of a candidate design. Your restrictions macro can detect when it is in this phase because the macro variable &main is set to 0 and the macro variable &pass is set to 1 or 2. You can conditionally execute code in this step by using one of the following macro statements:
%if &main eq 0 and &pass ge 1 and &pass le 2 %then %do;       /* execute on candidates */
%if not (&main eq 0 and &pass ge 1 and &pass le 2) %then %do; /* not execute on candidates */
For simple restrictions that do not involve the column exchange indices (J1, J2, J3), you probably do not need to worry about this step. If you use J1, J2, or J3, you need either to skip your restrictions in this step or to conditionally execute some assignment statements that set up J1, J2, and J3 appropriately. Usually for this step, Xmat contains the candidate design, X contains the ith row, J1 = 0, J2 = 0, J3 = 0, Try = 1, and I is set to the candidate row number.
In the third phase, the restrictions macro is used to impose restrictions on the design as it is being built by the coordinate-exchange algorithm. Your restrictions macro can detect when it is in this phase because the macro variable &main is set to 1 and the macro variable &pass is set to 1, 2, or 3. You can conditionally execute code in this step by using one of the following macro statements:
%if &main eq 1 and &pass ge 1 and &pass le 3 %then %do;       /* execute on coordinate exchange */
%if not (&main eq 1 and &pass ge 1 and &pass le 3) %then %do; /* not execute on coordinate exchange */
For this step, Xmat contains the candidate design; X contains the ith row; J1, J2, and J3 usually contain the column indices; I is the row number; and Try is the zero-based cumulative row number. When you specify EXCHANGE=1, J1 exists; when you specify EXCHANGE=2, J1 and J2 exist; and so on. Sometimes in this phase, the restrictions macro is called once per row, with the J* indices all set to 1. If you use the J* indices in your restrictions, you might need to allow for this. For example, if you are checking the current J1 column for balance, and you used an INIT= data set with column one fixed and unbalanced, you do not want to perform the check when J1 = 1. For some designs that are partially initialized with an orthogonal array and for some uses of INIT=, not all columns or cells in the design are evaluated.
In the fourth phase, the restrictions macro is used to check the design when there are restrictions and you specify OPTIONS=ACCEPT. The restrictions macro is called once for each row of the design. Your restrictions macro can detect when it is in this phase because the macro variable &main is set to 1 and the macro variable &pass is greater than 3. You can conditionally execute code in this step by using one of the following macro statements:
%if &main eq 1 and &pass gt 3 %then %do;       /* execute on final check */
%if not (&main eq 1 and &pass gt 3) %then %do; /* not execute on final check */
For this step, Xmat contains the candidate design; X contains the ith row; J1 = 1, J2 = 1, J3 = 1, and Try = 1; and I is the row number.
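The phase tests described above can be combined in a single restrictions macro. The following is a minimal sketch, not a recommended restriction: it assumes a six-factor design, the macro name %ResPhase is hypothetical, and the row-number rule (no level 2 of x2 in the first eight rows) is invented purely to show a restriction that depends on I. The row-only statement runs in every phase, whereas the row-number rule is wrapped so that it runs only when &main is 1, which covers the coordinate-exchange phase and the final check.

```sas
%macro resphase;
   /* Written purely in terms of the current row: safe in all four phases */
   if x1 = x3 & x1 = x5 then bad = 1;

   /* Hypothetical rule that uses the row number I. It is skipped in the  */
   /* syntax check and in the candidate-set phase, where &main is 0.      */
   %if &main eq 1 and &pass ge 1 %then %do;
      if i <= 8 & x2 = 2 then bad = bad + 1;
   %end;
%mend;
```

Guarding only the statements that need the row or column indices keeps the rest of the macro identical across phases, which makes the restrictions easier to check.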
This example shows how to use the %MktEx macro to find a linear arrangement of a choice design. You can use this approach when you want all the attributes of all the alternatives to be balanced and orthogonal or at least nearly so.
The product is breakfast bars, and there are three brands: Branolicious, Brantopia, and Brantasia. The choice sets consist of the three brands and a constant (no purchase) alternative. Each brand has two attributes: a 4-level attribute for price and a 2-level attribute for the number of bars per box. The prices are $2.89, $2.99, $3.09, and $3.19, and the sizes are six-count and eight-count. You can make a choice design by starting with a design that is optimal for a hypothetical linear model that has factors for all the attributes of all the alternatives. The linear arrangement consists of the six factors, which the following tables show organized by brand and by attribute.
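Brand | Price | Count |
---|---|---|
Branolicious | x1 | x2 |
Brantopia | x3 | x4 |
Brantasia | x5 | x6 |

Attribute | Branolicious | Brantopia | Brantasia |
---|---|---|---|
Price | x1 | x3 | x5 |
Count | x2 | x4 | x6 |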
You need a factorial design that has six factors: Branolicious Price, Branolicious Count, Brantopia Price, Brantopia Count, Brantasia Price, and Brantasia Count. From this design, you make a choice design that has three attributes: brand, count, and price. You can use the %MktRuns macro as follows to suggest the number of choice sets:
%mktruns(4 2 4 2 4 2)
The input to the %MktRuns macro is the number of levels of all the factors (that is, all the attributes of all the alternatives). Figure 3 shows the output of the %MktRuns macro.
Figure 3: %MktRuns Output
Saturated      = 13
Full Factorial = 512

Some Reasonable                  Cannot Be
   Design Sizes    Violations    Divided By

             16 *           0
             32 *           0
             24             3    16
             20            12    8 16
             28            12    8 16
             14            18    4 8 16
             18            18    4 8 16
             22            18    4 8 16
             26            18    4 8 16
             30            18    4 8 16
             13 S          21    2 4 8 16

* - 100% Efficient design can be made with the MktEx macro.
S - Saturated Design - The smallest design that can be made.
    Note that the saturated design is not one of the
    recommended designs for this problem. It is shown
    to provide some context for the recommended sizes.

n | Design | Reference |
---|---|---|
16 | 2 ** 6 4 ** 3 | Fractional-Factorial |
16 | 2 ** 3 4 ** 4 | Fractional-Factorial |
32 | 2 ** 22 4 ** 3 | Fractional-Factorial |
32 | 2 ** 19 4 ** 4 | Fractional-Factorial |
32 | 2 ** 16 4 ** 5 | Fractional-Factorial |
32 | 2 ** 15 4 ** 3 8 ** 1 | Fractional-Factorial |
32 | 2 ** 13 4 ** 6 | Fractional-Factorial |
32 | 2 ** 12 4 ** 4 8 ** 1 | Fractional-Factorial |
32 | 2 ** 10 4 ** 7 | Fractional-Factorial |
32 | 2 ** 9 4 ** 5 8 ** 1 | Fractional-Factorial |
32 | 2 ** 7 4 ** 8 | Fractional-Factorial |
32 | 2 ** 6 4 ** 6 8 ** 1 | Fractional-Factorial |
32 | 2 ** 4 4 ** 9 | Fractional-Factorial |
32 | 2 ** 3 4 ** 7 8 ** 1 | Fractional-Factorial |
There are three 2-level factors and three 4-level factors. The saturated design has 13 runs or rows, so you need at least 13 choice sets for this approach. The full-factorial design has 512 runs, so there are a maximum of 512 possible choice sets. The %MktRuns macro suggests 16 as its first choice, because 16 meets the necessary but not sufficient conditions for the existence of an orthogonal array. The number 16 can be divided by 2 (you have 2-level factors), 4 (you have 4-level factors), $2 \times 2 = 4$ (you have more than one 2-level factor), $4 \times 4 = 16$ (you have more than one 4-level factor), and $2 \times 4 = 8$ (you have both 2-level factors and 4-level factors). The number of choice sets must be divisible by all of these if the design is going to be orthogonal and balanced.
The number 32 also meets these conditions. However, 16 is a more reasonable number of judgments for people to make, and each of the other suggestions (24, 20, 28, 14, 18, 22, 26, 30) cannot be divided by at least one of the relevant numbers. For this example, the macro considers only sizes up to 32. By default, the %MktRuns macro stops considering larger sizes when it finds a perfect size (in this case 32) that is twice as big as another perfect size (16). Sixteen choice sets are ideal for this example. The necessary conditions are sufficient in this case, and there is an orthogonal array that you can use. The last part of the output lists the orthogonal arrays that the %MktEx macro knows how to make that also work for this specification.
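The counts in Figure 3 can be reproduced by hand. The following DATA step is a sketch of that arithmetic, not the %MktRuns implementation: it assumes that one violation is counted for each factor whose number of levels does not divide n and one for each pair of factors whose product of levels does not divide n. The variable names are illustrative.

```sas
data _null_;
   array lev[6] _temporary_ (4 2 4 2 4 2);

   /* Saturated size: 1 + sum of (levels - 1); full factorial: product */
   saturated = 1;
   full = 1;
   do i = 1 to dim(lev);
      saturated = saturated + lev[i] - 1;
      full = full * lev[i];
   end;
   put saturated= full=;

   /* Count divisibility failures for a few candidate sizes */
   do n = 13, 16, 24, 32;
      violations = 0;
      do i = 1 to dim(lev);
         if mod(n, lev[i]) ne 0 then violations = violations + 1;
         do j = i + 1 to dim(lev);
            if mod(n, lev[i] * lev[j]) ne 0 then violations = violations + 1;
         end;
      end;
      put n= violations=;
   end;
run;
```

Under these assumptions the log shows 13 and 512 for the saturated and full-factorial sizes, and 21, 0, 3, and 0 violations for n = 13, 16, 24, and 32, which agrees with the Violations column in Figure 3.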
The following statement invokes the %MktEx macro to find the factorial design:
%mktex(4 2 4 2 4 2, n=16, seed=17)
The %MktEx macro accepts a factor-level list like the %MktRuns macro list along with the number of runs or choice sets. You can specify a random number seed so that you always get the same design if you rerun the %MktEx macro.
Figure 4 shows the results.
Figure 4: %MktEx Output
The %MktEx macro found a 100% efficient, orthogonal, balanced design that has three 2-level factors and three 4-level factors. The levels are all positive integers, starting with 1 and continuing up to the number of levels.
The following statements display the design shown in Figure 5 by using PROC PRINT.
proc print data=Design; run;
Figure 5: %MktEx Output
This example creates a design for the same study as in the previous example but restricts the design to avoid choice sets where attributes are constant. That is, you want to select just the choice sets where neither price nor count is constant within a choice set.
Restrictions are written by using PROC IML statements that are embedded in a macro. You then provide the name of the restrictions macro to the %MktEx macro by using the RESTRICTIONS= argument. In the restrictions macro, you compute an IML scalar called Bad that quantifies the badness of the design. The %MktEx macro automatically initializes the scalar Bad to 0. In this case, because the restrictions are entirely within each choice set, you can just quantify the badness of one choice set at a time by evaluating the values in the scalars X1 through X6, which correspond to the six attributes. If it is easier to use indexing to write the restrictions, you can use the vector X, where X[1] = x1, ..., X[6] = x6, instead. In the restrictions macro %Res that follows, Bad is set to 1 if the Price variable (which is made from x1, x3, and x5) is constant. The scalar Bad is incremented by 1 if the Count variable (which is made from x2, x4, and x6) is constant.
%macro res;
   if x1 = x3 & x1 = x5 then bad = 1;
   if x2 = x4 & x2 = x6 then bad = bad + 1;
%mend;
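The same restrictions can be written by indexing the row vector instead of naming each scalar. The following variant is a sketch under the assumption that X is the 1 x 6 row of the design described above; the macro name %ResV is hypothetical. It avoids creating any new IML matrices, so it does not risk overwriting a name that the %MktEx macro uses internally.

```sas
%macro resv;
   /* Price columns (1, 3, 5) are constant when their min equals their max */
   if min(x[, {1 3 5}]) = max(x[, {1 3 5}]) then bad = 1;

   /* Count columns (2, 4, 6): add 1 to Bad when they are constant */
   if min(x[, {2 4 6}]) = max(x[, {2 4 6}]) then bad = bad + 1;
%mend;

%mktex(4 2 4 2 4 2, n=16, restrictions=resv, seed=17)
```

The indexed form becomes more convenient as the number of alternatives grows, because only the column lists change.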
You now use the %MktEx macro as follows to get a restricted factorial design for this problem:
%mktex(4 2 4 2 4 2, n=16, restrictions=res, seed=17)
Figure 6, Figure 7, Figure 8, and Figure 9 show the %MktEx macro’s output.
Figure 6: %MktEx Algorithm Search History
                         Current        Best
Design  Row,Col     D-Efficiency  D-Efficiency  Notes
----------------------------------------------------------
     1  Start         91.5504              Can
     1      2   1     91.5504     91.5504  Conforms
     1  End           91.5504
     2  Start        100.0000              Tab
     2     14   1     87.8163              Conforms
     2  End           89.5326
     3  Start        100.0000              Tab
     3     14   1     85.2744              Conforms
     3  End           86.9271
     4  Start        100.0000              Tab
     4     14   1     84.9718              Conforms
     4  End           85.6568
     5  Start        100.0000              Tab
     5     14   1     84.7284              Conforms
     5  End           87.3099
     6  Start        100.0000              Tab
     6     14   1     85.5377              Conforms
     6  End           86.9731
     7  Start        100.0000              Tab
     7     14   1     82.8801              Conforms
     7  End           88.0734
     8  Start        100.0000              Tab
     8     14   1     84.0164              Conforms
     8  End           89.2343
     9  Start        100.0000              Tab
     9     14   1     82.9864              Conforms
     9  End           87.2147
    10  Start        100.0000              Tab
    10     14   1     86.8052              Conforms
    10  End           87.5513
    11  Start        100.0000              Tab
    11     14   1     85.0061              Conforms
    11  End           88.7427
    12  Start         61.3114              Ran,Mut,Ann
    12     16   1     79.4843              Conforms
    12  End           89.1152
    13  Start         59.3474              Ran,Mut,Ann
    13      9   1     88.1603              Conforms
    13  End           90.8211
    14  Start          5.9399              Ran,Mut,Ann
    14     16   1     77.8656              Conforms
    14  End           91.5287
    15  Start         59.5266              Ran,Mut,Ann
    15     13   1     85.6026              Conforms
    15  End           87.1962
    16  Start         60.1559              Ran,Mut,Ann
    16     11   1     83.9377              Conforms
    16  End           88.2425
    17  Start         67.1759              Ran,Mut,Ann
    17     15   1     85.3701              Conforms
    17  End           90.6546
    18  Start         62.9917              Ran,Mut,Ann
    18      1   1     81.9340              Conforms
    18  End           89.6519
    19  Start         59.6925              Ran,Mut,Ann
    19     15   1     85.4777              Conforms
    19  End           90.9198
    20  Start         60.8733              Ran,Mut,Ann
    20     16   1     81.5568              Conforms
    20  End           89.9440
    21  Start         53.6255              Ran,Mut,Ann
    21     16   1     81.3052              Conforms
    21  End           89.5758
Figure 7: %MktEx Design Search History
Figure 8: %MktEx Design Refinement History
                         Current        Best
Design  Row,Col     D-Efficiency  D-Efficiency  Notes
----------------------------------------------------------
     0  Initial       91.5504     91.5504  Ini
     1  Start         85.9351              Pre,Mut,Ann
     1      2   1     89.2144              Conforms
     1     11   1     91.5504     91.5504
     1     15   3     91.5504     91.5504
     1     11   5     91.5504     91.5504
     1     16   1     91.5504     91.5504
     1      2   1     91.5504     91.5504
     1  End           91.5504
     2  Start         88.9014              Pre,Mut,Ann
     2      2   1     85.7076              Conforms
     2  End           90.7905
     3  Start         89.2953              Pre,Mut,Ann
     3      2   1     88.6625              Conforms
     3  End           89.7394
     4  Start         91.5504              Pre,Mut,Ann
     4      2   1     91.5504     91.5504  Conforms
     4  End           90.8114
     5  Start         87.5177              Pre,Mut,Ann
     5     13   1     83.8041              Conforms
     5  End           88.9568
     6  Start         88.5059              Pre,Mut,Ann
     6      2   1     91.5504     91.5504  Conforms
     6      7   3     91.5504     91.5504
     6  End           91.5504
     7  Start         91.5504              Pre,Mut,Ann
     7      2   1     90.0520              Conforms
     7     11   1     91.5504     91.5504
     7  End           91.5287
     8  Start         92.1768              Pre,Mut,Ann
     8      7   1     88.0433              Conforms
     8  End           88.8908
     9  Start         91.5504              Pre,Mut,Ann
     9      2   1     91.5504     91.5504  Conforms
     9  End           87.4793
Figure 9: %MktEx Output
The %MktEx macro succeeds in making the design conform to all restrictions. In all cases, it reports zero violations of the restrictions. In some cases, a design that has 100% D-efficiency is replaced by a design that has a lower D-efficiency as the restrictions are imposed. The final D-efficiency is 91.5504. Designs that have this same D-efficiency are repeatedly found, often indicating that the final design is an optimal design.
The following statements display the design shown in Figure 10 by using PROC PRINT.
proc print data=Design; run;
Figure 10: %MktEx Output