Note: See Bayesian Optimal Design in the SAS/QC Sample Library.
Suppose you want a design in 20 runs for seven two-level factors. There are 29 terms in a full second-order model, so you will not be able to estimate all main effects and two-factor interactions. If the number of runs were a power of 2, a design of resolution 4 could be used to estimate all main effects free of the two-factor interactions, as well as to provide partial information on the interactions. However, when the number of runs is not a power of two, as in this case, DuMouchel and Jones (1994) suggest searching for a Bayesian optimal design by specifying nonzero prior precision values for the interactions. You can specify these values in the OPTEX procedure with the PRIOR= option in the MODEL statement. This says that you want to consider all main effects and interactions as potential effects, but you are willing to sacrifice information on the interactions to obtain maximal information on the main effects. When an orthogonal design of resolution 4 exists, it is optimal according to this Bayesian criterion. You can use the following statements to generate the Bayesian D-optimal design:
proc factex; factors x1-x7; output out=Candidates; run; proc optex data=Candidates seed=57922 coding=orth; model x1-x7, x1|x2|x3|x4|x5|x6|x7@2 / prior=0,16; generate n=20 method=m_fedorov; output out=Design; run;
With orthogonal coding, the value of the prior for an effect says roughly how many prior “observations’ worth” of information you have for that effect. In this case, the PRIOR= precision values and the use of commas to group effects in the MODEL statement says that there is no prior information for the main effects and 16 runs’ worth of information for each two-factor interaction. See the section Design Coding for details on orthogonal coding.
The efficiencies are shown in Output 14.6.1.
Output 14.6.1: Efficiencies for Bayesian Optimal Designs
Design Number | D-Efficiency | A-Efficiency | G-Efficiency | Average Prediction Standard Error |
---|---|---|---|---|
1 | 85.1815 | 74.6705 | 85.2579 | 1.1476 |
2 | 85.1815 | 74.6705 | 85.2579 | 1.1476 |
3 | 85.1815 | 74.6705 | 85.2579 | 1.1476 |
4 | 85.0424 | 73.3109 | 81.0800 | 1.1582 |
5 | 85.0424 | 73.3109 | 81.0800 | 1.1582 |
6 | 84.5680 | 73.5053 | 84.1376 | 1.1566 |
7 | 84.4931 | 72.1671 | 81.7855 | 1.1673 |
8 | 84.4239 | 72.4979 | 81.7431 | 1.1646 |
9 | 84.3919 | 74.6097 | 89.3631 | 1.1480 |
10 | 84.3919 | 74.6097 | 89.3631 | 1.1480 |
Notice that the best design was found in three tries out of ten. It might be a good idea to repeat the search with more tries (see the ITER= option). You can use the ALIASING option of the GLM procedure to list the aliasing structure for the design:
data Design; set Design; y = ranuni(654231); proc glm data=Design; model y = x1-x7 x1|x2|x3|x4|x5|x6|x7@2 / e aliasing; run;
The relevant part of the output is shown in Output 14.6.2. Most of the main effects are indeed unconfounded with two-factor interactions, although many two-factor interactions are confounded with each other.
Output 14.6.2: Aliasing Structure for Bayesian Optimal Design
General Form of Aliasing Structure |
---|
Intercept |
x1 - 0.5*x3*x7 |
x2 |
x3 |
x4 + 0.5*x3*x7 |
x5 |
x6 |
x7 |
x1*x2 - x3*x6 + 0.5*x3*x7 - x4*x7 |
x1*x3 - x2*x6 - x5*x7 |
x2*x3 + x3*x7 |
x1*x4 - x5*x6 + x5*x7 + x6*x7 |
x2*x4 - x3*x6 + 0.5*x3*x7 - x4*x7 |
x3*x4 - x2*x6 - x5*x7 |
x1*x5 - x4*x6 - x3*x7 |
x2*x5 + x2*x6 + x5*x7 + x6*x7 |
x3*x5 + x3*x6 - x3*x7 |
x4*x5 - x1*x6 - x3*x7 |
x1*x7 - x4*x7 |
x2*x7 + x5*x7 + x6*x7 |