Example 14.6 Bayesian Optimal Design

Note: See Bayesian Optimal Design in the SAS/QC Sample Library.

Suppose you want a design in 20 runs for seven two-level factors. There are 29 terms in a full second-order model, so you will not be able to estimate all main effects and two-factor interactions. If the number of runs were a power of 2, a design of resolution 4 could be used to estimate all main effects free of the two-factor interactions, as well as to provide partial information on the interactions. However, when the number of runs is not a power of two, as in this case, DuMouchel and Jones (1994) suggest searching for a Bayesian optimal design by specifying nonzero prior precision values for the interactions. You can specify these values in the OPTEX procedure with the PRIOR= option in the MODEL statement. This says that you want to consider all main effects and interactions as potential effects, but you are willing to sacrifice information on the interactions to obtain maximal information on the main effects. When an orthogonal design of resolution 4 exists, it is optimal according to this Bayesian criterion. You can use the following statements to generate the Bayesian D-optimal design:

proc factex;
   factors x1-x7;
   output out=Candidates;
run;

proc optex data=Candidates seed=57922 coding=orth;
   model x1-x7,
         x1|x2|x3|x4|x5|x6|x7@2 / prior=0,16;
   generate n=20 method=m_fedorov;
   output out=Design;
run;

With orthogonal coding, the value of the prior for an effect says roughly how many prior observations’ worth of information you have for that effect. In this case, the PRIOR= precision values and the use of commas to group effects in the MODEL statement says that there is no prior information for the main effects and 16 runs’ worth of information for each two-factor interaction. See the section Design Coding for details on orthogonal coding.

The efficiencies are shown in Output 14.6.1.

Output 14.6.1: Efficiencies for Bayesian Optimal Designs

The OPTEX Procedure

Design Number D-Efficiency A-Efficiency G-Efficiency Average Prediction
Standard Error
1 85.1815 74.6705 85.2579 1.1476
2 85.1815 74.6705 85.2579 1.1476
3 85.1815 74.6705 85.2579 1.1476
4 85.0424 73.3109 81.0800 1.1582
5 85.0424 73.3109 81.0800 1.1582
6 84.5680 73.5053 84.1376 1.1566
7 84.4931 72.1671 81.7855 1.1673
8 84.4239 72.4979 81.7431 1.1646
9 84.3919 74.6097 89.3631 1.1480
10 84.3919 74.6097 89.3631 1.1480


Notice that the best design was found in three tries out of ten. It might be a good idea to repeat the search with more tries (see the ITER= option). You can use the ALIASING option of the GLM procedure to list the aliasing structure for the design:

data Design; set Design;
   y = ranuni(654231);
proc glm data=Design;
   model y = x1-x7 x1|x2|x3|x4|x5|x6|x7@2 / e aliasing;
run;

The relevant part of the output is shown in Output 14.6.2. Most of the main effects are indeed unconfounded with two-factor interactions, although many two-factor interactions are confounded with each other.

Output 14.6.2: Aliasing Structure for Bayesian Optimal Design

The GLM Procedure

General Form of Aliasing Structure
Intercept
x1 - 0.5*x3*x7
x2
x3
x4 + 0.5*x3*x7
x5
x6
x7
x1*x2 - x3*x6 + 0.5*x3*x7 - x4*x7
x1*x3 - x2*x6 - x5*x7
x2*x3 + x3*x7
x1*x4 - x5*x6 + x5*x7 + x6*x7
x2*x4 - x3*x6 + 0.5*x3*x7 - x4*x7
x3*x4 - x2*x6 - x5*x7
x1*x5 - x4*x6 - x3*x7
x2*x5 + x2*x6 + x5*x7 + x6*x7
x3*x5 + x3*x6 - x3*x7
x4*x5 - x1*x6 - x3*x7
x1*x7 - x4*x7
x2*x7 + x5*x7 + x6*x7