# The HPREG Procedure

### Example 60.3 Forward-Swap Selection

This example highlights the use of the forward-swap selection method, which is a generalization of the maximum R-square improvement (MAXR) method that is available in the REG procedure in SAS/STAT software. This example also demonstrates the use of the INCLUDE and START options.

The following DATA step produces the simulated data in which the response `y` depends on six main effects and three two-way interactions from a set of 20 regressors.

```
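data ex3Data;
   array x{20};
   do i=1 to 10000;
      /* 20 independent uniform(0,1) regressors */
      do j=1 to 20;
         x{j} = ranuni(1);
      end;
      /* six main effects, three two-way interactions, standard normal noise */
      y = 3*x1 + 7*x2 - 5*x3 + 5*x1*x3 +
          4*x2*x13 + x7 + x11 - x13 + x1*x4 + rannor(1);
      output;
   end;
run;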
```

Suppose you want to find the best model of each size in a range of sizes for predicting the response `y`. You can use the forward-swap selection method to produce good models of each size without the computational expense of examining all possible models of each size. In this example, the criterion used to evaluate the models of each size is the model R-square value. With this criterion, the forward-swap method coincides with the MAXR method that is available in the REG procedure in SAS/STAT software. The model of a given size for which no pairwise swap of an effect in the model with any candidate effect improves the R-square value is deemed to be the best model of that size.
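For comparison, the following statements are a minimal sketch of a MAXR run in the REG procedure on the same data. The sketch is restricted to main effects, on the assumption that the MODEL statement in PROC REG does not accept interaction syntax such as `x1*x3`, so it is not equivalent to the HPREG analysis in this example. The INCLUDE=3 and STOP=15 values are illustrative choices, not settings taken from this example.

```sas
/* Sketch only: main effects x1-x20, no interactions.               */
/* INCLUDE=3 keeps the first three regressors (x1 x2 x3) in every   */
/* model; STOP=15 ends the search at the best 15-variable model.    */
proc reg data=ex3Data;
   model y = x1-x20 / selection=maxr include=3 stop=15;
run;
quit;
```

The STOP= option in PROC REG counts regressors, whereas the "Number Effects In" column in the HPREG output counts the intercept as an effect, so the two limits need not describe models of the same size.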

Suppose that you have prior knowledge that the regressors `x1`, `x2`, and `x3` are needed in modeling the response `y`. Suppose that you also believe that some of the two-way interactions of these variables are likely to be important in predicting `y` and that some other two-way interactions might also be needed. You can use this prior information by specifying the selection process shown in the following statements:

```
proc hpreg data=ex3Data;
   model y = x1|x2|x3|x4|x5|x6|x7|x8|x9|x10|x11|
             x12|x13|x14|x5|x16|x7|x18|x19|x20@2 /
             include=(x1 x2 x3) start=(x1*x2 x1*x3 x2*x3);
   selection method=forwardswap(select=rsquare maxef=15 choose=sbc)
             details=all;
run;
```

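The MODEL statement uses the bar operator together with `@2` to specify main effects and two-way interactions as candidates for selection. As written, the list names `x5` and `x7` twice and omits `x15` and `x17`, so the candidate set contains the quadratic effects `x5*x5` and `x7*x7` (both appear in the output that follows) and no effects that involve `x15` or `x17`. The INCLUDE= option specifies that the effects `x1`, `x2`, and `x3` must appear in all models that are examined. The START= option specifies that all the two-way interactions of these variables should be used in the initial model that is considered but that these interactions are eligible for removal during the forward-swap selection.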
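The MODEL statement shown earlier lists `x5` and `x7` twice and does not list `x15` or `x17`. If the intent is to make every main effect and every two-way interaction of the 20 regressors a candidate, a variant along the following lines could be used. This is a hypothetical sketch, not the run that produced the output in this example, so its selection path and statistics would differ from the tables shown here.

```sas
/* Hypothetical variant: x15 and x17 replace the repeated x5 and x7. */
/* All other options are copied unchanged from the example.          */
proc hpreg data=ex3Data;
   model y = x1|x2|x3|x4|x5|x6|x7|x8|x9|x10|x11|
             x12|x13|x14|x15|x16|x17|x18|x19|x20@2 /
             include=(x1 x2 x3) start=(x1*x2 x1*x3 x2*x3);
   selection method=forwardswap(select=rsquare maxef=15 choose=sbc)
             details=all;
run;
```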

The "Selection Summary" table is shown in Output 60.3.1.

Output 60.3.1: Selection Summary

The HPREG Procedure

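| Step | Effect Entered | Effect Removed | Number Effects In | SBC | Model R-Square |
|---|---|---|---|---|---|
| 0 | Intercept | | 1 | | |
| | `x1` | | 2 | | |
| | `x2` | | 3 | | |
| | `x1*x2` | | 4 | | |
| | `x3` | | 5 | | |
| | `x1*x3` | | 6 | | |
| | `x2*x3` | | 7 | 3307.6836 | 0.8837 |
| 1 | `x2*x13` | | 8 | 1892.8403 | 0.8992 |
| 2 | `x7*x11` | `x1*x2` | 8 | 618.9298 | 0.9112 |
| 3 | `x1*x4` | `x2*x3` | 8 | 405.3751 | 0.9131 |
| 4 | `x13` | | 9 | 213.6140 | 0.9148 |
| 5 | `x7` | | 10 | 180.4457 | 0.9152 |
| 6 | `x11` | `x7*x11` | 10 | 1.4039\* | 0.9167 |
| 7 | `x10*x11` | | 11 | 2.3393 | 0.9168 |
| 8 | `x3*x7` | | 12 | 4.5000 | 0.9168 |
| 9 | `x6*x7` | | 13 | 10.0589 | 0.9169 |
| 10 | `x3*x6` | | 14 | 13.1113 | 0.9169 |
| 11 | `x5*x20` | | 15 | 19.4612 | 0.9169 |
| 12 | `x13*x20` | `x3*x6` | 15 | 18.3678 | 0.9169 |
| 13 | `x5*x5` | `x6*x7` | 15 | 12.1398 | 0.9170\* |

\* Optimal Value of Criterion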

You see that starting from the model with an intercept and the effects specified in the INCLUDE= and START= options at step 0, the forward-swap selection method adds the effect `x2*x13` at step 1, because this yields the maximum improvement in the R-square value that can be obtained by adding a single effect. The forward-swap selection method now evaluates whether any effect swap yields a better eight-effect model (one with a higher R-square value). Because you specified the DETAILS=ALL option in the SELECTION statement, at each step where a swap is made you obtain a "Candidates" table that shows the R-square values for the evaluated swaps. Output 60.3.2 shows the "Candidates" table for step 2. By default, only the best 10 swaps are displayed.

Output 60.3.2: Swap Candidates at Step 2

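Best 10 Candidates

| Rank | Effect Dropped | Effect Added | R-Square |
|---|---|---|---|
| 1 | `x1*x2` | `x7*x11` | 0.9112 |
| 2 | `x2*x3` | `x7*x11` | 0.9112 |
| 3 | `x1*x2` | `x7` | 0.9065 |
| 4 | `x2*x3` | `x7` | 0.9065 |
| 5 | `x1*x2` | `x7*x7` | 0.9060 |
| 6 | `x2*x3` | `x7*x7` | 0.9060 |
| 7 | `x1*x2` | `x4*x7` | 0.9060 |
| 8 | `x2*x3` | `x4*x7` | 0.9060 |
| 9 | `x1*x2` | `x11` | 0.9058 |
| 10 | `x2*x3` | `x11` | 0.9058 |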

You see that the best swap adds `x7*x11` and drops `x1*x2`. This yields an eight-effect model whose R-square value (0.9112) is larger than the R-square value (0.8992) of the eight-effect model at step 1. Hence this swap is made at step 2. At step 3, an even better eight-effect model than the model at step 2 is obtained by dropping `x2*x3` and adding `x1*x4`. No additional swap improves the R-square value, and so the model at step 3 is deemed to be the best eight-effect model. Although this is the best eight-effect model that this method can find from the given starting model, it is not guaranteed to have the highest R-square value among all possible models that consist of seven effects and an intercept.
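The swap rule described above can be written compactly. The notation is introduced here for illustration and is an assumption, not part of the procedure's documentation: $M$ is the current set of effects, $F \subseteq M$ is the set of effects that cannot be removed (the INCLUDE= effects, and presumably the intercept), and $C$ is the set of candidate effects not in $M$.

```latex
(d^\ast, a^\ast) = \arg\max_{d \in M \setminus F,\; a \in C}
    R^2\bigl((M \setminus \{d\}) \cup \{a\}\bigr),
\qquad
M \leftarrow (M \setminus \{d^\ast\}) \cup \{a^\ast\}
\quad \text{if} \quad
R^2\bigl((M \setminus \{d^\ast\}) \cup \{a^\ast\}\bigr) > R^2(M).
```

At step 2, for instance, $M$ is the step 1 model and the maximizing pair is $d^\ast$ = `x1*x2`, $a^\ast$ = `x7*x11`, which raises the R-square value from 0.8992 to 0.9112. When no pair satisfies the inequality, the current model is taken as the best model of its size and the method moves on to adding an effect.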

Because the DETAILS=ALL option is specified in the SELECTION statement, details for the model at each step of the selection process are displayed. Output 60.3.3 provides details of the model at step 3.

Output 60.3.3: Model Details at Step 3

Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 7 | 108630 | 15519 | 15000.3 | <.0001 |
| Error | 9992 | 10337 | 1.03455 | | |
| Corrected Total | 9999 | 118967 | | | |

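| Statistic | Value |
|---|---|
| Root MSE | 1.01713 |
| R-Square | 0.91311 |
| Adj R-Sq | 0.91305 |
| AIC | 10350 |
| AICC | 10350 |
| SBC | 405.375 |
| ASE | 1.03373 |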
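Several of the step 3 statistics can be cross-checked by hand from the displayed values. The worked equations below are a sketch that assumes the SBC definition $n \ln(SSE/n) + p \ln n$, with $n = 10000$ observations and $p = 8$ parameters (the intercept plus seven effects). The symbols $SSE$, $SST$, and $MSE$ denote the error sum of squares, the corrected total sum of squares, and the error mean square.

```latex
\begin{aligned}
R^2 &= 1 - \frac{SSE}{SST} = 1 - \frac{10337}{118967} \approx 0.9131, \\
\text{Root MSE} &= \sqrt{MSE} = \sqrt{1.03455} \approx 1.0171, \\
\text{SBC} &= n \ln\!\left(\frac{SSE}{n}\right) + p \ln n
   \approx 10000 \ln(1.03373) + 8 \ln(10000)
   \approx 331.7 + 73.7 = 405.4.
\end{aligned}
```

These agree with the reported values (0.91311, 1.01713, and 405.375) up to the rounding of the displayed sums of squares.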

Parameter Estimates

| Parameter | DF | Estimate | Standard Error | t Value | Pr > \|t\| |
|---|---|---|---|---|---|
| Intercept | 1 | 0.012095 | 0.045712 | 0.26 | 0.7913 |
| `x1` | 1 | 3.087078 | 0.076390 | 40.41 | <.0001 |
| `x2` | 1 | 7.775180 | 0.046815 | 166.08 | <.0001 |
| `x3` | 1 | -4.957140 | 0.070995 | -69.82 | <.0001 |
| `x1*x3` | 1 | 4.910115 | 0.122503 | 40.08 | <.0001 |
| `x1*x4` | 1 | 0.890436 | 0.060523 | 14.71 | <.0001 |
| `x7*x11` | 1 | 1.708469 | 0.045939 | 37.19 | <.0001 |
| `x2*x13` | 1 | 2.584078 | 0.061506 | 42.01 | <.0001 |

The forward-swap method continues to find the best nine-effect model, best 10-effect model, and so on until it obtains the best 15-effect model. At this point the selection terminates because you specified the MAXEF=15 option in the SELECTION statement. The R-square value increases at each step of the selection process. However, because you specified the CHOOSE=SBC criterion in the SELECTION statement, the final model selected is the model at step 6, which has the smallest SBC value (1.4039) of all the steps.
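Tracing the entries and removals in the "Selection Summary" table through step 6 gives a model that contains the intercept and the effects `x1`, `x2`, `x3`, `x1*x3`, `x1*x4`, `x2*x13`, `x13`, `x7`, and `x11`, which are exactly the terms used to generate `y` in the DATA step. As a minimal sketch, you could refit that model directly by listing these effects in a MODEL statement with no SELECTION statement. The effect list below is derived from the table and is not part of the original example.

```sas
/* Sketch: fit the step 6 model directly, with no effect selection. */
/* Effect list traced from the "Selection Summary" table.           */
proc hpreg data=ex3Data;
   model y = x1 x2 x3 x1*x3 x1*x4 x2*x13 x13 x7 x11;
run;
```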