CLASS Variable Parameterization and the SPLIT Option
The GLMSELECT procedure supports several nonsingular parameterizations for classification effects. You use the PARAM= option in the CLASS statement to specify the parameterization. See the section Other Parameterizations in Chapter 19, Shared Concepts and Topics, for details.
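As a minimal sketch of the syntax, the following statements assume a hypothetical data set work.demo that contains a response y and classification variables a and b (none of these names come from this section). PARAM= appears once as a variable option in parentheses and once as a global option after the slash; the sketch assumes that the variable-level setting takes precedence for a and that the global setting applies to b.

```sas
/* Hypothetical data set and variable names, for illustration only */
proc glmselect data=work.demo;
   /* a uses reference coding; b falls back to the global PARAM=EFFECT */
   class a(param=ref) b / param=effect;
   model y = a b;
run;
```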
PROC GLMSELECT can also split classification effects. When you specify the SPLIT option for a variable in the CLASS statement, the design matrix columns of any effect that contains that variable can enter or leave a model independently of the other design columns of that effect. The following statements illustrate the use of the SPLIT option together with other features of the CLASS statement:
    data codingExample;
       drop i;
       do i=1 to 1000;
          c1 = 1 + mod(i,6);
          if      i < 50  then c2 = 'very low ';
          else if i < 250 then c2 = 'low';
          else if i < 500 then c2 = 'medium';
          else if i < 800 then c2 = 'high';
          else                 c2 = 'very high';
          x1 = ranuni(1);
          x2 = ranuni(1);
          y = x1 + 10*(c1=3) + 5*(c1=5) + rannor(1);
          output;
       end;
    run;

    proc glmselect data=codingExample;
       class c1(param=ref split) c2(param=ordinal order=data) /
             delimiter = ',' showcoding;
       model y = c1 c2 x1 x2 / orderselect;
    run;
The "Class Level Information" table shown in Figure 44.11 is produced by default whenever you specify a CLASS statement.
Class Level Information | |||
---|---|---|---|
Class | Levels | | Values |
c1 | 6 | * | 1,2,3,4,5,6 |
c2 | 5 | | very low,low,medium,high,very high |
* Associated Parameters Split |
Because the levels of the variable c2 contain embedded blanks, the DELIMITER=',' option is specified so that the levels are separated by commas instead of blanks in the list of values. The SHOWCODING option requests the display of the "Class Level Coding" table shown in Figure 44.12. An ordinal parameterization is used for c2 because its levels have a natural order. Furthermore, because these levels appear in their natural order in the data, you can preserve this order by specifying the ORDER=DATA option.
Class Level Coding | |||||
---|---|---|---|---|---|
c1 Level | Design Variables | ||||
 | 1 | 2 | 3 | 4 | 5 |
1 | 1 | 0 | 0 | 0 | 0 |
2 | 0 | 1 | 0 | 0 | 0 |
3 | 0 | 0 | 1 | 0 | 0 |
4 | 0 | 0 | 0 | 1 | 0 |
5 | 0 | 0 | 0 | 0 | 1 |
6 | 0 | 0 | 0 | 0 | 0 |
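The table above lists the reference coding for c1. To see what the ordinal parameterization requested for c2 amounts to, the following sketch builds the corresponding columns by hand in a DATA step. It assumes the usual ordinal (thermometer) convention, in which the first level is coded as all zeros and each later level switches on one more column. The names c2Coding and c2_d1 through c2_d4 are hypothetical and are not the names that PROC GLMSELECT generates.

```sas
/* Hand-built ordinal (thermometer) coding for c2.               */
/* c2_d1-c2_d4 are hypothetical names used only in this sketch.  */
data c2Coding;
   set codingExample(keep=c2);
   c2_d1 = (c2 in ('low', 'medium', 'high', 'very high'));
   c2_d2 = (c2 in ('medium', 'high', 'very high'));
   c2_d3 = (c2 in ('high', 'very high'));
   c2_d4 = (c2 = 'very high');
run;

/* Keep one row per level of c2 */
proc sort data=c2Coding nodupkey;
   by c2;
run;

proc print data=c2Coding noobs;
run;
```

Because PROC SORT orders the rows alphabetically by c2, the printed rows do not appear in the natural order of the levels; the coding of each level is unaffected.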
The SPLIT option is specified for the classification variable c1, which permits the parameters associated with the effect c1 to enter or leave the model individually. The "Parameter Estimates" table in Figure 44.13 shows that, for this example, only the parameters that correspond to levels 3 and 5 of c1 are in the selected model. Finally, the ORDERSELECT option in the MODEL statement specifies that the parameters are displayed in the order in which they first entered the model.
Parameter Estimates | ||||
---|---|---|---|---|
Parameter | DF | Estimate | Standard Error | t Value |
Intercept | 1 | -0.216680 | 0.068650 | -3.16 |
c1_3 | 1 | 10.160900 | 0.087898 | 115.60 |
c1_5 | 1 | 5.018015 | 0.087885 | 57.10 |
x1 | 1 | 1.315468 | 0.109772 | 11.98 |
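For comparison, the following sketch repeats the analysis without the SPLIT option and without SHOWCODING. Under the description of SPLIT given earlier in this section, c1 is then treated as a single effect whose five design columns enter or leave the model together, so the selected model cannot retain levels 3 and 5 alone. The results of this run are not shown here.

```sas
/* Same data and model as before, but c1 is not split, so its */
/* design columns are selected as one effect.                 */
proc glmselect data=codingExample;
   class c1(param=ref) c2(param=ordinal order=data) / delimiter = ',';
   model y = c1 c2 x1 x2 / orderselect;
run;
```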