Usage Note 23217: If I know the model, how can I create the corresponding design matrix (dummy, indicator, or design variables) in a data set?
If you know the model, there are three methods for creating a data set that contains the design matrix:
- You can use an ODS OUTPUT statement with PROC GLMMOD to create a data set from the design matrix that is used in PROC GLM. Specify the model in the MODEL statement and identify any categorical predictors in the CLASS statement. Note that PROC GLMMOD offers only indicator (dummy) coding of categorical predictor variables. For example, the GLM statements below fit the indicated model, and the GLMMOD statements that follow create a data set from the same design matrix that was used in PROC GLM. The ODS LISTING CLOSE and ODS LISTING statements suppress display of the GLMMOD output in the Output window; omit them if you want the output displayed.
proc glm data=a;
class a b c;
model y=a b c a*b;
run;
ods output designpoints=xmatrix;
ods listing close;
proc glmmod data=a;
class a b c;
model y=a b c a*b;
run;
ods listing;
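As an alternative sketch of the same step, PROC GLMMOD can write the design matrix directly with its OUTDESIGN= option, and the NOPRINT option suppresses the displayed output without the ODS LISTING statements. The OUTPARM= data set maps each design column (Col1, Col2, and so on) to the effect and class levels it represents. The data set name xparms is an assumption chosen for illustration.

```sas
/* Write the design matrix and the column-to-effect map as data sets. */
/* xparms is a hypothetical name; any valid data set name works.      */
proc glmmod data=a outdesign=xmatrix outparm=xparms noprint;
   class a b c;
   model y=a b c a*b;
run;

/* Show which effect and class levels each design column represents. */
proc print data=xparms;
run;
```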
- Beginning with SAS 9, you can also use the OUTDESIGN= and OUTDESIGNONLY options in PROC LOGISTIC. PROC LOGISTIC can create design variables by using any of several different coding methods (parameterizations) including indicator (dummy) coding, effects coding, polynomial coding, and others. Specify the model in the MODEL statement and identify any categorical predictors in the CLASS statement. Use the PARAM= option in the CLASS statement to select the coding method. The following statements create a data set from the same design matrix as produced above by PROC GLMMOD (and internally by PROC GLM):
proc logistic data=a outdesign=xmatrix outdesignonly;
class a b c / param=glm;
model y=a b c a*b;
run;
But you can also use other coding methods. For example, these statements use effects coding for the categorical (CLASS) variables:
proc logistic data=a outdesign=xmatrix outdesignonly;
class a b c / param=effect;
model y=a b c a*b;
run;
See the LOGISTIC documentation for information about the various coding methods that are available.
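As one more illustration of selecting a coding method, the following sketch requests reference coding, with the first level of each CLASS variable as the reference level, and then prints a few rows of the result. The output data set name xmatrix_ref and the choice of REF=FIRST are assumptions made for this example.

```sas
/* Reference (full-rank) coding; the first level of each CLASS        */
/* variable is the reference level. xmatrix_ref is a hypothetical     */
/* data set name.                                                     */
proc logistic data=a outdesign=xmatrix_ref outdesignonly;
   class a b c / param=ref ref=first;
   model y=a b c a*b;
run;

/* Inspect the first few rows of the generated design variables. */
proc print data=xmatrix_ref(obs=5);
run;
```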
- You can also use PROC TRANSREG. It offers both indicator and effects coding methods. Specify any categorical variables in the CLASS expansion. Use the ZERO= option to select a reference category or, as below, ZERO=NONE to obtain the less-than-full-rank coding that is used by PROC GLM. Specify any continuous predictors, the response, and any other variables that you want transferred to the output data set in the ID statement. For example, the following statements create a data set from the same design matrix as produced above by PROC GLMMOD (and internally by PROC GLM):
proc transreg data=a design;
model class(a b c a*b / zero=none);
id y;
output out=xmatrix;
run;
Effects coding can be done as follows:
proc transreg data=a design;
model class(a b c a*b / effects);
id y;
output out=xmatrix;
run;
Note that PROC TRANSREG automatically creates a macro variable, &_trgind, which contains a list of variable names that it creates. You can use this macro variable in subsequent procedures to refer to the full model.
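A minimal sketch of using the macro variable, assuming the effects-coded xmatrix data set from the preceding PROC TRANSREG step still exists and that y is a continuous response suitable for PROC REG:

```sas
/* Write the list of generated design variable names to the log. */
%put Design variables: &_trgind;

/* Fit the full model by referring to the design variables through */
/* the macro variable rather than typing each name.                */
proc reg data=xmatrix;
   model y = &_trgind;
run;
quit;
```

Because &_trgind is reset each time PROC TRANSREG runs, use it before running another TRANSREG step.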
| Type: | Usage Note |
| Priority: | low |
| Topic: | SAS Reference ==> Procedures ==> GLMMOD; SAS Reference ==> Procedures ==> TRANSREG; Analytics ==> Regression; Analytics ==> Analysis of Variance; Analytics ==> Categorical Data Analysis; SAS Reference ==> Procedures ==> LOGISTIC; SAS Reference ==> Procedures ==> GLM |
| Date Modified: | 2005-11-14 11:17:10 |
| Date Created: | 2003-04-09 14:46:05 |