The DIRECT statement lists numeric independent variables to be treated in a quantitative, rather than qualitative, way. The DIRECT statement is useful for logistic regression, which is described in the section Logistic Regression. For limitations of models involving continuous variables, see the section Continuous Variables.
Caution: If a DIRECT variable is formatted, then the unformatted (internal) values are used in the analysis and the formatted values are displayed. If you use a format to group the internal values into one formatted value, then the first internal value is used in the analysis.
If specified, the DIRECT statement must precede the MODEL statement. For example:
   proc catmod;
      direct X;
      model Y=X;
   run;
Suppose X has five levels. Then the main effect X adds only one column to the design matrix rather than four. The values inserted into the design matrix are the actual values of X.
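As a minimal end-to-end sketch of the case just described, the following statements fit the model to cell-count data. The data set name dose_resp, the count variable wt, and all data values are hypothetical and chosen only for illustration; X is assumed to be numeric with the five values 1 through 5, and Y is assumed to be a two-level response.

```sas
/* Hypothetical cell-count data: one row per X-by-Y combination. */
/* wt is the number of observations in that cell.                */
data dose_resp;
   input X Y $ wt;
   datalines;
1 no  18
1 yes  2
2 no  15
2 yes  5
3 no  11
3 yes  9
4 no   6
4 yes 14
5 no   3
5 yes 17
;

proc catmod data=dose_resp;
   weight wt;     /* needed here because the input is cell counts      */
   direct X;      /* DIRECT must precede MODEL                         */
   model Y=X;     /* X enters as one column holding its actual values  */
run;
quit;
```

Under these assumptions, the design matrix should contain an intercept column and a single column for X, so the model has one slope parameter for X instead of four main-effect parameters.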
You can interactively change the variables declared as DIRECT variables by using the statement without listing any variables. The following statements are valid:
   proc catmod;
      direct X;
      model Y=X;
      weight wt;
   run;
      direct;
      model Y=X;
   run;
The first MODEL statement uses the actual values of X, and the second MODEL statement uses the four variables created when PROC CATMOD generates the design matrix. Note that the preceding statements can be run without a WEIGHT statement if the input data are raw data rather than cell counts.
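To make the contrast between the two MODEL statements concrete, the following block sketches the two design matrices. It is an illustration under stated assumptions, not output from the procedure: X is assumed to take the values 1 through 5, Y is assumed to be binary (one response function per population), and the main-effect columns are assumed to follow the full-rank deviation-from-the-mean coding in which the last level is coded -1 in every column. The symbols D_direct and D_main are introduced here only for illustration.

```latex
% Rows correspond to the five populations X = 1, ..., 5.
% First column: intercept. Remaining columns: the effect of X.
\[
\mathbf{D}_{\text{direct}} =
\begin{pmatrix}
1 & 1 \\
1 & 2 \\
1 & 3 \\
1 & 4 \\
1 & 5
\end{pmatrix},
\qquad
\mathbf{D}_{\text{main}} =
\begin{pmatrix}
1 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 1 \\
1 & -1 & -1 & -1 & -1
\end{pmatrix}
\]
```

With the DIRECT statement in effect, the model has two parameters (intercept and slope). After the empty DIRECT statement, it has five (intercept and four main-effect parameters).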
For more details, see the discussions of main and direct effects in the section Generation of the Design Matrix.