In modeling procedures, the CLASS statement is used to indicate which variables in the model are categorical variables. Such a variable is then treated as a nominal (unordered) categorical predictor variable. A set of numeric indicator (or "dummy") variables is created internally to represent the levels of the variable. Because the indicator variables are used for fitting the model, the original variable does not need to be numeric. The resulting model has multiple parameter estimates (one for each indicator variable). Each parameter compares one level of the predictor with a reference level, typically the last level in sorted order. A joint test of all the estimated parameters for the predictor is a test for any differences among the levels and is therefore a test of the predictor's overall effect.
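As a rough illustration of the indicator variables described above, the following Python sketch builds reference-coded dummy columns for a nominal variable. It is not how any SAS procedure is implemented; the function name `reference_coding` and the sample `treatment` values are hypothetical, and the sketch assumes the reference level is the last level in sorted order and receives no column of its own.

```python
def reference_coding(values):
    """Build 0/1 indicator columns for a nominal variable.

    Assumption: the reference level is the last level in sorted order,
    and it gets no indicator column (all indicators are 0 for it).
    """
    levels = sorted(set(values))
    reference = levels[-1]
    non_reference = levels[:-1]
    # One row per observation, one column per non-reference level.
    rows = [[1 if v == level else 0 for level in non_reference] for v in values]
    return non_reference, reference, rows


# Hypothetical character-valued predictor; it does not need to be numeric.
treatment = ["B", "A", "Placebo", "A", "B"]
columns, reference, design = reference_coding(treatment)

print("indicator columns:", columns)
print("reference level:", reference)
for value, row in zip(treatment, design):
    print(value, row)
```

Each indicator column corresponds to one parameter that compares its level with the reference level, whose rows contain only zeros.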
In contrast, a variable name that appears in the MODEL statement but not in the CLASS statement is treated as a continuous predictor variable. The variable itself is used in fitting the model. Therefore, the variable must be a numeric SAS® variable and should be continuous or at least be ordered with assigned numeric scores. The resulting model typically has one parameter estimate (there might be more for models with multiple response variables or functions) that estimates the linear effect of the predictor.
Note that if the predictor were unordered, it would not be useful to test for its "linear" effect because you cannot talk about the effect of "increasing" an unordered variable. So, all nominal categorical variables should be listed in the CLASS statement. On the other hand, you might choose to ignore the ordering in a continuous predictor variable and treat it as a nominal predictor by specifying it in the CLASS statement. But remember that a parameter is added to the model for each additional level of the variable, which can result in a very large model if the variable has many distinct values in the data set.
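The growth in model size can be sketched with a short count. The helper `design_columns` and the `ages` values below are hypothetical, and the count assumes reference coding, where a CLASS variable with k distinct levels contributes k - 1 indicator columns while a continuous predictor contributes one.

```python
def design_columns(values, as_class):
    """Count the design columns one predictor contributes.

    Assumption: reference coding, so a CLASS variable with k distinct
    levels adds k - 1 columns; a continuous predictor adds one column.
    """
    if as_class:
        return max(len(set(values)) - 1, 0)
    return 1


# Hypothetical numeric predictor with five distinct values.
ages = [23, 35, 35, 41, 52, 52, 67]

continuous_count = design_columns(ages, as_class=False)
class_count = design_columns(ages, as_class=True)

print("as continuous predictor:", continuous_count)
print("as CLASS variable:", class_count)
```

With many distinct values in the data, the CLASS count approaches the number of observations, which is the situation the paragraph above warns about.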
Some procedures (such as LOGISTIC and GENMOD) offer many options in the CLASS statement that enable you to designate how the internally generated variables are coded. Each coding method imposes a different interpretation on the estimated parameters. For instance, the PARAM=GLM and PARAM=REF options use the dummy coding method described above, which creates parameter estimates that compare the effect of each level to the effect of the reference level. The reference level can be specified with the REF= option. Another coding method for nominal predictors is effects coding, which results in parameter estimates that compare the effect of each level to the average effect of all the levels. There is also a coding method appropriate for variables that are ordinal but have unknown spacing between the levels, and a coding method for continuous variables that decomposes the variable's effect into linear, quadratic, cubic, and higher-order components.
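To make the difference between dummy coding and effects coding concrete, here is a minimal Python sketch of effects coding. It is an illustration only, not the procedures' internal code; the function name `effect_coding` and the sample levels are hypothetical, and the last level in sorted order is assumed to be the reference level.

```python
def effect_coding(values):
    """Build effects-coded columns for a nominal variable.

    Assumption: the last level in sorted order is the reference level.
    Non-reference levels get a 1 in their own column and 0 elsewhere;
    the reference level gets -1 in every column.
    """
    levels = sorted(set(values))
    reference = levels[-1]
    non_reference = levels[:-1]
    rows = []
    for v in values:
        if v == reference:
            rows.append([-1] * len(non_reference))
        else:
            rows.append([1 if v == level else 0 for level in non_reference])
    return non_reference, rows


# Hypothetical three-level predictor.
effect_columns, effect_design = effect_coding(["A", "B", "C"])

for level, row in zip(["A", "B", "C"], effect_design):
    print(level, row)
```

The only change from dummy coding is the row of -1 values for the reference level. That row is what shifts the interpretation of each parameter from a comparison with the reference level to a comparison with the average effect of all levels.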
You should avoid specifying variables in the CLASS statement that are not used in subsequent statements to define the model. Generally, observations that have missing values in any of the CLASS variables are omitted from the analysis. Therefore, if a variable appears in the CLASS statement but is unused in the model, any observation that has a missing value for this variable is ignored. So, specifying unused variables in the CLASS statement can result in more observations being omitted from the model fit than necessary.
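The effect on the number of usable observations can be sketched as follows. This Python example only mimics the general rule stated above (an observation is dropped if any CLASS or model variable is missing); the function `usable_rows`, the variable names `y`, `dose`, and `site`, and the data are all hypothetical.

```python
def usable_rows(rows, class_vars, model_vars):
    """Keep observations with no missing value in any listed variable.

    Assumption: an observation is dropped if any variable named in the
    CLASS list or used in the model is missing (None here).
    """
    needed = set(class_vars) | set(model_vars)
    return [r for r in rows if all(r[name] is not None for name in needed)]


# Hypothetical data: 'site' is never used in the model.
data = [
    {"y": 1, "dose": "low", "site": "north"},
    {"y": 0, "dose": "high", "site": None},
    {"y": 1, "dose": "low", "site": None},
    {"y": 0, "dose": None, "site": "south"},
]

with_unused = usable_rows(data, class_vars=["dose", "site"], model_vars=["y", "dose"])
without_unused = usable_rows(data, class_vars=["dose"], model_vars=["y", "dose"])

print("rows used when 'site' is listed as a CLASS variable:", len(with_unused))
print("rows used when 'site' is left out:", len(without_unused))
```

Listing the unused variable `site` drops two observations that the model could otherwise have used.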
For additional information, see the description of the CLASS statement in the chapter of the SAS/STAT® User's Guide for the procedure that you are using.
Product Family | Product | System | SAS Release (Reported / Fixed*) |
SAS System | SAS/STAT | All | n/a |
Type: | Usage Note |
Priority: | low |
Topic: | Analytics ==> Survey Sampling and Analysis; SAS Reference ==> Procedures ==> GLM; SAS Reference ==> Procedures ==> GLIMMIX; SAS Reference ==> Procedures ==> QUANTREG; Analytics ==> Regression; SAS Reference ==> Procedures ==> SURVEYLOGISTIC; SAS Reference ==> Procedures ==> SURVEYREG; SAS Reference ==> Procedures ==> PROBIT; SAS Reference ==> Procedures ==> PLS; SAS Reference ==> Procedures ==> MIXED; SAS Reference ==> Procedures ==> ORTHOREG; SAS Reference ==> Procedures ==> LOGISTIC; SAS Reference ==> Procedures ==> LIFEREG; SAS Reference ==> Procedures ==> ANOVA; Analytics ==> Power and Sample Size; Analytics ==> Multivariate Analysis; SAS Reference ==> Procedures ==> GAM; SAS Reference ==> Procedures ==> GENMOD; Analytics ==> Longitudinal Analysis; Analytics ==> Categorical Data Analysis; Analytics ==> Analysis of Variance; SAS Reference ==> Procedures ==> GLMMOD; SAS Reference ==> Procedures ==> GLMPOWER; Analytics ==> Survival Analysis; SAS Reference ==> Procedures ==> PHREG; SAS Reference ==> Procedures ==> SURVEYPHREG |
Date Modified: | 2019-05-06 16:05:46 |
Date Created: | 2003-03-19 10:25:44 |