The LOGISTIC Procedure

CONTRAST Statement

CONTRAST 'label' row-description<, …, row-description> </ options>;

where a row-description is defined as follows:

effect values<, …, effect values>

The CONTRAST statement provides a mechanism for obtaining customized hypothesis tests. It is similar to the CONTRAST and ESTIMATE statements in other modeling procedures.

The CONTRAST statement enables you to specify a matrix, $\bL $, for testing the hypothesis $\bL \bbeta = \bm {0}$, where $\bbeta $ is the vector of intercept and slope parameters. You must be familiar with the details of the model parameterization that PROC LOGISTIC uses (for more information, see the PARAM= option in the section CLASS Statement). Optionally, the CONTRAST statement enables you to estimate each row, $\bm {l}_ i^{\prime }\bbeta $, of $\bL \bbeta $ and test the hypothesis $\bm {l}_ i^{\prime }\bbeta = 0$. Computed statistics are based on the asymptotic chi-square distribution of the Wald statistic.

There is no limit to the number of CONTRAST statements that you can specify, but they must appear after the MODEL statement.

The following parameters are specified in the CONTRAST statement:

label

identifies the contrast in the displayed output. A label is required for every contrast specified, and it must be enclosed in quotation marks.

effect

identifies an effect that appears in the MODEL statement. The name INTERCEPT can be used as an effect when one or more intercepts are included in the model. You do not need to include all effects that are included in the MODEL statement.

values

are constants that are elements of the $\bL $ matrix associated with the effect. To correctly specify your contrast, it is crucial to know the ordering of parameters within each effect and the variable levels associated with any parameter. The "Class Level Information" table shows the ordering of levels within variables. The E option, described later in this section, enables you to verify the proper correspondence of values to parameters. If too many values are specified for an effect, the extra ones are ignored. If too few values are specified, the remaining ones are set to 0.

Multiple degree-of-freedom hypotheses can be tested by specifying multiple row-descriptions; the rows of $\bL $ are specified in order and are separated by commas. The number of degrees of freedom is the number of linearly independent constraints implied by the CONTRAST statement, that is, the rank of $\bL $.
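The following sketch illustrates single-row and multiple-row contrasts on continuous effects. The data set work.sim, the variables x1, x2, and y, and the coefficients used to simulate the response are all hypothetical and are chosen only for illustration.

```sas
/* Hypothetical data: two continuous predictors and a binary response */
data work.sim;
   call streaminit(1);
   do i = 1 to 200;
      x1  = rand('normal');
      x2  = rand('normal');
      eta = -0.5 + 0.8*x1 + 0.3*x2;   /* assumed linear predictor */
      p   = 1 / (1 + exp(-eta));      /* inverse logit */
      y   = rand('bernoulli', p);
      output;
   end;
   drop i eta p;
run;

proc logistic data=work.sim;
   model y(event='1') = x1 x2;
   /* Two rows of L, separated by a comma: joint test that both slopes are 0 */
   contrast 'x1 = x2 = 0' x1 1,
                          x2 1 / e;
   /* One row of L: test that the two slopes are equal */
   contrast 'x1 = x2' x1 1 x2 -1 / e;
run;
```

The first contrast has two linearly independent rows and therefore 2 degrees of freedom; the second has a single row and 1 degree of freedom. The E option is included so that the displayed coefficients can be compared with the intended rows.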

More details for specifying contrasts involving effects with full-rank parameterizations are given in the section Full-Rank Parameterized Effects, while details for less-than-full-rank parameterized effects are given in the section Less-Than-Full-Rank Parameterized Effects.

You can specify the following options after a slash (/):

ALPHA=number

specifies the level of significance $\alpha $ for the $100(1-\alpha )$% confidence interval for each contrast when the ESTIMATE option is specified. The value of number must be between 0 and 1. By default, number is equal to the value of the ALPHA= option in the PROC LOGISTIC statement, or 0.05 if that option is not specified.

E

displays the $\bL $ matrix.

ESTIMATE=keyword

estimates and tests each individual contrast (that is, each row, $\bm {l}_ i^{\prime }\bbeta $, of $\bL \bbeta $), exponentiated contrast (${e}^{\bm {l}_ i^{\prime }\bbeta }$), or predicted probability for the contrast ($g^{-1}(\bm {l}_ i^{\prime }\bbeta )$). PROC LOGISTIC displays the point estimate, its standard error, a Wald confidence interval, and a Wald chi-square test. The significance level of the confidence interval is controlled by the ALPHA= option. You can estimate the individual contrast, the exponentiated contrast, or the predicted probability for the contrast by specifying one of the following keywords:

PARM

estimates the individual contrast.

EXP

estimates the exponentiated contrast.

BOTH

estimates both the individual contrast and the exponentiated contrast.

PROB

estimates the predicted probability for the contrast.

ALL

estimates the individual contrast, the exponentiated contrast, and the predicted probability for the contrast.

For more information about the computations of the standard errors and confidence limits, see the section Linear Predictor, Predicted Probability, and Confidence Limits.
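As a minimal sketch of the ESTIMATE= and ALPHA= options, the following program uses reference coding for a two-level treatment indicator. The data set work.trial, the variables trt and y, and the event probabilities are hypothetical.

```sas
/* Hypothetical data: binary treatment indicator and binary response */
data work.trial;
   call streaminit(11);
   do trt = 0 to 1;
      do rep = 1 to 100;
         y = rand('bernoulli', 0.3 + 0.2*trt);   /* assumed event probabilities */
         output;
      end;
   end;
   drop rep;
run;

proc logistic data=work.trial;
   class trt(ref='0') / param=ref;    /* one parameter: trt=1 versus trt=0 */
   model y(event='1') = trt;
   /* Log odds ratio and its exponential (the odds ratio), with 90% limits */
   contrast 'trt 1 vs. 0' trt 1 / estimate=both alpha=0.1;
   /* Intercept + trt parameter is the linear predictor for trt=1;
      PROB applies the inverse link to give a predicted probability */
   contrast 'P(event) at trt=1' intercept 1 trt 1 / estimate=prob;
run;
```

Under reference coding, the single trt parameter is the log odds ratio for trt=1 versus trt=0, so ESTIMATE=BOTH reports it on both the log-odds and odds-ratio scales. The second contrast includes INTERCEPT because a predicted probability requires the full linear predictor.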

SINGULAR=number

tunes the estimability check. This option is ignored when a full-rank parameterization is specified. If $\mb{v}$ is a vector, define $\mbox{ABS}(\mb{v})$ to be the largest absolute value of the elements of $\mb{v}$. For a row vector $\bm {l}^{\prime }$ of the contrast matrix $\bL $, define $c=\mbox{ABS}(\bm {l})$ if $\mbox{ABS}(\bm {l})$ is greater than 0; otherwise, $c=1$. If $\mbox{ABS}(\bm {l}^{\prime } - \bm {l}^{\prime }\bT )$ is greater than $c \times \mbox{number}$, then $\bm {l}$ is declared nonestimable. The $\bT $ matrix is the Hermite form matrix $\bI _0^{-}\bI _0$, where $\bI _0^{-}$ represents a generalized inverse of the (observed or expected) information matrix $\bI _0$ of the null model. The value for number must be between 0 and 1; the default value is 1E-4.
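The estimability rule described for the SINGULAR= option can be restated compactly as follows. This is only a rewriting of the definitions above, using the same symbols; the cases environment assumes that the amsmath package is available.

```latex
\[ c = \begin{cases} \mbox{ABS}(\bm {l}) & \mbox{if } \mbox{ABS}(\bm {l}) > 0 \\ 1 & \mbox{otherwise} \end{cases} \]

\[ \bm {l} \mbox{ is declared nonestimable if } \mbox{ABS}(\bm {l}^{\prime } - \bm {l}^{\prime }\bT ) > c \times \mbox{number} \]
```

Because the threshold is proportional to number, a larger value of number makes the check more lenient, and a smaller value makes it stricter.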

Full-Rank Parameterized Effects

If an effect involving a CLASS variable with a full-rank parameterization does not appear in the CONTRAST statement, then all of its coefficients in the $\bL $ matrix are set to 0.

If you use effect coding by default or by specifying PARAM=EFFECT in the CLASS statement, then all parameters are directly estimable and involve no other parameters. For example, suppose an effect-coded CLASS variable A has four levels. Then there are three parameters ($\beta _1, \beta _2, \beta _3$) representing the first three levels, and the effect of the fourth level is represented by

\[ -\beta _1 - \beta _2 - \beta _3 \]

To test the first versus the fourth level of A, you would test

\[ \beta _1 = - \beta _1 - \beta _2 - \beta _3 \]

or, equivalently,

\[ 2\beta _1 + \beta _2 + \beta _3 = 0 \]

which, in the form $\bL \bbeta = 0$, is

\[ \left[ \begin{array}{ccc} 2 & 1 & 1 \end{array} \right] \left[ \begin{array}{c} \beta _1 \\ \beta _2 \\ \beta _3 \end{array} \right] = 0 \]

Therefore, you would use the following CONTRAST statement:

contrast '1 vs. 4' A 2 1 1;
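A minimal sketch of this contrast in a complete program follows. The data set work.demo, the response y, and the simulated event probabilities are hypothetical; only the CLASS and CONTRAST statements correspond to the discussion above.

```sas
/* Hypothetical data: four-level factor A and a binary response y */
data work.demo;
   call streaminit(2024);
   do A = 1 to 4;
      do rep = 1 to 50;
         y = rand('bernoulli', 0.2 + 0.1*A);   /* assumed event probabilities */
         output;
      end;
   end;
   drop rep;
run;

proc logistic data=work.demo;
   class A / param=effect;     /* effect coding; by default the last level (4) is the reference */
   model y(event='1') = A;
   /* Row of L for the A parameters is (2 1 1); E displays the full row for checking */
   contrast '1 vs. 4' A 2 1 1 / e estimate=parm;
run;
```

The E option is included so that the coefficients PROC LOGISTIC actually uses can be compared with the intended row (2 1 1) for the three A parameters.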

To contrast the third level with the average of the first two levels, you would test

\[ \frac{\beta _1 + \beta _2}{2} = \beta _3 \]

or, equivalently,

\[ \beta _1 + \beta _2 - 2\beta _3 = 0 \]

Therefore, you would use the following CONTRAST statement:

contrast '1&2 vs. 3' A 1 1 -2;

Other CONTRAST statements are constructed similarly. For example:

contrast '1 vs. 2    '  A  1 -1  0;
contrast '1&2 vs. 4  '  A  3  3  2;
contrast '1&2 vs. 3&4'  A  2  2  0;
contrast 'Main Effect'  A  1  0  0,
                        A  0  1  0,
                        A  0  0  1;
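As a check on the coefficients in the '1&2 vs. 4' and '1&2 vs. 3&4' statements, the same substitution of $-\beta _1 - \beta _2 - \beta _3$ for the fourth level can be worked through, assuming the effect-coded four-level variable A from the preceding examples:

```latex
\[ \frac{\beta _1 + \beta _2}{2} = -\beta _1 - \beta _2 - \beta _3 \quad \Longrightarrow \quad 3\beta _1 + 3\beta _2 + 2\beta _3 = 0 \]

\[ \frac{\beta _1 + \beta _2}{2} = \frac{\beta _3 + (-\beta _1 - \beta _2 - \beta _3)}{2} \quad \Longrightarrow \quad 2\beta _1 + 2\beta _2 = 0 \]
```

The 'Main Effect' contrast has three linearly independent rows, so it tests $\beta _1 = \beta _2 = \beta _3 = 0$ with 3 degrees of freedom.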

Less-Than-Full-Rank Parameterized Effects

When you use the less-than-full-rank parameterization (by specifying PARAM=GLM in the CLASS statement), each row is checked for estimability; see the section Estimable Functions in Chapter 3: Introduction to Statistical Modeling with SAS/STAT Software, for more information. If PROC LOGISTIC finds a contrast to be nonestimable, it displays missing values in corresponding rows in the results.

PROC LOGISTIC handles missing level combinations of classification variables in the same manner as PROC GLM: parameters corresponding to missing level combinations are not included in the model. This convention can affect the way in which you specify the $\bL $ matrix in your CONTRAST statement.

If the elements of $\bL $ are not specified for an effect that contains a specified effect, then the elements of the specified effect are distributed over the levels of the higher-order effect just as the GLM procedure does for its CONTRAST and ESTIMATE statements. For example, suppose that the model contains effects A and B and their interaction A*B. If you specify a CONTRAST statement involving A alone, the $\bL $ matrix contains nonzero terms for both A and A*B, because A*B contains A. For more information, see rule 4 in the section Construction of Least Squares Means in Chapter 46: The GLM Procedure.
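The distribution of coefficients over a containing effect can be inspected with the E option. The following is a minimal sketch; the data set work.twoway, the two-level factors A and B, and the simulated event probabilities are hypothetical.

```sas
/* Hypothetical data: two two-level factors and a binary response */
data work.twoway;
   call streaminit(7);
   do A = 1 to 2;
      do B = 1 to 2;
         do rep = 1 to 40;
            y = rand('bernoulli', 0.3 + 0.1*A + 0.05*B);   /* assumed event probabilities */
            output;
         end;
      end;
   end;
   drop rep;
run;

proc logistic data=work.twoway;
   class A B / param=glm;           /* less-than-full-rank parameterization */
   model y(event='1') = A B A*B;
   /* Only A is specified; E shows how coefficients are filled in for A*B */
   contrast 'A1 vs. A2' A 1 -1 / e;
run;
```

With the E option, the displayed $\bL $ matrix should contain nonzero coefficients for the A*B parameters as well as for A, even though the CONTRAST statement names only A.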