Usage Note 22593: Signs of the logistic or probit model parameter estimates seem backward

In a logistic or probit model for a binary response, the signs of the parameter estimates might seem to be backward if the procedure models the probability of a response level other than the one you intended. The LOGISTIC, PROBIT, GENMOD, GLIMMIX, GAM, and GAMPL procedures identify the level they model (for binary responses), the order of levels (for ordinal responses), or the reference level (for nominal responses) in a NOTE in the SAS log and below the Response Profile table in the displayed results. For binary and ordinal response models, switching the modeled level or reversing the level ordering switches the signs of the parameter estimates. It also inverts the odds ratio estimates in a logistic model, as described in this note.

You can specify options following your response variable in the MODEL statement to explicitly set the modeled response level, order of levels, or reference level. You should always use these options to ensure that the procedure fits the model you want. In the following, these options are discussed using PROC LOGISTIC, but they can be used in the same way in the other procedures mentioned above.
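
The sign switch described above can be seen by fitting the same binary model twice, once for each response level. The following is only a sketch: the data set SIM, the seed, and the coefficients used to simulate the data are assumptions made for illustration, and it uses the EVENT= response option that is discussed in the next section.

```sas
/* Hypothetical simulated data: binary Y whose probability of 1 rises with X */
data sim;
   call streaminit(22593);                 /* arbitrary seed for reproducibility */
   do i = 1 to 500;
      x = rand('normal');
      p = 1 / (1 + exp(-(-0.5 + 1.0*x)));  /* assumed true model for Pr(Y=1) */
      y = rand('bernoulli', p);
      output;
   end;
run;

/* Fit 1: model the probability that Y=1 */
proc logistic data=sim;
   model y(event='1') = x;
run;

/* Fit 2: model the probability that Y=0 */
proc logistic data=sim;
   model y(event='0') = x;
run;
```

The two Analysis of Maximum Likelihood Estimates tables should show estimates of equal magnitude and opposite sign, and the odds ratio for X in one fit should be the reciprocal of the odds ratio in the other.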

Binary response models
You should explicitly specify the response level you want to model by using the EVENT= response option. For example, if your response variable, Y, has values 0 and 1 in your data and you want to model the probability of level 1, then specify:
     proc logistic;
        model y(event='1') = <your model effects>;
        run;
If your response variable has a format associated with it, specify the formatted value of the desired level in the EVENT= option.
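
As a minimal sketch of the formatted case, assume a hypothetical format YNFMT that displays 0 as "No" and 1 as "Yes", and placeholder names MYDATA, Y, and X. The EVENT= option then takes the formatted value rather than the stored value:

```sas
/* Hypothetical format for the 0/1 response */
proc format;
   value ynfmt 0='No' 1='Yes';
run;

/* mydata, y, and x are placeholder names */
proc logistic data=mydata;
   format y ynfmt.;
   model y(event='Yes') = x;      /* formatted value of the level to model */
run;
```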

Ordered multinomial (ordinal) response models
Correct results depend on preserving the natural ordering of your response variable's levels, either monotonically increasing or monotonically decreasing. Use the DESCENDING response option, the ORDER= response option, or both to set the response level ordering. Be sure to examine the Response Profile table to verify that the order of the response levels shown is strictly increasing or decreasing. In the Response Profile table, the response levels are associated with Ordered Values 1, 2, 3, ... . PROC LOGISTIC always models the probabilities of response levels having lower Ordered Values.
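
One way to check the ordering before interpreting any estimates is to display only the Response Profile table. This is a sketch that assumes placeholder names MYDATA, Y, and X; ResponseProfile is the ODS name of that table in PROC LOGISTIC.

```sas
/* mydata, y, and x are placeholder names */
ods select ResponseProfile;      /* display only the Response Profile table */
proc logistic data=mydata;
   model y = x;                  /* default ordering of the response levels */
run;
```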

For example, suppose response Y has values "lo ", "med", "hi ". By default, LOGISTIC sorts the levels in ascending order (alphabetical order for these character values) and assigns Ordered Value 1 to the lowest level, Ordered Value 2 to the next lowest level, and so on, resulting in the nonmonotonic ordering hi, lo, med shown in the results of the following step:

     proc logistic;
        model y = x;
        run;

The parameter estimates from this analysis are meaningless since the natural order of the response was not preserved.

Response Profile

Ordered Value   y     Total Frequency
1               hi    23
2               lo    25
3               med   25

Probabilities modeled are cumulated over the lower Ordered Values.

Analysis of Maximum Likelihood Estimates

Parameter       DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept hi    1    -0.5980    0.6987           0.7325            0.3921
Intercept lo    1     0.8320    0.7023           1.4035            0.2361
x               1    -0.1170    0.4307           0.0738            0.7858

If the levels appear in proper ascending or descending order in your input data set (for example, if "lo " appears before "med" and "med" before "hi "), then you can use the ORDER=DATA option to establish a proper order:

     proc logistic;
        model y(order=data) = x;
        run;

In this case, LOGISTIC models the probabilities toward the "lo " end of the scale since "lo " is assigned the lowest Ordered Value. The positive parameter estimate for X (0.2441) means that the probability of lower Y values increases as X increases.

Response Profile

Ordered Value   y     Total Frequency
1               lo    25
2               med   25
3               hi    23

Probabilities modeled are cumulated over the lower Ordered Values.

Analysis of Maximum Likelihood Estimates

Parameter       DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept lo    1    -1.0260    0.7060           2.1120            0.1461
Intercept med   1     0.4080    0.6965           0.3431            0.5581
x               1     0.2441    0.4313           0.3204            0.5714
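
ORDER=DATA works only when the levels happen to appear in the desired order in the input data. When they do not, one possible alternative is to recode the response as a numeric variable and order by its internal values with ORDER=INTERNAL. This is a sketch under assumptions: the data set name MYDATA, the new variable YNUM, and the format LVLFMT are hypothetical and are not part of the example above.

```sas
/* Hypothetical format that labels the numeric codes */
proc format;
   value lvlfmt 1='lo' 2='med' 3='hi';
run;

/* mydata is a placeholder for the input data set containing Y and X */
data coded;
   set mydata;
   select (strip(y));
      when ('lo')  ynum = 1;
      when ('med') ynum = 2;
      when ('hi')  ynum = 3;
      otherwise    ynum = .;     /* unexpected values become missing */
   end;
   format ynum lvlfmt.;
run;

/* Order the levels by the unformatted codes 1, 2, 3 */
proc logistic data=coded;
   model ynum(order=internal) = x;
run;
```

With this coding, "lo" should receive Ordered Value 1 regardless of where it first appears in the data, so the probabilities toward the "lo" end of the scale are modeled.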

By including the DESCENDING response option, the ordering of the levels is reversed and the "hi " end is modeled:

     proc logistic;
        model y(order=data descending) = x;
        run;

Notice that the sign of the X parameter estimate has now changed. The negative estimate (-0.2441) means that the probability of higher Y values decreases as X increases. This is consistent with the analysis without the DESCENDING option. Only the focus changes, from the probability of lower levels of Y to the probability of higher levels.

The odds ratio (not shown) also changes to be consistent. Without the DESCENDING option, the odds ratio for X is exp(0.2441) = 1.276. With the DESCENDING option, the odds ratio for X is exp(-0.2441) = 1/exp(0.2441) = 0.783.
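
The reciprocal relationship can be checked with a short DATA step. This is only an arithmetic sketch: the value 0.2441 is the rounded estimate for X reported in this note, and the variable names are made up for illustration.

```sas
data _null_;
   b = 0.2441;                  /* rounded estimate for X from this note */
   or_lower  = exp(b);          /* odds ratio when lower Y levels are modeled */
   or_higher = exp(-b);         /* odds ratio when higher Y levels are modeled */
   product   = or_lower * or_higher;   /* reciprocals multiply to 1 */
   put or_lower= 6.3 or_higher= 6.3 product= 6.3;
run;
```

The log should show odds ratios of roughly 1.28 and 0.78 whose product is 1.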

Response Profile

Ordered Value   y     Total Frequency
1               hi    23
2               med   25
3               lo    25

Probabilities modeled are cumulated over the lower Ordered Values.

Analysis of Maximum Likelihood Estimates

Parameter       DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept hi    1    -0.4080    0.6965           0.3431            0.5581
Intercept med   1     1.0260    0.7060           2.1120            0.1461
x               1    -0.2441    0.4313           0.3204            0.5714

Unordered multinomial (nominal) response models
When you specify the LINK=GLOGIT option in the MODEL statement (available in the LOGISTIC and GLIMMIX procedures), the procedure fits a generalized logit model, which is appropriate for a response with unordered levels. Always use the REF= response option to identify the response level that you want to be the reference level. For example, if nominal response Y has levels 1, 2, and 3 and level 1 is considered the reference level, then specify:
     proc logistic;
        model y(ref='1') = <your model effects> / link=glogit;
        run;
This causes the following generalized logits to be modeled, where pj is the probability of response level j:
log(p2/p1)
log(p3/p1)

Note that reference level 1 appears in the denominator of both logits.
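
To illustrate how the choice of reference level changes the logits, the following sketch uses level 3 as the reference instead. The data set name MYDATA and the single predictor X are placeholders.

```sas
/* mydata and x are placeholder names; Y has levels 1, 2, 3 */
proc logistic data=mydata;
   model y(ref='3') = x / link=glogit;   /* level 3 as the reference */
run;
```

This fit models log(p1/p3) and log(p2/p3). Because log(p1/p3) = -log(p3/p1), the estimates for that logit should be the negatives of the log(p3/p1) estimates from the fit that uses level 1 as the reference, and log(p2/p3) = log(p2/p1) - log(p3/p1).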



Operating System and Release Information

Product Family   Product    System   SAS Release Reported   SAS Release Fixed*
SAS System       SAS/STAT   All      n/a

* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.