The LOGISTIC Procedure

Model Fitting Information

For the jth observation, let ${\widehat{\pi }}_ j$ be the estimated probability of the observed response. The three criteria displayed by the LOGISTIC procedure are calculated as follows:

–2 log likelihood:

$-2\mbox{ Log L}=-2\sum _ j \frac{w_ j}{\sigma ^2} f_ j\log ({\widehat{\pi }}_ j)$

where $w_ j$ and $f_ j$ are the weight and frequency values of the jth observation, and $\sigma ^2$ is the dispersion parameter, which equals 1 unless the SCALE= option is specified. For binary response models that use events/trials MODEL statement syntax, this is

$-2\mbox{ Log L}=-2\sum _ j \frac{w_ j}{\sigma ^2} f_ j [ \log {{n_ j}\choose {r_ j}} + r_ j \log ({\widehat{\pi }}_ j) + (n_ j-r_ j)\log (1-{\widehat{\pi }}_ j) ]$

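where $r_ j$ is the number of events, $n_ j$ is the number of trials, and ${\widehat{\pi }}_ j$ is the estimated event probability. The statistic is reported both with and without the constant term, that is, the part contributed by the binomial coefficients $\log {{n_ j}\choose {r_ j}}$, which does not depend on the model parameters.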
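As an illustration of the two formulas above, the following Python sketch evaluates –2 Log L from already-estimated probabilities. It is not the procedure's own code: the function names, the argument names, and the `include_constant` switch are hypothetical, and every estimated probability is assumed to lie strictly between 0 and 1.

```python
import math

def neg2_log_l_single(pi_hat, weights=None, freqs=None, sigma2=1.0):
    """-2 Log L for single-trial data.

    pi_hat[j] is the estimated probability of the response that was
    actually observed for observation j.
    """
    n = len(pi_hat)
    weights = weights if weights is not None else [1.0] * n
    freqs = freqs if freqs is not None else [1.0] * n
    total = sum((w / sigma2) * f * math.log(p)
                for w, f, p in zip(weights, freqs, pi_hat))
    return -2.0 * total

def neg2_log_l_events_trials(events, trials, pi_hat, weights=None,
                             freqs=None, sigma2=1.0, include_constant=True):
    """-2 Log L for events/trials data; pi_hat[j] is the event probability.

    include_constant (hypothetical switch) controls whether the
    log binomial coefficient term is included.
    """
    n = len(events)
    weights = weights if weights is not None else [1.0] * n
    freqs = freqs if freqs is not None else [1.0] * n
    total = 0.0
    for r, m, p, w, f in zip(events, trials, pi_hat, weights, freqs):
        term = r * math.log(p) + (m - r) * math.log(1.0 - p)
        if include_constant:
            term += math.log(math.comb(m, r))
        total += (w / sigma2) * f * term
    return -2.0 * total
```

Calling the second function once with `include_constant=True` and once with `include_constant=False` gives the two versions of the statistic; their difference depends only on the data, not on the fitted probabilities.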

Akaike’s information criterion:

$\mbox{AIC}=-2\mbox{ Log L}+2p$

where p is the number of parameters in the model. For cumulative response models, $p = k+s$, where k is the total number of response levels minus one and s is the number of explanatory effects. For the generalized logit model, $p = k(s+1)$.
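The parameter counts and the AIC formula above can be sketched in a few lines of Python. The function names and the `model` labels (`"cumulative"`, `"glogit"`) are hypothetical, chosen only for this illustration; `s` is taken exactly as defined in the text.

```python
def n_parameters(response_levels, s, model="cumulative"):
    """Number of parameters p, following the counts given in the text.

    response_levels: total number of response levels
    s: number of explanatory effects, as defined in the text
    model: "cumulative" or "glogit" (hypothetical labels)
    """
    k = response_levels - 1
    if model == "cumulative":
        return k + s          # k intercepts plus s shared slopes
    if model == "glogit":
        return k * (s + 1)    # one intercept and s slopes per logit
    raise ValueError("model must be 'cumulative' or 'glogit'")

def aic(neg2_log_l, p):
    """AIC = -2 Log L + 2p."""
    return neg2_log_l + 2 * p
```

For a three-level response with two explanatory effects, this gives p = 4 for the cumulative model and p = 6 for the generalized logit model, so the generalized logit model carries a larger AIC penalty at the same –2 Log L.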

Schwarz (Bayesian information) criterion:

$\mbox{SC}=-2\mbox{ Log L}+p\log (\sum _ jf_ jn_ j)$

where p is the number of parameters in the model, $n_ j$ is the number of trials when events/trials syntax is specified, and $n_ j=1$ with single-trial syntax.
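A minimal Python sketch of the SC formula follows, assuming –2 Log L and p have already been obtained. The function and argument names are hypothetical; omitting `trials` corresponds to single-trial syntax, where every $n_ j=1$.

```python
import math

def schwarz_criterion(neg2_log_l, p, freqs, trials=None):
    """SC = -2 Log L + p * log(sum_j f_j * n_j).

    freqs:  frequency value f_j of each observation
    trials: number of trials n_j per observation; None means n_j = 1
    """
    if trials is None:
        trials = [1] * len(freqs)
    effective_n = sum(f * n for f, n in zip(freqs, trials))
    return neg2_log_l + p * math.log(effective_n)
```

Because the penalty grows with the logarithm of $\sum _ jf_ jn_ j$, two rows of 5 trials each are penalized the same as 10 single-trial rows with unit frequencies.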

The AIC and SC statistics give two different ways of adjusting the –2 Log L statistic for the number of terms in the model and the number of observations used. These statistics can be used when comparing different models for the same data (for example, when you use the SELECTION= STEPWISE option in the MODEL statement). The models being compared do not have to be nested; lower values of the statistics indicate a more desirable model.

The difference in the –2 Log L statistics between the intercepts-only model and the specified model asymptotically has a chi-square distribution with $p-k$ degrees of freedom under the null hypothesis that all the explanatory effects in the model are zero, where p is the number of parameters in the specified model and k is the number of intercepts. The likelihood ratio test in the "Testing Global Null Hypothesis: BETA=0" table displays this difference and its associated p-value. The score and Wald tests in that table test the same hypothesis and are asymptotically equivalent; for more information, see the sections Residual Chi-Square and Testing Linear Hypotheses about the Regression Coefficients.
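The likelihood ratio test described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure's implementation: the function names are hypothetical, and the chi-square tail probability is computed with a standard recurrence for the regularized upper incomplete gamma function that handles integer degrees of freedom only.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Starting values of Q(a, h), the regularized upper incomplete gamma:
    # Q(1, h) = exp(-h) for even df, Q(1/2, h) = erfc(sqrt(h)) for odd df.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)

def likelihood_ratio_test(neg2_log_l_null, neg2_log_l_full, p, k):
    """Return (statistic, degrees of freedom, p-value).

    neg2_log_l_null: -2 Log L of the intercepts-only model
    neg2_log_l_full: -2 Log L of the specified model
    p: parameters in the specified model; k: number of intercepts
    """
    stat = neg2_log_l_null - neg2_log_l_full
    df = p - k
    return stat, df, chi2_sf(stat, df)
```

For example, `likelihood_ratio_test(110.0, 104.0, p=3, k=1)` tests a drop of 6.0 in –2 Log L against a chi-square distribution with 2 degrees of freedom.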