The LOGISTIC Procedure

Generalized Logit Model

The formulation of generalized logit models for nominal response variables can be found in Agresti (1990). Let Y be the response variable with categories 1, ... , r. Let x = (x_0, x_1, ... , x_p)' be a (p+1)-vector of covariates, with x_0\equiv 1. With category k chosen as the reference category, the jth logit is given by
\log\biggl[\frac{{\rm Pr}({\rm Y}=j|{x})}{{\rm Pr}({\rm Y}=k|{x})}\biggr]=x' {\beta}_j, \quad 1\le j \le r, \; j\neq k
where {\beta}_j is a (p+1) vector of the regression coefficients for the jth logit.

For a sample of n subjects \{(y_i,x_i), 1\le i \le n\}, the log likelihood for the generalized logit model is

l=l({\beta}_j, 1\le j \le r,j\neq k)=\sum_{i=1}^n \log({\rm Pr}({\rm Y}=y_i|{x}_i))
and the log likelihood is maximized with respect to {\beta}_j, 1\le j \le r,j\neq k.
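
The log likelihood above can be evaluated directly for a small sample. The following Python sketch is an illustration only and is not part of PROC LOGISTIC; the function name log_likelihood, the sample values, and the coefficient values are all hypothetical. It uses the fact that the logit of the reference category k is identically zero.

```python
import math

def log_likelihood(sample, betas, k):
    """Sum of log Pr(Y = y_i | x_i) under the generalized logit model.

    sample : list of (y, x) pairs, y in 1..r, x including the leading 1
    betas  : dict mapping each nonreference category j to its vector beta_j
    k      : reference category (its linear predictor is fixed at 0)
    """
    total = 0.0
    for y, x in sample:
        # Linear predictor x'beta_j for every nonreference category.
        etas = {j: sum(a * b for a, b in zip(x, beta)) for j, beta in betas.items()}
        etas[k] = 0.0
        log_denom = math.log(sum(math.exp(e) for e in etas.values()))
        total += etas[y] - log_denom
    return total

# Hypothetical data: three categories, category 3 as the reference.
sample = [(1, [1.0, 0.5]), (3, [1.0, -1.0]), (2, [1.0, 2.0])]
betas = {1: [0.2, -0.1], 2: [-0.3, 0.4]}
ll = log_likelihood(sample, betas, k=3)
```

With all coefficients set to zero, every category has probability 1/r, so the function returns -n log(r).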

Newton's Method of Parameter Estimation

For notational convenience, consider the last response level to be the reference level. The response probabilities \pi_1, ... ,\pi_{r} are given by
\pi_r \equiv {\rm Pr}({\rm Y}=r|{x}) = \frac{1}{1+\sum_{l=1}^{r-1} {\rm e}^{x' {\beta}_l}}
\pi_j \equiv {\rm Pr}({\rm Y}=j|{x}) = \pi_r {\rm e}^{x' {\beta}_j}, \quad 1 \le j \le r-1
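
As a numerical illustration of these response probabilities, the following Python sketch computes \pi_1, ... ,\pi_r with the last category as the reference. The function name glogit_probs and the example values are assumptions made for this illustration. Subtracting the largest linear predictor before exponentiating is a numerical safeguard that does not change the result.

```python
import math

def glogit_probs(x, betas):
    """Return [pi_1, ..., pi_r] with the last category as the reference.

    x     : covariate vector including the leading 1 (length p+1)
    betas : list of r-1 coefficient vectors, one per nonreference logit
    """
    # Linear predictors x'beta_j for the r-1 nonreference categories.
    etas = [sum(xi * bi for xi, bi in zip(x, beta)) for beta in betas]
    # The reference category has linear predictor 0; shift by the maximum.
    m = max(etas + [0.0])
    exps = [math.exp(e - m) for e in etas]
    ref = math.exp(-m)
    denom = ref + sum(exps)
    return [e / denom for e in exps] + [ref / denom]

# Hypothetical covariates and coefficients for r = 3 categories.
x = [1.0, 2.0]
betas = [[0.5, -0.25], [-1.0, 0.3]]
pi = glogit_probs(x, betas)
```

The returned probabilities sum to 1, and log(pi[j] / pi[-1]) recovers the jth linear predictor.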

For a single response vector y \equiv (y_1, ... , y_r)' with y_j = 1 if Y=j and 0 otherwise, the log likelihood is

l(\pi_1, ... ,\pi_r; y)=\sum_{j=1}^r y_j\log(\pi_j)

Note that \sum_{j=1}^r y_j=1 and \sum_{j=1}^r \pi_j=1. Treating \pi_r = 1-\sum_{j=1}^{r-1}\pi_j as a function of the remaining probabilities gives

\frac{\partial l(\pi_1, ... ,\pi_r; y)}{\partial \pi_j}=\frac{y_j}{\pi_j} - \frac{y_r}{\pi_r}, \quad 1\le j \le r-1

The first partial derivatives of \pi_1, ... ,\pi_{r} with respect to {\beta}_1, ... ,{\beta}_{r-1} are

\frac{\partial \pi_j}{\partial {\beta}_l} = \begin{cases} \pi_j (1-\pi_j) x, & j = l \\ -\pi_j \pi_l x, & j \neq l \end{cases} \quad 1\le j, l \le r-1
\frac{\partial \pi_r}{\partial {\beta}_l} = - \pi_r \pi_l x, \quad 1\le l \le r-1
Denote {\pi}=(\pi_1, ... ,\pi_{r-1})' and {\pi}^*=(\pi_1, ... ,\pi_{r})'. Then,
D\equiv \frac{\partial(\pi_1, ... ,\pi_r)}{\partial({\beta}_1, ... , {\beta}_{r-1})} = \begin{pmatrix} {\rm diag}({\pi}) - {\pi}{\pi}' \\ -\pi_r {\pi}' \end{pmatrix} \otimes x'

The estimating equations become

D'W(y - {\pi}^*)=0, \quad {\rm where} \quad W={\rm diag}(\pi_1^{-1}, ... ,\pi_r^{-1})
Let
H=D' W{D}=({\rm diag}({\pi}) - {\pi}{\pi}') \otimes x{x}'

g=D' W(y - {\pi}^*)=(y_1-\pi_1, ... ,y_{r-1}-\pi_{r-1})' \otimes {x}

Let {\beta}=({\beta}_1', ... ,{\beta}_{r-1}')'. The estimate of {\beta} is obtained iteratively as follows:

{\beta}^{l+1}={\beta}^l + \Bigl(\sum_i w_i H_i\Bigr)^{-1} \sum_i w_i g_i
where i indexes the observations and w_i is the product of the corresponding weight and frequency.
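
The iteration can be sketched in a few lines of Python. This is a minimal illustration of the update formula, not the PROC LOGISTIC implementation: it starts from zero, applies no step halving or ridging, and solves the linear system with a small hand-written Gaussian elimination. The function names and the frequency data are hypothetical. Responses are coded 0, ... , r-1 with the last code as the reference.

```python
import math

def probs(x, beta, r):
    """Nonreference probabilities pi_1..pi_{r-1}; beta is the stacked vector."""
    q = len(x)
    etas = [sum(x[a] * beta[j * q + a] for a in range(q)) for j in range(r - 1)]
    denom = 1.0 + sum(math.exp(e) for e in etas)
    return [math.exp(e) / denom for e in etas]

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[piv] = M[piv], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for k in range(c, n + 1):
                M[i][k] -= f * M[c][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][k] * z[k] for k in range(i + 1, n))) / M[i][i]
    return z

def newton_glogit(data, r, iters=25, tol=1e-10):
    """data: list of (x, y, w); y in 0..r-1 with y == r-1 the reference level."""
    q = len(data[0][0])
    m = (r - 1) * q
    beta = [0.0] * m
    for _ in range(iters):
        g = [0.0] * m
        H = [[0.0] * m for _ in range(m)]
        for x, y, w in data:
            pi = probs(x, beta, r)
            for j in range(r - 1):
                resid = (1.0 if y == j else 0.0) - pi[j]
                for a in range(q):
                    # g_i = (y - pi) (x) x, accumulated with weight w_i
                    g[j * q + a] += w * resid * x[a]
                    for l in range(r - 1):
                        # H_i = (diag(pi) - pi pi') (x) x x'
                        v = (pi[j] if j == l else 0.0) - pi[j] * pi[l]
                        for b in range(q):
                            H[j * q + a][l * q + b] += w * v * x[a] * x[b]
        step = solve(H, g)
        beta = [bi + si for bi, si in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Hypothetical frequency data: intercept plus one binary covariate, r = 3.
data = [
    ([1.0, 0.0], 0, 4), ([1.0, 0.0], 1, 3), ([1.0, 0.0], 2, 3),
    ([1.0, 1.0], 0, 2), ([1.0, 1.0], 1, 3), ([1.0, 1.0], 2, 5),
]
beta_hat = newton_glogit(data, r=3)
```

Because this example model is saturated, the fitted logits equal the observed log odds in each covariate group, which gives a simple check on the result.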

Confidence Limits for the Predicted Probabilities

By the delta method,
\sigma^2(\hat{\pi_j})=\biggl( \frac{\partial \pi_j} {\partial {\beta}} \biggr )' V(\hat{{\beta}}) \frac{\partial \pi_j}{\partial {\beta}}
where
\biggl(\frac{\partial \pi_j}{\partial {\beta}}\biggr)'=\biggl[\biggl(\frac{\partial \pi_j}{\partial {\beta}_1}\biggr)', ... , \biggl(\frac{\partial \pi_j}{\partial {\beta}_{r-1}}\biggr)'\biggr]

A 100(1-\alpha)% confidence interval for \pi_j is given by

\hat{\pi}_j \pm z_{1-\alpha/2} \hat{\sigma}(\hat{\pi_j})
where \hat{\sigma}(\hat{\pi_j}) is obtained by replacing {\beta} by \hat{{\beta}} in \sigma(\hat{\pi_j}).
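
The delta-method limits can be illustrated numerically. In the Python sketch below, the function names, the coefficient values, and the covariance matrix V are hypothetical stand-ins for fitted quantities; the gradient uses the partial derivatives of \pi_j given in the preceding section, with categories indexed from 0 and the last index as the reference.

```python
import math
from statistics import NormalDist

def probs_all(x, betas):
    """Return [pi_1, ..., pi_r] with the last category as the reference."""
    etas = [sum(a * b for a, b in zip(x, beta)) for beta in betas]
    denom = 1.0 + sum(math.exp(e) for e in etas)
    return [math.exp(e) / denom for e in etas] + [1.0 / denom]

def grad_pi(j, x, betas):
    """Gradient of pi_j with respect to the stacked vector (beta_1', ..., beta_{r-1}')'."""
    pi = probs_all(x, betas)
    grad = []
    for l in range(len(betas)):
        # pi_j (1 - pi_j) x when j == l, and -pi_j pi_l x otherwise
        coef = pi[j] * ((1.0 if j == l else 0.0) - pi[l])
        grad.extend(coef * a for a in x)
    return grad

def wald_limits(j, x, betas, V, alpha=0.05):
    """Limits pi_j -/+ z_{1-alpha/2} * sigma(pi_j) from the delta method."""
    pi_j = probs_all(x, betas)[j]
    d = grad_pi(j, x, betas)
    n = len(d)
    var = sum(d[a] * V[a][b] * d[b] for a in range(n) for b in range(n))
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half = z * math.sqrt(var)
    return pi_j - half, pi_j + half

# Hypothetical estimates for r = 3 and a made-up diagonal covariance matrix.
x = [1.0, 2.0]
betas = [[0.2, -0.1], [-0.3, 0.4]]
V = [[0.04, 0.0, 0.0, 0.0],
     [0.0, 0.01, 0.0, 0.0],
     [0.0, 0.0, 0.05, 0.0],
     [0.0, 0.0, 0.0, 0.02]]
limits = wald_limits(0, x, betas, V)
```

Limits of this symmetric form are centered at the estimated probability and are not constrained to lie between 0 and 1.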

No Intercept Model

When the NOINT option is specified with the LINK=GLOGIT option, all intercepts are suppressed. This differs from the cumulative model where only the first intercept is suppressed when the NOINT option is specified.

Fit Statistics

Suppose there are r response categories and s covariates (each dummy variable being counted as a separate covariate). The number of parameters estimated is p=(r-1)(1+s). Let L be the likelihood.

Akaike Information Criterion:

 AIC=-2\log(L) + 2p

Schwarz Criterion:

SC=-2\log(L) + p \log(\sum_j f_j)
where f_j is the frequency of the jth observation.
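
As a worked illustration of these formulas, the following Python sketch computes p, AIC, and SC. The function name and all input values (the log likelihood of -100 and the 150 unit frequencies) are made up for the example.

```python
import math

def fit_statistics(log_lik, r, s, freqs):
    """Return (p, AIC, SC) for a generalized logit model.

    log_lik : maximized log likelihood, log(L)
    r       : number of response categories
    s       : number of covariates (each dummy variable counted separately)
    freqs   : frequency f_j of each observation
    """
    p = (r - 1) * (1 + s)
    aic = -2.0 * log_lik + 2 * p
    sc = -2.0 * log_lik + p * math.log(sum(freqs))
    return p, aic, sc

# Hypothetical inputs: 3 response levels, 2 covariates, 150 observations.
p, aic, sc = fit_statistics(log_lik=-100.0, r=3, s=2, freqs=[1] * 150)
```

Here p = (3-1)(1+2) = 6, so AIC = 200 + 12 = 212 and SC = 200 + 6 log(150).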

Exact Conditional Analysis

When an EXACT statement and the LINK=GLOGIT option are specified, the generalized logit model is fit as described in Hirji (1992). If there are only two response levels, the binary logit model is fit instead. Hypothesis tests for each effect are computed across logit functions, but individual parameters are estimated for each logit function.

Association of Observed Values and Predicted Probabilities

When the LINK=GLOGIT option is specified, the "Association of Observed Values and Predicted Probabilities" table is suppressed unless there are only two response levels; in that case, the generalized logit model reduces to the binary logit model.

Printing and Outputting Parameter Estimates

Each logit function has a set of parameters for the intercept and covariates. Instead of printing and outputting the parameter estimates by logit function, PROC LOGISTIC presents all parameter estimates for the intercept first, followed by all estimates of the first covariate, etc.

Since each logit function contrasts a nonreference response category with the reference category, the "Analysis of Maximum Likelihood Estimates" table includes the response variable column whose values are used to identify the corresponding logit function.

For the OUTEST= data set, names of parameters corresponding to the nonreference category 'xxx' contain _xxx as the suffix. For example, suppose the variable Net3 represents the television network viewed at a certain time, with values 'ABC', 'CBS', and 'NBC'. The following code fits a generalized logit model with Age and Gender (a CLASS variable with values Female and Male) as explanatory variables.

   proc logistic;
      class Gender;
      model Net3 = Age Gender / link=glogit;
   run;
Since 'NBC' is the last value in the sorted order of the response categories, it corresponds to the default reference category. There are two logit functions, one contrasting 'ABC' with 'NBC' and the other contrasting 'CBS' with 'NBC'. For each logit, there are three parameters: an intercept parameter, a slope parameter for Age, and a slope parameter for Gender (since there are only two gender levels and the EFFECT parameterization is used by default). The names of the parameters and their descriptions are as follows.

Parameter           Description
Intercept_ABC       intercept parameter for the logit contrasting 'ABC' with 'NBC'
Intercept_CBS       intercept parameter for the logit contrasting 'CBS' with 'NBC'
Age_ABC             Age parameter for the logit contrasting 'ABC' with 'NBC'
Age_CBS             Age parameter for the logit contrasting 'CBS' with 'NBC'
GenderFemale_ABC    Gender=Female parameter for the logit contrasting 'ABC' with 'NBC'
GenderFemale_CBS    Gender=Female parameter for the logit contrasting 'CBS' with 'NBC'
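
The naming and ordering convention in this example can be mimicked with a short Python sketch. It is only an illustration of the pattern described here (each effect in turn, with one name per nonreference category); the helper outest_names is hypothetical and is not part of SAS.

```python
def outest_names(effects, nonref_levels):
    """Build names of the form <effect>_<level>, grouped by effect."""
    return [f"{effect}_{level}" for effect in effects for level in nonref_levels]

# Effects and nonreference response levels from the Net3 example.
names = outest_names(["Intercept", "Age", "GenderFemale"], ["ABC", "CBS"])
```

The resulting list starts with both intercept names, followed by the two Age names and then the two GenderFemale names.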

OUT= Output Data Set

If any of the XBETA=, STDXBETA=, PREDICTED=, LOWER=, and UPPER= options are specified in the OUTPUT statement when there are more than two response categories, each input observation generates as many output observations as the number of response categories. The predicted probabilities and their confidence limits correspond to the probabilities of individual response categories rather than the cumulative probabilities as in the case of fitting a cumulative model. Regression diagnostics are suppressed when there are more than two response categories. You can specify PREDPROB=(I C) to obtain the predicted probabilities of individual response categories as well as the predicted cumulative probabilities.

Copyright © 2001 by SAS Institute Inc., Cary, NC, USA. All rights reserved.