Generalized Logit Model
Formulation of the generalized logit models for nominal response
variables can be found in Agresti (1990).
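Let Y be the response variable with categories 1, ... , r.
Let x = (x0, x1, ... , xp)' be a (p+1) vector of
covariates, with $x_0 \equiv 1$. By choosing k as the reference category,
the jth logit is given by
\[ \log \frac{\Pr(Y = j \mid x)}{\Pr(Y = k \mid x)} = \beta_j' x, \quad j \ne k \]
where $\beta_j$ is a (p+1) vector of the regression coefficients for the
jth logit. For a sample of n subjects $\{(y_i, x_i),\ i = 1, \ldots, n\}$, the
log likelihood for the generalized logit model is
\[ l = \sum_{i=1}^{n} \left[ \sum_{j \ne k} I(y_i = j)\, \beta_j' x_i - \log\left( 1 + \sum_{j \ne k} \exp(\beta_j' x_i) \right) \right] \]
and the log likelihood is maximized with respect to $\beta_j$, $j \ne k$.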
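To make the formulation concrete, the following Python sketch evaluates the response probabilities and the log likelihood implied by a set of generalized logits. The function names, the dictionary layout for the coefficients, and the numeric values are illustrative assumptions; this is not how PROC LOGISTIC is implemented.

```python
import math

def category_probs(betas, x, k):
    """Return Pr(Y = j | x) for every category j.

    betas: dict mapping each non-reference category j to its coefficient
           list (hypothetical layout); x: covariates with x[0] = 1;
    k: reference category, whose linear predictor is fixed at 0.
    """
    eta = {j: sum(b * v for b, v in zip(beta, x)) for j, beta in betas.items()}
    eta[k] = 0.0
    denom = sum(math.exp(e) for e in eta.values())
    return {j: math.exp(e) / denom for j, e in eta.items()}

def log_likelihood(betas, data, k):
    """Sum of log Pr(Y = y_i | x_i) over (y_i, x_i) pairs."""
    return sum(math.log(category_probs(betas, x, k)[y]) for y, x in data)

# Illustrative coefficients for r = 3 categories with reference k = 3.
betas = {1: [0.5, -1.0], 2: [-0.2, 0.3]}
x = [1.0, 2.0]
p = category_probs(betas, x, k=3)
logit_1 = math.log(p[1] / p[3])   # recovers 0.5 - 1.0 * 2.0
logit_2 = math.log(p[2] / p[3])   # recovers -0.2 + 0.3 * 2.0
ll = log_likelihood(betas, [(1, x), (3, x)], k=3)
```

Taking the log of each probability ratio against the reference category recovers the linear predictor of the corresponding logit, which is a quick way to check the parameterization.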
Newton's Method of Parameter Estimation
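For the convenience of notation, consider the last response level to be the
reference level. The response probabilities $\pi_1, \ldots, \pi_r$
are given by
\[ \pi_j = \frac{\exp(\beta_j' x)}{1 + \sum_{l=1}^{r-1} \exp(\beta_l' x)}, \quad 1 \le j \le r-1, \qquad \pi_r = \frac{1}{1 + \sum_{l=1}^{r-1} \exp(\beta_l' x)} \]
For a single response vector y with yj = 1 if Y=j and 0 otherwise, the log likelihood is
\[ l = \sum_{j=1}^{r} y_j \log \pi_j \]
Note that $y_r = 1 - \sum_{j=1}^{r-1} y_j$ and $\log(\pi_j / \pi_r) = \beta_j' x$, so that
\[ l = \sum_{j=1}^{r-1} y_j\, \beta_j' x - \log\left( 1 + \sum_{l=1}^{r-1} \exp(\beta_l' x) \right) \]
The first partial derivatives of $l$ with respect to $\beta_j$ are
\[ \frac{\partial l}{\partial \beta_j} = (y_j - \pi_j)\, x, \quad 1 \le j \le r-1 \]
The estimating equations become
\[ \sum_i w_i\, (y_{ij} - \pi_{ij})\, x_i = 0, \quad 1 \le j \le r-1 \]
Let $\beta = (\beta_1', \ldots, \beta_{r-1}')'$. The estimate of $\beta$ is
obtained iteratively as follows:
\[ \beta^{(m+1)} = \beta^{(m)} + \left[ \sum_i w_i\, (D_i \otimes x_i x_i') \right]^{-1} \sum_i w_i\, \left[ (y_i - \pi_i) \otimes x_i \right] \]
where i indexes the observations and wi is the product of the
corresponding weight and frequency. Here $y_i = (y_{i1}, \ldots, y_{i,r-1})'$,
$\pi_i = (\pi_{i1}, \ldots, \pi_{i,r-1})'$, and
$D_i = \mathrm{diag}(\pi_i) - \pi_i \pi_i'$, with $\pi_i$ evaluated at $\beta^{(m)}$.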
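The iteration can be sketched in a few lines of Python. This is a minimal illustration of a Newton step for the generalized logit likelihood with the last level as reference, written with the standard library only; the function names, the fixed iteration count, and the small linear solver are assumptions made for the example, and the sketch omits the convergence checks and step-halving a production routine would need.

```python
import math

def probs(betas, x):
    """Probabilities for levels 1..r; betas holds the r-1 non-reference vectors."""
    etas = [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    denom = 1.0 + sum(math.exp(e) for e in etas)
    return [math.exp(e) / denom for e in etas] + [1.0 / denom]

def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda row: abs(m[row][c]))
        m[c], m[piv] = m[piv], m[c]
        for row in range(c + 1, n):
            f = m[row][c] / m[c][c]
            for col in range(c, n + 1):
                m[row][col] -= f * m[c][col]
    z = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][col] * z[col] for col in range(row + 1, n))
        z[row] = (m[row][n] - tail) / m[row][row]
    return z

def newton_glogit(xs, ys, ws, r, iters=25):
    """xs: covariate vectors with x[0] = 1; ys: responses coded 1..r;
    ws: weight times frequency for each observation."""
    q = len(xs[0])
    d = (r - 1) * q
    beta = [0.0] * d
    for _ in range(iters):
        betas = [beta[j * q:(j + 1) * q] for j in range(r - 1)]
        grad = [0.0] * d
        info = [[0.0] * d for _ in range(d)]
        for x, y, w in zip(xs, ys, ws):
            pi = probs(betas, x)
            for j in range(r - 1):
                resid = (1.0 if y == j + 1 else 0.0) - pi[j]
                for a in range(q):
                    grad[j * q + a] += w * resid * x[a]
                    for l in range(r - 1):
                        d_jl = pi[j] * ((1.0 if j == l else 0.0) - pi[l])
                        for b in range(q):
                            info[j * q + a][l * q + b] += w * d_jl * x[a] * x[b]
        step = solve(info, grad)  # Newton step: information^{-1} * score
        beta = [bv + s for bv, s in zip(beta, step)]
    return [beta[j * q:(j + 1) * q] for j in range(r - 1)]

# Intercept-only check with made-up frequencies 20, 30, 50 for levels 1, 2, 3.
est = newton_glogit([[1.0], [1.0], [1.0]], [1, 2, 3], [20.0, 30.0, 50.0], r=3)
```

For an intercept-only model the maximum likelihood estimates are the log ratios of the category frequencies to the reference frequency, so `est[0][0]` approaches log(20/50) and `est[1][0]` approaches log(30/50), which gives a simple sanity check.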
Confidence Limits for the Predicted Probabilities
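By the delta method,
\[ \sigma^2(\hat{\pi}_j) = \left( \frac{\partial \pi_j}{\partial \beta} \right)' \mathrm{Var}(\hat{\beta})\, \frac{\partial \pi_j}{\partial \beta} \]
A 100(1-$\alpha$)% confidence interval for $\pi_j$ is given by
\[ \hat{\pi}_j \pm z_{\alpha/2}\, \hat{\sigma}(\hat{\pi}_j) \]
where $\hat{\sigma}(\hat{\pi}_j)$ is obtained by replacing $\beta$ by $\hat{\beta}$
in $\sigma(\hat{\pi}_j)$, and $z_{\alpha/2}$ is the 100(1-$\alpha$/2) percentile of
the standard normal distribution.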
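As a rough illustration of the delta-method computation, the Python sketch below forms the gradient of one response probability with respect to the stacked coefficient vector and combines it with a coefficient covariance matrix. The function names are hypothetical and the covariance values are made up for the example; in practice the covariance matrix would come from the fitted model.

```python
import math
from statistics import NormalDist

def probs(betas, x):
    """Probabilities for levels 1..r; the last level is the reference."""
    etas = [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    denom = 1.0 + sum(math.exp(e) for e in etas)
    return [math.exp(e) / denom for e in etas] + [1.0 / denom]

def delta_limits(betas, cov, x, j, alpha=0.05):
    """Estimate and Wald-type limits for pi_j (j is a 0-based level index).

    cov is the covariance matrix of the coefficients stacked logit by logit.
    """
    pi = probs(betas, x)
    # d pi_j / d beta_l = pi_j * (1[j == l] - pi_l) * x for each logit l
    grad = [pi[j] * ((1.0 if j == l else 0.0) - pi[l]) * xa
            for l in range(len(betas)) for xa in x]
    d = len(grad)
    var = sum(grad[a] * cov[a][b] * grad[b] for a in range(d) for b in range(d))
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half = z * math.sqrt(var)
    return pi[j], pi[j] - half, pi[j] + half

betas = [[0.0], [0.0]]                 # intercept-only, r = 3
cov = [[0.01, 0.0], [0.0, 0.01]]       # made-up covariance matrix
est, lower, upper = delta_limits(betas, cov, [1.0], j=0)
```

The limits are symmetric about the estimate and are not clipped to the interval [0, 1], so for probabilities near 0 or 1 they can fall outside it.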
No Intercept Model
When the NOINT option is specified with the LINK=GLOGIT option, all intercepts
are suppressed. This differs from the cumulative model where only the first
intercept is suppressed when the NOINT option is specified.
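Model Fitting Statistics
Suppose there are r response categories and s covariates (each dummy
variable being counted as a separate covariate).
The number of parameters estimated
is p=(r-1)(1+s). Let L be the likelihood.
Akaike Information Criterion:
\[ \mathrm{AIC} = -2 \log L + 2p \]
Schwarz Criterion:
\[ \mathrm{SC} = -2 \log L + p \log\left( \sum_j f_j \right) \]
where fj is the frequency of the jth observation.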
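The parameter count and the two criteria are simple to compute once the maximized log likelihood is known. The Python sketch below uses the standard definitions of the Akaike and Schwarz criteria; the function name and the input values are illustrative assumptions, not output from PROC LOGISTIC.

```python
import math

def fit_statistics(log_l, r, s, freqs):
    """Return (p, AIC, SC) for a generalized logit model with intercepts.

    log_l: maximized log likelihood; r: number of response categories;
    s: number of covariates; freqs: frequency of each observation.
    """
    p = (r - 1) * (1 + s)
    aic = -2.0 * log_l + 2.0 * p
    sc = -2.0 * log_l + p * math.log(sum(freqs))
    return p, aic, sc

# Three response categories, two covariates, 120 unit-frequency observations.
p, aic, sc = fit_statistics(log_l=-100.0, r=3, s=2, freqs=[1] * 120)
```

With three response categories and two covariates there are two logits with three parameters each, so p is 6.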
Exact Conditional Analysis
When an EXACT statement and the LINK=GLOGIT option are specified, the generalized
logit model is fit as described in Hirji (1992). If there are only two response
levels, the binary logit model is fit instead. Hypothesis tests for
each effect are computed across logit functions, but individual
parameters are estimated for each logit function.
Association of Observed Values and Predicted Probabilities
When the LINK=GLOGIT option is specified, the "Association of Observed Values
and Predicted Probabilities" table is suppressed unless there are only two
response levels; in that case, the generalized logit model is reduced to the
binary logit model.
Printing and Outputting Parameter Estimates
Each logit function has a set of parameters for the intercept and
covariates. Instead of printing and outputting the parameter estimates
by logit function, PROC LOGISTIC presents all parameter estimates
for the intercept first, followed by all estimates of the first
covariate, and so on.
Since each logit function contrasts a nonreference
response category with the reference category, the "Analysis
of Maximum Likelihood Estimates" table
includes the response variable column whose
values are used to identify the corresponding logit function.
For the OUTEST= data set,
names of parameters corresponding to
the nonreference category `xxx' contain _xxx as the suffix.
For example, suppose the response variable Net3
represents the television network viewed at a
certain time, with values `ABC', `CBS', and `NBC'. The
following PROC LOGISTIC statements fit a generalized logit model with Age and Gender (a
CLASS variable with values Female and Male) as explanatory variables.
   class Gender;
   model Net3 = Age Gender / link=glogit;
Since `NBC' is the last value in the sorted order of the response
categories, it corresponds to the default reference category.
There are two logit functions, one contrasting `ABC' with
`NBC' and the
other contrasting `CBS' with `NBC'. For each logit, there are three
parameters: an intercept parameter, a slope parameter for Age, and a slope
parameter for Gender (since there are only two gender levels and the
EFFECT parameterization is used by default). The names of the parameters
and their descriptions are as follows.
|Intercept_ABC||intercept parameter for the logit contrasting `ABC' with `NBC'|
|Intercept_CBS||intercept parameter for the logit contrasting `CBS' with `NBC'|
|Age_ABC||Age parameter for the logit contrasting `ABC' with `NBC'|
|Age_CBS||Age parameter for the logit contrasting `CBS' with `NBC'|
|GenderFemale_ABC||Gender=Female parameter for the logit contrasting `ABC' with `NBC'|
|GenderFemale_CBS||Gender=Female parameter for the logit contrasting `CBS' with `NBC'|
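The naming and ordering convention can be mimicked with a short Python sketch. The helper `outest_names` is hypothetical and only reproduces the pattern described here (effect name, underscore, nonreference category, with all names for one effect grouped together); it does not account for any name-length handling that SAS may apply.

```python
def outest_names(effects, nonreference_levels):
    """List parameter names effect by effect, one name per logit function."""
    return [f"{effect}_{level}"
            for effect in effects
            for level in nonreference_levels]

names = outest_names(["Intercept", "Age", "GenderFemale"], ["ABC", "CBS"])
for name in names:
    print(name)
```

The outer loop runs over effects and the inner loop over logit functions, which matches the order in which the estimates are presented: all intercepts first, then all estimates for the first covariate, and so on.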
OUT= Output Data Set
If any of the XBETA=, STDXBETA=, PREDICTED=, LOWER=, and UPPER= options are
specified in the OUTPUT statement when there are more than two response
categories, each input observation generates as many output observations
as the number of response categories. The predicted probabilities and
their confidence limits correspond to the probabilities of individual
response categories rather
than the cumulative probabilities as in the case of fitting a cumulative
model. Regression diagnostics are suppressed when there are more than
two response categories. You can specify PREDPROB=(I C) to obtain
the predicted probabilities of individual response categories as well as
the predicted cumulative probabilities.
Copyright © 2001 by SAS Institute Inc., Cary, NC, USA. All rights reserved.