 The LOGISTIC Procedure

Generalized Logit Model

Formulation of the generalized logit models for nominal response variables can be found in Agresti (1990). Let $Y$ be the response variable with categories $1, \ldots, r$. Let $x = (x_0, x_1, \ldots, x_p)'$ be a $(p+1)$ vector of covariates, with $x_0 \equiv 1$. By choosing $k$ as the reference category, the $j$th logit is given by

$$\log\frac{\Pr(Y = j \mid x)}{\Pr(Y = k \mid x)} = x'\beta_j, \qquad j \ne k$$

where $\beta_j$ is a $(p+1)$ vector of the regression coefficients for the $j$th logit.

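For a sample of $n$ subjects $\{(y_i, x_i),\ i = 1, \ldots, n\}$, the log likelihood for the generalized logit model is

$$\log L = \sum_{i=1}^{n} \left\{ \sum_{j \ne k} I(y_i = j)\, x_i'\beta_j - \log\left[ 1 + \sum_{j \ne k} \exp(x_i'\beta_j) \right] \right\}$$

and the log likelihood is maximized with respect to $\beta_j$, $j \ne k$.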
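
As a minimal sketch in Python (not SAS), assuming the last category is the reference and that the coefficients are supplied as one list per nonreference logit, the response probabilities implied by the logits and the resulting log likelihood could be evaluated as follows. The function names and the coefficient values are hypothetical and serve only as an illustration.

```python
import math

def response_probs(x, betas):
    """Return [pi_1, ..., pi_r] for a covariate vector x with x[0] == 1.

    betas holds one coefficient list per nonreference category;
    the reference category is assumed to be the last one.
    """
    etas = [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    denom = 1.0 + sum(math.exp(e) for e in etas)
    return [math.exp(e) / denom for e in etas] + [1.0 / denom]

def log_likelihood(ys, xs, betas):
    """Sum of log pi_{y_i}(x_i); ys are 0-based category indices."""
    return sum(math.log(response_probs(x, betas)[y]) for y, x in zip(ys, xs))

# Hypothetical coefficients: r = 3 categories, intercept plus one covariate.
betas = [[0.5, -1.0], [0.0, 0.25]]
probs = response_probs([1.0, 2.0], betas)
```

The probabilities sum to 1, and the log of the ratio of any nonreference probability to the reference probability recovers the corresponding linear predictor.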

Newton's Method of Parameter Estimation

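For the convenience of notation, consider the last response level to be the reference level. The response probabilities $\pi_1, \ldots, \pi_r$ are given by

$$\pi_j = \frac{\exp(x'\beta_j)}{1 + \sum_{l=1}^{r-1} \exp(x'\beta_l)}, \quad j = 1, \ldots, r-1, \qquad \pi_r = \frac{1}{1 + \sum_{l=1}^{r-1} \exp(x'\beta_l)}$$

For a single response vector $y$ with $y_j = 1$ if $Y = j$ and 0 otherwise, the log likelihood is

$$l = \sum_{j=1}^{r} y_j \log \pi_j$$

Note that $y_r = 1 - \sum_{j=1}^{r-1} y_j$ and $\pi_r = 1 - \sum_{j=1}^{r-1} \pi_j$, so that

$$l = \sum_{j=1}^{r-1} y_j \log\frac{\pi_j}{\pi_r} + \log \pi_r = \sum_{j=1}^{r-1} y_j\, x'\beta_j - \log\left[ 1 + \sum_{j=1}^{r-1} \exp(x'\beta_j) \right]$$

The first partial derivatives of $l$ with respect to $\beta_j$ are

$$\frac{\partial l}{\partial \beta_j} = (y_j - \pi_j)\, x, \qquad j = 1, \ldots, r-1$$

Denote $\mathbf{y} = (y_1, \ldots, y_{r-1})'$ and $\boldsymbol{\pi} = (\pi_1, \ldots, \pi_{r-1})'$. Then,

$$\left( \frac{\partial l}{\partial \beta_1'}, \ldots, \frac{\partial l}{\partial \beta_{r-1}'} \right)' = (\mathbf{y} - \boldsymbol{\pi}) \otimes x$$

The estimating equations become

$$\sum_i w_i\, (\mathbf{y}_i - \boldsymbol{\pi}_i) \otimes x_i = 0$$

Let

$$V_i = \mathrm{diag}(\boldsymbol{\pi}_i) - \boldsymbol{\pi}_i \boldsymbol{\pi}_i'$$

Let $\beta = (\beta_1', \ldots, \beta_{r-1}')'$. The estimate of $\beta$ is obtained iteratively as follows:

$$\beta^{(m+1)} = \beta^{(m)} + \left[ \sum_i w_i\, V_i \otimes x_i x_i' \right]^{-1} \sum_i w_i\, (\mathbf{y}_i - \boldsymbol{\pi}_i) \otimes x_i$$

where $m$ indexes the iterations, $i$ indexes the observations, and $w_i$ is the product of the corresponding weight and frequency. The quantities $\boldsymbol{\pi}_i$ and $V_i$ on the right-hand side are evaluated at $\beta^{(m)}$.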
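
The Newton iteration can be illustrated with a small Python sketch (not SAS, and not the procedure's actual implementation). It assumes the last category is the reference, stacks the coefficients by logit, starts from zero, and uses a fixed iteration cap without step-halving or ridging. The data set, the function names, and the helper `solve` are all hypothetical.

```python
import math

def probs(x, beta, r):
    """Nonreference probabilities pi_1..pi_{r-1}; beta is stacked by logit."""
    q = len(x)
    etas = [sum(beta[j * q + a] * x[a] for a in range(q)) for j in range(r - 1)]
    denom = 1.0 + sum(math.exp(e) for e in etas)
    return [math.exp(e) / denom for e in etas]

def score_and_info(data, beta, r):
    """Score vector and information matrix summed over (y, x, w) records."""
    q = len(data[0][1])
    m = (r - 1) * q
    g = [0.0] * m
    h = [[0.0] * m for _ in range(m)]
    for y, x, w in data:
        pi = probs(x, beta, r)
        for j in range(r - 1):
            resid = (1.0 if y == j else 0.0) - pi[j]
            for a in range(q):
                g[j * q + a] += w * resid * x[a]
                for l in range(r - 1):
                    # Element (j, l) of diag(pi) - pi pi'
                    v = pi[j] * ((1.0 if j == l else 0.0) - pi[l])
                    for b in range(q):
                        h[j * q + a][l * q + b] += w * v * x[a] * x[b]
    return g, h

def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for k in range(c, n + 1):
                m[i][k] -= f * m[c][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][k] * z[k] for k in range(i + 1, n))) / m[i][i]
    return z

def newton(data, r, iters=25, tol=1e-12):
    """Iterate beta <- beta + info^{-1} score, starting from zero."""
    q = len(data[0][1])
    beta = [0.0] * ((r - 1) * q)
    for _ in range(iters):
        g, h = score_and_info(data, beta, r)
        step = solve(h, g)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Hypothetical records: (0-based response level, covariates with x0 = 1, w).
data = [
    (0, [1.0, 0.0], 1), (1, [1.0, 0.0], 1), (2, [1.0, 0.0], 1),
    (0, [1.0, 1.0], 2), (1, [1.0, 1.0], 1), (2, [1.0, 1.0], 1),
    (0, [1.0, 2.0], 1), (1, [1.0, 2.0], 2), (2, [1.0, 2.0], 1),
    (0, [1.0, 3.0], 1), (1, [1.0, 3.0], 1), (2, [1.0, 3.0], 2),
]
beta_hat = newton(data, r=3)
```

At convergence the score vector is numerically zero, which is a convenient check that the estimating equations are satisfied.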

Confidence Limits for the Predicted Probabilities

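By the delta method,

$$\sigma^2(\pi_j) = \left( \frac{\partial \pi_j}{\partial \beta} \right)' V(\hat{\beta})\, \frac{\partial \pi_j}{\partial \beta}$$

where $V(\hat{\beta})$ is the estimated covariance matrix of the stacked coefficient vector $\hat{\beta}$, and for $l = 1, \ldots, r-1$

$$\frac{\partial \pi_j}{\partial \beta_l} = \begin{cases} \pi_j (1 - \pi_j)\, x & l = j \\ -\pi_j \pi_l\, x & \text{otherwise} \end{cases}$$

A $100(1-\alpha)\%$ confidence interval for $\pi_j$ is given by

$$\hat{\pi}_j \pm z_{\alpha/2}\, \hat{\sigma}(\hat{\pi}_j)$$

where $\hat{\sigma}(\hat{\pi}_j)$ is obtained by replacing $\beta$ by $\hat{\beta}$ in $\sigma(\pi_j)$, and $z_{\alpha/2}$ is the upper $100(\alpha/2)$ percentile of the standard normal distribution.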
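
A hedged Python sketch of the delta-method limits follows (not SAS). It assumes the last category is the reference, that the coefficients are stacked by logit, and that a covariance matrix for the stacked estimates is already available. The function name, the coefficient values, and the diagonal covariance matrix are hypothetical.

```python
import math

def delta_ci(x, betas, cov, j, z=1.959964):
    """Estimate and Wald limits for pi_j (0-based j) by the delta method.

    betas: one coefficient list per nonreference logit.
    cov:   covariance matrix of the coefficients stacked by logit.
    z:     standard normal quantile (default gives 95% limits).
    """
    etas = [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    denom = 1.0 + sum(math.exp(e) for e in etas)
    pi = [math.exp(e) / denom for e in etas] + [1.0 / denom]
    # Gradient of pi_j with respect to the stacked coefficient vector:
    # pi_j (1 - pi_j) x for the block l == j, and -pi_j pi_l x otherwise.
    grad = []
    for l in range(len(betas)):
        factor = pi[j] * ((1.0 if l == j else 0.0) - pi[l])
        grad.extend(factor * v for v in x)
    n = len(grad)
    var = sum(grad[a] * cov[a][b] * grad[b] for a in range(n) for b in range(n))
    se = math.sqrt(var)
    return pi[j], pi[j] - z * se, pi[j] + z * se

# Hypothetical estimates and covariance matrix (r = 3, intercept + 1 covariate).
betas = [[0.5, -1.0], [0.0, 0.25]]
cov = [[0.04 if a == b else 0.0 for b in range(4)] for a in range(4)]
est, lower, upper = delta_ci([1.0, 2.0], betas, cov, j=0)
```

Because the limits are symmetric about the estimate on the probability scale, nothing in this sketch keeps them inside the interval from 0 to 1.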

No Intercept Model

When the NOINT option is specified with the LINK=GLOGIT option, all intercepts are suppressed. This differs from the cumulative model where only the first intercept is suppressed when the NOINT option is specified.

Fit Statistics

Suppose there are r response categories and s covariates (each dummy variable being counted as a separate covariate). The number of parameters estimated is p=(r-1)(1+s). Let L be the likelihood.

Akaike Information Criterion:

$$\mathrm{AIC} = -2 \log L + 2p$$

Schwarz Criterion:

$$\mathrm{SC} = -2 \log L + p \log\Bigl( \sum_j f_j \Bigr)$$

where $f_j$ is the frequency of the $j$th observation.
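
As a small worked illustration in Python (not SAS), assuming the usual forms AIC = -2 log L + 2p and SC = -2 log L + p log(sum of frequencies), the two criteria can be computed from the maximized log likelihood. The function name and the input values are hypothetical.

```python
import math

def fit_statistics(log_l, r, s, freqs):
    """Return (p, AIC, SC) for r response categories and s covariates."""
    p = (r - 1) * (1 + s)              # number of parameters estimated
    aic = -2.0 * log_l + 2.0 * p
    sc = -2.0 * log_l + p * math.log(sum(freqs))
    return p, aic, sc

# Hypothetical fit: log L = -100, three categories, two covariates,
# and 150 observations each with frequency 1.
p, aic, sc = fit_statistics(log_l=-100.0, r=3, s=2, freqs=[1] * 150)
```

With these inputs there are six parameters, so the AIC penalty is 12 while the SC penalty is 6 log(150), which is larger.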

Exact Conditional Analysis

When an EXACT statement and the LINK=GLOGIT option are specified, the generalized logit model is fit as described in Hirji (1992). If there are only two response levels, the binary logit model is fit instead. Hypothesis tests for each effect are computed across logit functions, but individual parameters are estimated for each logit function.

Association of Observed Values and Predicted Probabilities

When the LINK=GLOGIT option is specified, the "Association of Observed Values and Predicted Probabilities" table is suppressed unless there are only two response levels; in that case, the generalized logit model is reduced to the binary logit model.

Printing and Outputting Parameter Estimates

Each logit function has a set of parameters for the intercept and covariates. Instead of printing and outputting the parameter estimates by logit function, PROC LOGISTIC presents all parameter estimates for the intercept first, followed by all estimates of the first covariate, etc.

Since each logit function contrasts a nonreference response category with the reference category, the "Analysis of Maximum Likelihood Estimates" table includes the response variable column whose values are used to identify the corresponding logit function.

For the OUTEST= data set, names of parameters corresponding to the nonreference category 'xxx' contain _xxx as the suffix. For example, suppose the variable Net3 represents the television network viewed at a certain time, with values 'ABC', 'CBS', and 'NBC'. The following code fits a generalized logit model with Age and Gender (a CLASS variable with values Female and Male) as explanatory variables.

   proc logistic;
      class Gender;
      model Net3 = Age Gender / link=glogit;
   run;

Since 'NBC' is the last value in the sorted order of the response categories, it corresponds to the default reference category. There are two logit functions, one contrasting 'ABC' with 'NBC' and the other contrasting 'CBS' with 'NBC'. For each logit, there are three parameters: an intercept parameter, a slope parameter for Age, and a slope parameter for Gender (since there are only two gender levels and the EFFECT parameterization is used by default). The names of the parameters and their descriptions are as follows.

   Parameter          Description
   Intercept_ABC      intercept parameter for the logit contrasting 'ABC' with 'NBC'
   Intercept_CBS      intercept parameter for the logit contrasting 'CBS' with 'NBC'
   Age_ABC            Age parameter for the logit contrasting 'ABC' with 'NBC'
   Age_CBS            Age parameter for the logit contrasting 'CBS' with 'NBC'
   GenderFemale_ABC   Gender=Female parameter for the logit contrasting 'ABC' with 'NBC'
   GenderFemale_CBS   Gender=Female parameter for the logit contrasting 'CBS' with 'NBC'
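
The suffix convention can be mimicked with a short Python sketch (not SAS). It only reproduces the naming pattern and the effect-by-effect ordering listed above; it does not read an actual OUTEST= data set, and the function name is hypothetical.

```python
def outest_names(effects, nonref_levels):
    """Build parameter names as <effect>_<nonreference level>.

    All logits for the first effect come first, then all logits for the
    second effect, and so on, matching the order of the list above.
    """
    return [f"{effect}_{level}" for effect in effects for level in nonref_levels]

# Effects and nonreference response levels from the Net3 example.
names = outest_names(["Intercept", "Age", "GenderFemale"], ["ABC", "CBS"])
```

For the Net3 example this yields the six names listed above, with the reference level 'NBC' never appearing as a suffix.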

OUT= Output Data Set

If any of the XBETA=, STDXBETA=, PREDICTED=, LOWER=, and UPPER= options is specified in the OUTPUT statement when there are more than two response categories, each input observation generates as many output observations as the number of response categories. The predicted probabilities and their confidence limits correspond to the probabilities of individual response categories rather than the cumulative probabilities as in the case of fitting a cumulative model. Regression diagnostics are suppressed when there are more than two response categories. You can specify PREDPROBS=(I C) to obtain the predicted probabilities of individual response categories as well as the predicted cumulative probabilities.
