Generalized linear models (GLMs) for categorical responses, including but not limited to logit, probit, Poisson, and negative binomial models, can be fit in the GENMOD, GLIMMIX, LOGISTIC, COUNTREG, GAMPL, and other SAS^{®} procedures. The categorical response in these models can be binary, multinomial (ordinal or nominal), or integer counts. For variations on logistic models that are available and the procedures that fit them, see this note. The following discussion addresses how to assess goodness (or lack) of fit and over- (or under-) dispersion in categorical response models.

**Statistics and omnibus tests for assessing overall model fit**

Misspecification of the model, such as omitting higher-order terms (quadratic, cubic, or interaction terms) or important predictors, can cause lack of fit. The ASSESS statement in PROC GENMOD can help to determine whether the predictors (or link function) are correctly specified. Splines provide highly flexible transformations of variables and can be included in the model using PROC GAMPL or with the EFFECT statement that is available in several procedures.

Two aspects of overall model fit for binary response models are calibration and discrimination. Calibration is the degree to which predicted and observed probabilities agree. Discrimination is the degree to which the model can tell events from nonevents. A good model should have good calibration and good discrimination. Further discussion can be found in Hosmer and Lemeshow (2000). A measure of calibration is the Brier score, available with the FITSTAT option in the SCORE statement of PROC LOGISTIC and in PROC HPLOGISTIC when the PARTITION statement is specified. A test of calibration is provided by the Spiegelhalter test, available beginning in SAS 9.4 TS1M6 in PROC LOGISTIC with the GOF option. Graphical assessment of calibration is provided by the PLOTS=CALIBRATION option in PROC LOGISTIC (which also displays the Spiegelhalter test). The calibration plot can also be obtained for multinomial response models. A measure of discrimination is the area under the ROC curve (AUC, or concordance index, *c*) provided by PROC LOGISTIC or with the ASSOCIATION option in PROC HPLOGISTIC. A test that AUC > 0.5, indicating better discrimination than chance, is provided by specifying the ROC and ROCCONTRAST statements. Graphical assessment of discrimination is provided by plotting the ROC curve with the ROC option or PLOTS=ROC option in PROC LOGISTIC. An extension of AUC for multinomial response models is available with the MultAUC macro, but no associated test or plot is available.
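The distinction between calibration and discrimination can be made concrete with a small calculation. The following is a minimal Python sketch, not the code that any SAS procedure runs: it computes the Brier score as the mean squared difference between the 0/1 outcome and the predicted probability, and the concordance index *c* as the proportion of event/nonevent pairs in which the event has the higher predicted probability (ties counted as one half). The data values and the function names `brier_score` and `auc` are hypothetical, and a procedure's own computation might differ in detail.

```python
def brier_score(y, p):
    """Mean squared difference between 0/1 outcomes and predicted probabilities."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def auc(y, p):
    """Concordance index c: share of event/nonevent pairs ranked correctly.

    Tied pairs contribute one half.
    """
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(nonevents))


# Hypothetical outcomes (1 = event) and predicted event probabilities.
y = [1, 0, 1, 1, 0, 0]
p = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]

print("Brier score:", brier_score(y, p))  # smaller is better calibrated
print("AUC (c):", auc(y, p))              # 0.5 is chance, 1.0 is perfect
```

A model can score well on one measure and poorly on the other, which is why both are examined.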

Lack of fit and overdispersion can be assessed using the Pearson and deviance statistics available in many GLM procedures.^{Note} Overdispersion is said to exist when there is more variability than expected under the response distribution. For binary or multinomial response data, the Pearson and deviance statistics are computed by grouping observations into subpopulations. Details and formulas for these statistics are in the procedure documentation. You can define the subpopulations by using the AGGREGATE= option in GENMOD, LOGISTIC, or PROBIT. This is necessary if the observations in the data represent single trials or subjects. It is also important if data was collected in subpopulations defined more precisely than by the covariates in the model as further described in this note. For binomial response data that is already summarized (or aggregated) and for which the *events/trials* syntax is used in the MODEL statement, each observation is a subpopulation by default unless the AGGREGATE= option is specified. A rough indicator of fit is provided by dividing either of these statistics by its degrees of freedom. The result should be approximately equal to one when no lack of fit or overdispersion exists. When either statistic deviates substantially from one, some form of lack of fit or, for binomial or count models, overdispersion is indicated. Lindsey (1999) suggests that overdispersion is possible if the deviance is at least twice the degrees of freedom.
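As an illustration of the "statistic divided by its degrees of freedom" check, the following Python sketch computes the Pearson and deviance statistics for summarized binomial (events/trials) data, treating each row as one subpopulation. It assumes that the fitted probabilities are already available from some fitted model. The data, the fitted values, the parameter count, and the function name `pearson_and_deviance` are all hypothetical, and the degrees of freedom are taken as the number of subpopulations minus the number of estimated parameters.

```python
import math


def pearson_and_deviance(events, trials, fitted_p):
    """Pearson and deviance statistics for grouped binomial data."""
    pearson = 0.0
    deviance = 0.0
    for r, n, p in zip(events, trials, fitted_p):
        expected = n * p
        pearson += (r - expected) ** 2 / (n * p * (1 - p))
        # Terms with a zero count contribute nothing (0 * log 0 is taken as 0).
        if r > 0:
            deviance += 2 * r * math.log(r / expected)
        if r < n:
            deviance += 2 * (n - r) * math.log((n - r) / (n - expected))
    return pearson, deviance


# Hypothetical summarized data: one row per subpopulation.
events = [1, 7, 6, 16]
trials = [20, 20, 20, 20]
fitted_p = [0.12, 0.22, 0.46, 0.68]  # assumed output of a fitted model
n_params = 2                          # assumed: intercept and one slope

pearson, deviance = pearson_and_deviance(events, trials, fitted_p)
df = len(events) - n_params

print("Pearson / DF:", pearson / df)
print("Deviance / DF:", deviance / df)
```

With these made-up values, both ratios are well above one, which is the pattern that signals lack of fit or overdispersion.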

The Pearson and deviance statistics are known to be chi-square distributed only in certain cases. In general, their distribution is not known. For this reason, PROC GENMOD does not present *p*-values for these statistics. Generally, the larger the scaled statistics, the poorer the fit. In the binary response case, sufficient replication is required in all subpopulations for these statistics to be chi-square distributed. Otherwise, the data is sparse and neither statistic is a reliable indicator of fit. One sign of insufficient replication is a large difference between the two statistics. For more details, see McCullagh and Nelder (1989), the "Overdispersion" sections in the PROC LOGISTIC and PROC GENMOD documentation, and the "Lack of Fit Tests" section in the PROC PROBIT documentation.

With no or insufficient replication in the subpopulations, the Hosmer-Lemeshow test available in the LOGISTIC and HPLOGISTIC procedures provides an overall test of model fit for the binary logistic model and, beginning in SAS 9.4 TS1M3, also for multinomial logistic models. This is provided by the LACKFIT (or GOF) option in the MODEL statement. Beginning in SAS 9.4 TS1M6, the GOF option in PROC LOGISTIC offers several additional goodness of fit tests valid even with sparse data (Orme's information matrix test, Osius-Rojek test, Copas' unweighted residual sum of squares test, Spiegelhalter's test, and Stukel's test) for binary logistic models. Example 4 in this note describes other tests that can be used to compare nested and nonnested models.

Statistics that can be used to compare competing GLMs, including multinomial models, are the AIC, corrected AIC (AICC), BIC (also called SC), and R^{2} statistics. Two likelihood-based R^{2} statistics are available for binary or multinomial models with the RSQUARE option in PROC LOGISTIC. Beginning in SAS 9.4 TS1M6, several additional R^{2} statistics are available for binary response models in PROC LOGISTIC with the GOF option in the MODEL statement or the FITSTAT option in the SCORE statement. Tjur's coefficient of discrimination is also provided for binary response models in PROC LOGISTIC and PROC HPLOGISTIC. In most GLM procedures, AIC, AICC, and BIC are provided by default. However, tests to compare models based on these statistics are not available.
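For reference, the information criteria can be written out from the maximized log likelihood. The Python sketch below uses the usual textbook definitions, with *p* estimated parameters and *n* observations. How *n* is counted (for example, with events/trials or frequency-weighted data) is defined in each procedure's documentation and is an assumption here, as are the input values and the function name `information_criteria`.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Return (AIC, AICC, BIC) from a maximized log likelihood."""
    aic = -2 * log_likelihood + 2 * n_params
    # Small-sample correction; requires n_obs > n_params + 1.
    aicc = aic + 2 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    bic = -2 * log_likelihood + n_params * math.log(n_obs)
    return aic, aicc, bic


# Hypothetical fit: log likelihood of -50 with 3 parameters and 100 observations.
aic, aicc, bic = information_criteria(-50.0, 3, 100)
print(aic, aicc, bic)
```

Smaller values are preferred when the same response data is used for all candidate models.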

**Assessing fit at the observation level**

Note that a model might fit some observations well but not others. Some observations might not be well fit because additional predictors or higher-order terms are needed in the model. There might also be outliers in the data that no reasonable change to the model specification can accommodate. To assess the fit of a binary response model at the observation level, you can examine the residuals provided by options in the OUTPUT statement in the GENMOD and LOGISTIC procedures.

In general, the distributions of the diagnostic statistics are unknown, so well-established cutoffs are not available. Assessment is typically done by plotting the diagnostics and looking for values that are far removed from the others. Collett (2003) recommends standardized deviance residuals (STDRESDEV= in GENMOD) or likelihood residuals (RESLIK= in GENMOD), stating that these two residuals perform similarly and are well-approximated by the standard normal distribution. As such, most values should lie between -2 and 2. McCullagh and Nelder (1989) also recommend standardized deviance residuals.

In logistic models, Hosmer and Lemeshow (2000) discuss residuals, diagnostics, useful plots, and their interpretation. They suggest that when there is sufficient replication at the settings of the predictors (such data would be analyzed using the events/trials syntax for summarized data), then the squared standardized Pearson residuals (DIFCHISQ= in LOGISTIC or squared STDRESCHI= in GENMOD), the deviance change values (DIFDEV= in LOGISTIC), and the squared standardized deviance residuals (squared STDRESDEV= in GENMOD) are approximately chi-square distributed with 1 degree of freedom so that values would generally be less than 4.
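The residuals discussed above can be sketched for a single binomial observation as follows. This Python example is an illustration under stated assumptions rather than the procedures' implementation: the dispersion is fixed at one, the leverage `h` (the observation's hat-matrix diagonal) is supplied rather than computed, and the likelihood residual is formed as the leverage-weighted combination of the squared standardized Pearson and deviance residuals. All input values and the function name `binomial_residuals` are hypothetical.

```python
import math


def binomial_residuals(r, n, p, h):
    """Standardized Pearson, standardized deviance, and likelihood residuals.

    r: observed events, n: trials, p: fitted probability, h: leverage (0 <= h < 1).
    """
    expected = n * p
    pearson = (r - expected) / math.sqrt(n * p * (1 - p))

    # Deviance contribution; zero counts contribute nothing.
    d = 0.0
    if r > 0:
        d += 2 * r * math.log(r / expected)
    if r < n:
        d += 2 * (n - r) * math.log((n - r) / (n - expected))
    sign = 1.0 if r >= expected else -1.0
    deviance = sign * math.sqrt(max(d, 0.0))

    scale = math.sqrt(1 - h)
    std_pearson = pearson / scale
    std_deviance = deviance / scale
    likelihood = sign * math.sqrt(h * std_pearson ** 2 + (1 - h) * std_deviance ** 2)
    return {
        "std_pearson": std_pearson,
        "std_deviance": std_deviance,
        "likelihood": likelihood,
    }


# Hypothetical observation: 16 events in 20 trials, fitted p = 0.68, leverage 0.3.
res = binomial_residuals(16, 20, 0.68, 0.3)
for name, value in res.items():
    flag = "check" if abs(value) > 2 else "ok"
    print(f"{name}: {value:.3f} ({flag})")
```

Values beyond about 2 in absolute value (or beyond about 4 when squared) are candidates for closer inspection, in line with the rough guides given above.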

For multinomial models, predicted probabilities for each observation are available via options in the OUTPUT statements of the GENMOD, LOGISTIC, and PROBIT procedures.

**Assessing fit in Generalized Estimating Equations (GEE) models**

GEE models for clustered or longitudinal data can be fit by specifying the REPEATED statement in PROC GENMOD and (beginning in SAS 9.4 TS1M2) in PROC GEE. For GEE models, no test of overall fit is currently available. Pearson and deviance statistics, if displayed, apply only to the initial model that begins the GEE estimation, not to the final GEE model. However, a comparative statistic similar to AIC, known as QIC, is provided in PROC GENMOD and PROC GEE. You can assess fit of the GEE model at the observation or cluster level by using statistics available from the OUTPUT statement in GENMOD. The ASSESS statement in GENMOD can be used to determine adequacy of the link function or whether the functional form of a predictor in the model is correct. Deletion diagnostics and plots are provided for GEE models to assess the effects of deleting entire clusters. Deletion diagnostics are available in the OUTPUT statement of GENMOD and plots are provided by the PLOTS= option in the PROC GENMOD statement.

**Overdispersion (and underdispersion)**

As noted above, overdispersion is said to exist when there is more variability than expected under the response distribution. Like lack of fit, overdispersion can be detected by the Pearson and deviance statistics. For example, the Poisson distribution has a variance equal to its mean, so the variance of the response in a Poisson model should equal its mean. When the data is more variable than that, it is said to be overdispersed. One way to adjust for overdispersion is to estimate a dispersion parameter (called the *heterogeneity factor* in PROC PROBIT) and inflate the covariance matrix of the parameter estimates by this factor. The dispersion parameter can be estimated by the Pearson or deviance statistic divided by its degrees of freedom (use the SCALE=PEARSON or SCALE=DEVIANCE option in the GENMOD, LOGISTIC, or PROBIT procedures). Because this ratio also indicates lack of fit, you should eliminate the possibility of poor fit before relying on this adjustment.
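The effect of this adjustment on the standard errors can be sketched in a few lines. Multiplying the covariance matrix by the estimated dispersion is equivalent to multiplying each standard error by its square root. The Python example below assumes that a Pearson statistic, its degrees of freedom, and unadjusted standard errors are already available. All numeric values and the function name `scale_adjusted_standard_errors` are hypothetical.

```python
import math


def scale_adjusted_standard_errors(pearson_chi2, df, standard_errors):
    """Estimate the dispersion as Pearson/DF and inflate the standard errors."""
    dispersion = pearson_chi2 / df
    adjusted = [se * math.sqrt(dispersion) for se in standard_errors]
    return dispersion, adjusted


# Hypothetical fit: Pearson chi-square of 30 on 12 degrees of freedom.
dispersion, adjusted = scale_adjusted_standard_errors(30.0, 12, [0.20, 0.05])
print("Estimated dispersion:", dispersion)
print("Adjusted standard errors:", adjusted)
```

The parameter estimates themselves are unchanged by this adjustment; only their standard errors, and therefore the tests and confidence limits, become more conservative.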

Overdispersion is common in count response models which typically use the Poisson or negative binomial distribution. You can test for overdispersion in a Poisson model by using the DIST=NEGBIN, SCALE=0, and NOSCALE options in the MODEL statement of PROC GENMOD. When used together, these options test whether overdispersion of the form μ+*k*μ^{2} exists by testing whether the negative binomial dispersion parameter, *k*, is zero. When *k*=0, the negative binomial distribution is equivalent to the Poisson distribution. A significant test for *k* suggests preference for the negative binomial model. In this case, use the DIST=NEGBIN option to fit the negative binomial model, omitting the SCALE= and NOSCALE options to allow estimation of the dispersion parameter.
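To see what the variance function μ+*k*μ^{2} implies, the following Python sketch compares the sample mean and variance of a set of counts and solves the variance function for *k* (a method-of-moments estimate). This is a crude, intercept-only check meant to build intuition. It is not the test that PROC GENMOD performs, it ignores covariates, and the data and the function name `moment_dispersion` are hypothetical.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Return (mean, sample variance, moment estimate of k).

    Solves variance = mean + k * mean**2 for k. A value near zero is
    consistent with the Poisson distribution; a negative value suggests
    underdispersion.
    """
    m = mean(counts)
    v = variance(counts)  # sample variance with n - 1 in the denominator
    k = (v - m) / m ** 2
    return m, v, k


# Hypothetical counts with several zeros and a few large values.
counts = [0, 1, 0, 3, 7, 0, 2, 12, 1, 4]
m, v, k = moment_dispersion(counts)
print("mean:", m, "variance:", v, "k estimate:", k)
```

Here the variance is several times the mean, so the estimate of *k* is clearly positive, which is the situation in which the negative binomial model would be considered.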

PROC COUNTREG (in SAS/ETS^{®}) also offers a test of the overdispersion parameter (labeled _Alpha) when fitting the negative binomial model. For count models using the Poisson or negative binomial distributions, PROC GENMOD (beginning in SAS 9.4 TS1M3) and PROC COUNTREG (beginning in SAS 9.4 TS1M1) provide a diagnostic plot for the type of overdispersion caused by excessive zero counts. This plot is obtained by fitting a zero-inflated model (zero-inflated Poisson (DIST=ZIP) or zero-inflated negative binomial (DIST=ZINB)) and specifying the PLOTS=OVERDISP option (in GENMOD) or the PLOTS=DISPERSION option (in COUNTREG). The resulting graph plots the predicted variance against the predicted mean under the zero-inflated model. Deviation from the diagonal indicates overdispersion due to excessive zeros. Beginning in SAS 9.4 TS1M1, an overdispersion plot is also available in PROC COUNTREG for negative binomial and zero-inflated count models by specifying the PLOTS=DISPERSION option.

Williams' method can be used with binomial (events/trials) data in PROC LOGISTIC (SCALE=WILLIAMS option) to estimate an overdispersion parameter and adjust the parameter covariance matrix. This is illustrated in the example titled "Overdispersion" in the LOGISTIC documentation.

You can also use alternative models that accommodate and adjust for overdispersion. For binomial (events/trials) data, the beta-binomial model, the zero-inflated binomial model, and the binomial cluster model can be used. The binomial cluster model is discussed and illustrated in the example titled "Modeling Mixing Probabilities: All Mice Are Created Equal, but Some Are More Equal" in the FMM procedure documentation. For multinomial data, the multinomial cluster model is available beginning with SAS 9.4 TS1M2 in PROC FMM. This model is illustrated in the example titled "Modeling Multinomial Overdispersion: Town and Country." For count data, the zero-inflated Poisson, the negative binomial, the zero-inflated negative binomial, the generalized Poisson, and the Conway-Maxwell Poisson models can be used. Poisson, negative binomial and corresponding zero-inflated models can all be fit using the GENMOD, COUNTREG, or FMM procedures. The Conway-Maxwell Poisson (CMP) model can be fit in PROC COUNTREG beginning in SAS 9.4 TS1M1. The CMP model can also be used to model underdispersed data. The DISPMODEL statement in PROC COUNTREG enables you to model the dispersion separately from the mean. The generalized Poisson model, which can also be used in cases of underdispersion, can be fit in the NLMIXED and GLIMMIX procedures as shown in this note. Hurdle models can also be used for underdispersed cases. In binary or normal response models, the dispersion can be modeled separately from the mean by using the HETERO statement in the QLIM procedure in SAS/ETS software.

Another way of dealing with overdispersion is to use the generalized estimating equations (GEE) method to estimate the model. GEE is typically used when subjects or objects are repeatedly measured, but it can be used even when there is only a single measurement per subject. The robust standard errors of the GEE method provide an adjustment for overdispersion as discussed in Stokes et al.

__________

NOTE: To request the Pearson and deviance statistics, specify the SCALE=NONE option in the MODEL statement in PROC LOGISTIC or the LACKFIT option in PROC PROBIT.


Date Modified: | 2019-05-23 16:12:20 |

Date Created: | 2002-12-16 10:56:38 |