Usage Note 22630: Assessing fit and overdispersion in categorical generalized linear models

Generalized linear models (GLMs) for categorical responses, including but not limited to logit, probit, Poisson, and negative binomial models, can be fit in the GENMOD, GLIMMIX, LOGISTIC, COUNTREG, GAMPL, and other SAS® procedures. The categorical response in these models can be binary, multinomial (ordinal or nominal), or integer counts. For variations on logistic models that are available and the procedures that fit them, see SAS Note 22871. The following discussion addresses how to assess goodness (or lack) of fit and over- (or under-) dispersion in categorical response models.

Statistics and omnibus tests for assessing overall model fit

Misspecification of the model, such as omitting important predictors or higher-order terms (quadratic, cubic, or interaction terms), can cause lack of fit. The ASSESS statement in PROC GENMOD can help to determine whether the predictors (or link function) are correctly specified. See the example titled "Model Assessment of Multiple Regression Using Aggregates of Residuals" in the GENMOD documentation. Splines provide highly flexible transformations of variables and can be included in the model using PROC GAMPL or with the EFFECT statement that is available in several procedures.

Two aspects of overall model fit for binary response models are calibration and discrimination. Calibration is the degree to which predicted and observed probabilities agree. Discrimination is the degree to which the model can tell events from nonevents. A good model should have good calibration and good discrimination. Further discussion can be found in Hosmer and Lemeshow (2000) in SAS Note 22592.

A measure of calibration is the Brier score, available with the FITSTAT option in the SCORE statement of PROC LOGISTIC and in PROC HPLOGISTIC when the PARTITION statement is specified. A test of calibration is provided by the Spiegelhalter test, available beginning in SAS® 9.4M6 (TS1M6) in PROC LOGISTIC with the GOF option. Graphical assessment of calibration is provided by the PLOTS=CALIBRATION option in PROC LOGISTIC (which also displays the Spiegelhalter test). The calibration plot can also be obtained for multinomial response models.

Measures of discrimination include Tjur's coefficient of discrimination and the area under the ROC curve (AUC, or concordance index, c). The AUC is provided by PROC LOGISTIC or by the ASSOCIATION option in PROC HPLOGISTIC. A test that AUC > 0.5, indicating better discrimination than chance, is provided by specifying the ROC and ROCCONTRAST statements. Graphical assessment of discrimination is provided by plotting the ROC curve with the ROC option or PLOTS=ROC option in PROC LOGISTIC. An extension of AUC for multinomial response models is available with the MultAUC macro (SAS Note 64029), but no associated test or plot is available. Tjur's statistic is provided by the GOF option in the MODEL statement of PROC LOGISTIC or by the PARTITION statement in PROC HPLOGISTIC as shown in SAS Note 39109. Also shown in the note is the use of the Kolmogorov-Smirnov (KS) test or Wilcoxon test as a test of the discriminatory ability of the fitted model.
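
To make the calibration and discrimination measures concrete, the following Python sketch computes the Brier score, the AUC (concordance index), and Tjur's coefficient of discrimination from observed binary outcomes and predicted event probabilities. It is only an illustration of what the statistics measure, not the code that the SAS procedures use. The function names and the data are hypothetical.

```python
def brier_score(y, p):
    """Mean squared gap between 0/1 outcomes and predicted event probabilities."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def auc(y, p):
    """Share of event/nonevent pairs in which the event has the higher
    predicted probability; tied pairs count one half."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    concordant = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e in events
        for n in nonevents
    )
    return concordant / (len(events) * len(nonevents))


def tjur_d(y, p):
    """Mean predicted probability among events minus the mean among nonevents."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    return sum(events) / len(events) - sum(nonevents) / len(nonevents)


# Hypothetical outcomes (1 = event) and predicted probabilities.
y = [1, 0, 1, 1, 0, 0]
p = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]

print(round(brier_score(y, p), 4))  # lower is better calibrated
print(round(auc(y, p), 4))          # 0.5 = chance, 1 = perfect ranking
print(round(tjur_d(y, p), 4))       # 0 = no separation, 1 = perfect separation
```

For these made-up values the Brier score is about 0.138, the AUC is 8/9 (about 0.889), and Tjur's statistic is about 0.367.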

Lack of fit and overdispersion can be assessed using the Pearson and deviance statistics available in many GLM procedures (see the NOTE at the end of this usage note). Overdispersion is said to exist when there is more variability than expected under the response distribution. For binary or multinomial response data, the Pearson and deviance statistics are computed by grouping observations into subpopulations. Details and formulas for these statistics are in the procedure documentation (SAS Note 22930). You can define the subpopulations by using the AGGREGATE= option in GENMOD, LOGISTIC, or PROBIT. This is necessary if the observations in the data represent single trials or subjects. It is also important if data was collected in subpopulations defined more precisely than by the covariates in the model as further described in SAS Note 23086. For binomial response data that is already summarized (or aggregated) and for which the events/trials syntax is used in the MODEL statement, each observation is a subpopulation by default unless the AGGREGATE= option is specified. A rough indicator of fit is provided by dividing either of these statistics by its degrees of freedom. The result should be approximately equal to one when no lack of fit or overdispersion exists. When either statistic deviates substantially from one, some form of lack of fit or, for binomial or count models, overdispersion is indicated. Lindsey (1999) in SAS Note 22572 suggests that overdispersion is possible if the deviance is at least twice the degrees of freedom.
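
As a rough illustration of the ratio described above, the following Python sketch computes the Pearson and deviance statistics for events/trials data in which each row is one subpopulation, and then divides each by its degrees of freedom (number of subpopulations minus number of model parameters). The standard binomial formulas are assumed; the function name, the data, and the fitted probabilities are hypothetical, and the exact definitions used by each procedure are in its documentation.

```python
import math


def binomial_fit_stats(events, trials, fitted, n_params):
    """Pearson chi-square, deviance, and degrees of freedom for grouped
    binomial data. `fitted` holds the model's event probabilities, which
    must lie strictly between 0 and 1."""
    pearson = 0.0
    deviance = 0.0
    for y, n, p in zip(events, trials, fitted):
        expected = n * p
        pearson += (y - expected) ** 2 / (expected * (1.0 - p))
        # A zero count contributes nothing (0 * log 0 is taken as 0).
        if y > 0:
            deviance += 2.0 * y * math.log(y / expected)
        if y < n:
            deviance += 2.0 * (n - y) * math.log((n - y) / (n - expected))
    df = len(events) - n_params
    return pearson, deviance, df


# Hypothetical summarized data: five subpopulations of 30 trials each,
# with made-up fitted probabilities from a two-parameter model.
events = [3, 9, 14, 26, 28]
trials = [30, 30, 30, 30, 30]
fitted = [0.10, 0.30, 0.50, 0.80, 0.95]

pearson, deviance, df = binomial_fit_stats(events, trials, fitted, n_params=2)
print("Pearson / DF :", pearson / df)
print("Deviance / DF:", deviance / df)
```

Ratios near one are consistent with adequate fit and no overdispersion. A large gap between the two ratios is itself a warning sign of sparse data.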

The Pearson and deviance statistics are known to be chi-square distributed only in certain cases. In general their distribution is not known. For this reason, PROC GENMOD does not present p-values for these statistics. Generally, the larger the scaled statistics, the poorer the fit. In the binary response case, sufficient replication is required in all subpopulations for these statistics to be chi-square distributed. Otherwise, the data is sparse and neither statistic is a reliable indicator of fit. One sign of insufficient replication is a large difference between the two statistics. For more details, see McCullagh and Nelder (1989), the "Overdispersion" sections in the PROC LOGISTIC and PROC GENMOD documentation, and the "Lack of Fit Tests" section in the PROC PROBIT documentation.

With no or insufficient replication in the subpopulations, the Hosmer-Lemeshow test available in the LOGISTIC and HPLOGISTIC procedures provides an overall test of model fit for the binary logistic model and, beginning in SAS® 9.4M3 (TS1M3), also for multinomial logistic models. This is provided by the LACKFIT (or GOF) option in the MODEL statement. Beginning in SAS® 9.4M6 (TS1M6), the GOF option in PROC LOGISTIC offers several additional goodness of fit tests valid even with sparse data (Orme's information matrix test, Osius-Rojek test, Copas' unweighted residual sum of squares test, Spiegelhalter's test, and Stukel's test) for binary logistic models. Example 4 in SAS Note 24447 describes other tests that can be used to compare nested and nonnested models.

Statistics that can be used to compare competing GLMs, including multinomial models, are the AIC, corrected AIC (AICC), BIC (also called SC), and R² statistics. Two likelihood-based R² statistics are available for binary or multinomial models with the RSQUARE option in PROC LOGISTIC. Beginning in SAS 9.4M6, several additional R² statistics are available for binary response models in PROC LOGISTIC with the GOF option in the MODEL statement or the FITSTAT option in the SCORE statement. Tjur's coefficient of discrimination is also provided for binary response models in PROC LOGISTIC and PROC HPLOGISTIC. In most GLM procedures, AIC, AICC, and BIC are provided by default. However, tests to compare models based on these statistics are not available.
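
The information criteria mentioned above are simple functions of the maximized log likelihood. The following Python sketch assumes the commonly used definitions (AIC = -2 log L + 2p, AICC = -2 log L + 2pn/(n-p-1), BIC = -2 log L + p log n). The function name and the two sets of fit results are hypothetical, and the procedure documentation states exactly which n and p each procedure uses.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Return (AIC, AICC, BIC) under the commonly used definitions.
    Smaller values indicate the preferred model."""
    minus2ll = -2.0 * log_likelihood
    aic = minus2ll + 2.0 * n_params
    aicc = minus2ll + 2.0 * n_params * n_obs / (n_obs - n_params - 1)
    bic = minus2ll + n_params * math.log(n_obs)
    return aic, aicc, bic


# Two hypothetical competing models fit to the same 50 observations.
for label, loglik, p in [("Model A", -100.0, 3), ("Model B", -98.5, 6)]:
    aic, aicc, bic = information_criteria(loglik, p, n_obs=50)
    print(label, round(aic, 2), round(aicc, 2), round(bic, 2))
```

In this made-up comparison the larger model has the higher log likelihood but is penalized for its extra parameters. As noted above, the criteria rank the models but do not provide a test.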

Assessing fit at the observation level

Note that a model might fit some observations well but not others. Some observations might not be well fit because additional predictors or higher-order terms are needed in the model. There might also be outliers in the data that no reasonable change to the model specification can accommodate. To assess the fit of a binary response model at the observation level, you can examine the residuals provided by options in the OUTPUT statement in the GENMOD and LOGISTIC procedures.

In general, the distributions of the diagnostic statistics are unknown, so well-established cutoffs are not available. Assessment is typically done by plotting the diagnostics and looking for values that are far removed from the others. Collett (2003) recommends standardized deviance residuals (STDRESDEV= in GENMOD) or likelihood residuals (RESLIK= in GENMOD), stating that these two residuals perform similarly and are well-approximated by the standard normal distribution. As such, most values should lie between -2 and 2. McCullagh and Nelder (1989) also recommend standardized deviance residuals.

In logistic models, Hosmer and Lemeshow (2000) in SAS Note 22592 discuss residuals, diagnostics, useful plots, and their interpretation. They suggest that when there is sufficient replication at the settings of the predictors (such data would be analyzed using the events/trials syntax for summarized data), then the squared standardized Pearson residuals (DIFCHISQ= in LOGISTIC or squared STDRESCHI= in GENMOD), the deviance change values (DIFDEV= in LOGISTIC), and the squared standardized deviance residuals (squared STDRESDEV= in GENMOD) are approximately chi-square distributed with 1 degree of freedom so that values would generally be less than 4.

For multinomial models, predicted probabilities for each observation are available via options in the OUTPUT statements of the GENMOD, LOGISTIC, and PROBIT procedures.

Assessing fit in Generalized Estimating Equations (GEE) models

GEE models for clustered or longitudinal data can be fit by specifying the REPEATED statement in PROC GENMOD and (beginning in SAS® 9.4M2) in PROC GEE. For GEE models, no test of overall fit is currently available. Pearson and deviance statistics, if displayed, apply only to the initial model that begins the GEE estimation, not to the final GEE model. However, a generalization of the R-square statistic suitable for GEE models has been proposed and is discussed in SAS Note 67880. Also, a comparative statistic similar to AIC, known as QIC, is provided in PROC GENMOD and PROC GEE. You can assess fit of the GEE model at the observation or cluster level by using statistics available from the OUTPUT statement in GENMOD.

The ASSESS statement in GENMOD can be used to determine adequacy of the link function or whether the functional form of a predictor in the model is correct. See the example titled "Assessment of a Marginal Model for Dependent Data" in the GENMOD documentation. Deletion diagnostics and plots are provided for GEE models to assess the effects of deleting entire clusters. Deletion diagnostics are available in the OUTPUT statement of GENMOD and plots are provided by the PLOTS= option in the PROC GENMOD statement.

For GEE models using the binomial distribution, the ROC curve and the area below it can be used to assess the model fit. This can be done by saving the predicted probabilities and using them in the PRED= option of the ROC statement in PROC LOGISTIC in the same manner as shown in SAS Note 41364 for various types of binomial models. Competing models can be compared by using multiple ROC statements to import the predicted probabilities of the models and including the ROCCONTRAST statement similar to the example titled "Comparing Receiver Operating Characteristic Curves" in the LOGISTIC documentation.

Overdispersion (and underdispersion)

As noted above, overdispersion is said to exist when there is more variability than expected under the response distribution. In addition to lack of fit, overdispersion can also be detected by the Pearson and deviance statistics. For example, in Poisson models the variability should be equal to the mean because the mean and variance are identical in this distribution. When the data is more variable than that, it is said to be overdispersed. One way to adjust for overdispersion is to estimate a dispersion parameter (called the heterogeneity factor in PROC PROBIT) and inflate the covariance matrix of the parameter estimates by this factor. The dispersion parameter can be estimated by the Pearson or deviance statistic divided by its degrees of freedom (use the SCALE=PEARSON or SCALE=DEVIANCE option in the GENMOD, LOGISTIC, or PROBIT procedures). Because this ratio also indicates lack of fit, you should eliminate the possibility of poor fit before relying on this adjustment.
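
The scale adjustment described above can be sketched numerically. The following Python example estimates the dispersion of a Poisson model as the Pearson chi-square divided by its degrees of freedom and then inflates the standard errors accordingly: multiplying the covariance matrix by the dispersion multiplies each standard error by its square root. The function names, counts, fitted means, and standard errors are hypothetical.

```python
import math


def pearson_dispersion(counts, fitted_means, n_params):
    """Pearson chi-square divided by its degrees of freedom for a Poisson
    model, where the variance of each count is assumed equal to its mean."""
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted_means))
    return chi_square / (len(counts) - n_params)


def scaled_standard_errors(standard_errors, dispersion):
    """Inflate each standard error by the square root of the dispersion."""
    return [se * math.sqrt(dispersion) for se in standard_errors]


# Hypothetical counts from two groups, with fitted means from a
# two-parameter Poisson model (here, simply the group means).
counts = [0, 2, 4, 0, 6, 12]
fitted_means = [2, 2, 2, 6, 6, 6]

phi = pearson_dispersion(counts, fitted_means, n_params=2)
print(phi)                                        # 4.0
print(scaled_standard_errors([0.41, 0.47], phi))  # [0.82, 0.94]
```

A dispersion of 4 doubles the standard errors, so Wald tests that looked significant under the unadjusted Poisson model might no longer be.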

Overdispersion is common in count response models that typically use the Poisson or negative binomial distribution. Underdispersion can also occur. Most of the methods and models for count data discussed below are illustrated in SAS Note 56549.

You can test for overdispersion in a Poisson model by using the DIST=NEGBIN, SCALE=0, and NOSCALE options in the MODEL statement of PROC GENMOD. When used together, these options test whether overdispersion of the form μ+kμ² exists by testing whether the negative binomial dispersion parameter, k, is zero. When k=0, the negative binomial distribution is equivalent to the Poisson distribution. A significant test for k suggests preference for the negative binomial model. In this case, use the DIST=NEGBIN option to fit the negative binomial model, omitting the SCALE= and NOSCALE options to allow estimation of the dispersion parameter.

PROC COUNTREG (in SAS/ETS®) also offers a test of the overdispersion parameter (labeled _Alpha) when fitting the negative binomial model. For count models using the Poisson or negative binomial distributions, PROC GENMOD (beginning in SAS 9.4M3) and PROC COUNTREG (beginning in SAS® 9.4M1) provide a diagnostic plot for the type of overdispersion caused by excessive zero counts. This plot is obtained by fitting a zero-inflated model (zero-inflated Poisson (DIST=ZIP) or zero-inflated negative binomial (DIST=ZINB)) and specifying the PLOTS=OVERDISP option (in GENMOD) or the PLOTS=DISPERSION option (in COUNTREG). The resulting graph plots the predicted variance against the predicted mean under the zero-inflated model. For the ZIP model, deviation from the diagonal indicates overdispersion due to excessive zeros. For the ZINB model, deviation indicates excessive zeros, the more general overdispersion of the form kμ² allowed by the dispersion parameter, k, in the negative binomial distribution, or both. Beginning in SAS 9.4M1, an overdispersion plot is also available in PROC COUNTREG for negative binomial and zero-inflated count models by specifying the PLOTS=DISPERSION option.
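
To show why the variance departs from the mean in these models, the following Python sketch evaluates the mean and variance implied by the Poisson, negative binomial, ZIP, and ZINB distributions. It assumes the usual parameterization: a base count distribution with mean μ and variance μ+kμ², mixed with an extra point mass at zero. The function name and the parameter values are hypothetical, and the sketch does not reproduce the plots that the procedures draw.

```python
def zero_inflated_moments(mu, k=0.0, zero_prob=0.0):
    """Mean and variance of a count whose base distribution has mean `mu`
    and variance mu + k*mu**2, mixed with an extra point mass `zero_prob`
    at zero. k=0 gives the Poisson; zero_prob=0 gives no zero inflation."""
    mean = (1.0 - zero_prob) * mu
    variance = mean * (1.0 + mu * (zero_prob + k))
    return mean, variance


print(zero_inflated_moments(4.0))                         # Poisson: (4.0, 4.0)
print(zero_inflated_moments(4.0, k=0.5))                  # NB:      (4.0, 12.0)
print(zero_inflated_moments(4.0, zero_prob=0.25))         # ZIP:     (3.0, 6.0)
print(zero_inflated_moments(4.0, k=0.5, zero_prob=0.25))  # ZINB:    (3.0, 12.0)
```

Only the Poisson case has variance equal to the mean. Either a positive k or a positive zero-inflation probability pushes the variance above the mean.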

Williams' method can be used with binomial (events/trials) data in PROC LOGISTIC (SCALE=WILLIAMS option) to estimate an overdispersion parameter and adjust the parameter covariance matrix. This is illustrated in the example titled "Overdispersion" in the LOGISTIC documentation (SAS Note 22930).

You can also use alternative models that accommodate and adjust for overdispersion. For binomial (events/trials) data, the beta-binomial model (SAS Note 52285), the zero-inflated binomial model (SAS Note 52161), and the binomial cluster model can be used. The binomial cluster model is discussed and illustrated in the example titled "Modeling Mixing Probabilities: All Mice Are Created Equal, but Some Are More Equal" in the FMM procedure documentation (SAS Note 22930). For multinomial data, the multinomial cluster model is available beginning with SAS 9.4M2 in PROC FMM. This model is illustrated in the example titled "Modeling Multinomial Overdispersion: Town and Country." For count data, the zero-inflated Poisson, the negative binomial, the zero-inflated negative binomial, the generalized Poisson (SAS Note 56549), and the Conway-Maxwell Poisson models can be used. Poisson, negative binomial and corresponding zero-inflated models can all be fit using the GENMOD, COUNTREG, or FMM procedures. The Conway-Maxwell Poisson (CMP) model can be fit in PROC COUNTREG beginning in SAS 9.4M1. The CMP model can also be used to model underdispersed data. The DISPMODEL statement in PROC COUNTREG enables you to model the dispersion separately from the mean. The generalized Poisson model, which can also be used in cases of underdispersion, can be fit in the NLMIXED, FMM, and GLIMMIX procedures. Hurdle models (SAS Note 48506) can also be used for underdispersed cases. In binary or normal response models, the dispersion can be modeled separately from the mean by using the HETERO statement in the QLIM procedure in SAS/ETS software.

Another way of dealing with overdispersion is to use the generalized estimating equations (GEE) method to estimate the model. GEE is typically used when subjects or objects are repeatedly measured, but it can be used even when there is only a single measurement per subject. The robust standard errors of the GEE method provide an adjustment for overdispersion as discussed in Stokes et al. (SAS Note 22572).

__________

NOTE: To request the Pearson and deviance statistics, specify the SCALE=NONE option in the MODEL statement in PROC LOGISTIC or the LACKFIT option in PROC PROBIT.

Operating System and Release Information

Product Family	Product	System	SAS Release
			Reported	Fixed*
SAS System	SAS/STAT	All	n/a

* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.

Type:	Usage Note
Priority:	low
Topic:	SAS Reference ==> Procedures ==> GENMOD; SAS Reference ==> Procedures ==> PROBIT; Analytics ==> Longitudinal Analysis; Analytics ==> Categorical Data Analysis; SAS Reference ==> Procedures ==> LOGISTIC; SAS Reference ==> Procedures ==> COUNTREG; SAS Reference ==> Procedures ==> FMM; SAS Reference ==> Procedures ==> GEE; Analytics ==> Regression; SAS Reference ==> Procedures ==> GLIMMIX; SAS Reference ==> Procedures ==> HPCOUNTREG; SAS Reference ==> Procedures ==> HPFMM; SAS Reference ==> Procedures ==> HPLOGISTIC; SAS Reference ==> Procedures ==> NLMIXED; SAS Reference ==> Procedures ==> QLIM; SAS Reference ==> Procedures ==> HPGENSELECT; SAS Reference ==> Procedures ==> GAM; SAS Reference ==> Procedures ==> GAMPL

Date Modified:	2024-03-27 17:04:07
Date Created:	2002-12-16 10:56:38
