The GENMOD Procedure

Assessment of Models Based on Aggregates of Residuals

Lin, Wei, and Ying (2002) present graphical and numerical methods for model assessment based on the cumulative sums of residuals over certain coordinates (such as covariates or linear predictors) or some related aggregates of residuals. The distributions of these stochastic processes under the assumed model can be approximated by the distributions of certain zero-mean Gaussian processes whose realizations can be generated by simulation. Each observed residual pattern can then be compared, both graphically and numerically, with a number of realizations from the null distribution. Such comparisons enable you to assess objectively whether the observed residual pattern reflects anything beyond random fluctuation. These procedures are useful in determining appropriate functional forms of covariates and link function. You use the ASSESS|ASSESSMENT statement to perform this kind of model-checking with cumulative sums of residuals, moving sums of residuals, or LOESS smoothed residuals. See Example 40.8 and Example 40.9 for examples of model assessment.

Let the model for the mean be

\[  g(\mu _ i) = \mb {x}_ i^\prime \bbeta  \]

where $\mu _ i$ is the mean of the response $y_ i$ and $\mb {x}_ i$ is the vector of covariates for the ith observation. Denote the raw residual resulting from fitting the model as

\[  e_ i = y_ i - \hat{\mu }_ i  \]

and let $x_{ij}$ be the value of the jth covariate in the model for observation i. Then to check the functional form of the jth covariate, consider the cumulative sum of residuals with respect to $x_{ij}$,

\[  W_ j(x) = \frac{1}{\sqrt {n}}\sum _{i=1}^ n I(x_{ij} \le x) e_ i  \]

where $I()$ is the indicator function. For any x, $W_ j(x)$ is the sum of the residuals with values of $x_ j$ less than or equal to x.

Denote the score, or gradient vector, by

\[  U(\bbeta ) = \sum _{i=1}^ n h(\mb {x}^\prime \bbeta )\mb {x}_ i (y_ i - \nu (\mb {x}^\prime \bbeta ))  \]

where $\nu (r) = g^{-1}(r)$, and

\[  h(r) = \frac{1}{g\prime (\nu (r))V(\nu (r))}  \]

Let $\bJ $ be the Fisher information matrix

\[  \bJ (\bbeta ) = -\frac{\partial U(\bbeta )}{\partial \bbeta ^\prime }  \]

Define

\[  \hat{W}_ j(x) = \frac{1}{\sqrt {n}}\sum _{i=1}^ n [I(x_{ij} \le x) + \bm {\eta }^\prime (x;\hat{\bbeta })\bJ ^{-1}(\hat{\bbeta })\mb {x}_ i h(\mb {x}^\prime \hat{\bbeta })]e_ i Z_ i  \]

where

\[  \bm {\eta }(x;\bbeta ) = -\sum _{i=1}^ n I(x_{ij} \le x) \frac{\partial \nu (\mb {x}_ i^\prime \bbeta )}{\partial \bbeta }  \]

and $Z_ i$ are independent $N(0,1)$ random variables. Then the conditional distribution of $\hat{W}_ j(x)$, given $(y_ i, \mb {x}_ i), i = 1,\ldots ,n$, under the null hypothesis $H_0$ that the model for the mean is correct, is the same asymptotically as $n\rightarrow \infty $ as the unconditional distribution of $W_ j(x)$ (Lin, Wei, and Ying, 2002).

You can approximate realizations from the null hypothesis distribution of $W_ j(x)$ by repeatedly generating normal samples $Z_ i, i = 1,\ldots ,n$, while holding $(y_ i, \mb {x}_ i), i = 1,\ldots ,n$, at their observed values and computing $\hat{W}_ j(x)$ for each sample.

You can assess the functional form of covariate j by plotting a few realizations of $\hat{W}_ j(x)$ on the same plot as the observed $W_ j(x)$ and visually comparing to see how typical the observed $W_ j(x)$ is of the null distribution samples.

You can supplement the graphical inspection method with a Kolmogorov-type supremum test. Let $s_ j$ be the observed value of $S_ j = \sup _ x|W_ j(x)|$. The p-value $\mr {Pr}[S_ j \ge s_ j]$ is approximated by $\mr {Pr}[\hat{S}_ j \ge s_ j]$, where $\hat{S}_ j = \sup _ x|\hat{W}_ j(x)|$. $\mr {Pr}[\hat{S}_ j \ge s_ j]$ is estimated by generating realizations of $\hat{W}_ j(.)$ (1,000 is the default number of realizations).

You can check the link function instead of the jth covariate by using values of the linear predictor $\mb {x}_ i^\prime \hat{\bbeta }$ in place of values of the jth covariate $x_{ij}$. The graphical and numerical methods described previously are then sensitive to inadequacies in the link function.

An alternative aggregate of residuals is the moving sum statistic

\[  W_ j(x,b) = \frac{1}{\sqrt {n}}\sum _{i=1}^ n I(x-b\le x_{ij}\le x)e_ i  \]

If you specify the keyword WINDOW(b), then the moving sum statistic with window size b is used instead of the cumulative sum of residuals, with $I(x-b\le x_{ij}\le x)$ replacing $I(x_{ij}\le x)$ in the earlier equation.

If you specify the keyword LOESS(f), loess smoothed residuals are used in the preceding formulas, where f is the fraction of the data to be used at a given point. If f is not specified, $f=\frac{1}{3}$ is used. For data $(Y_ i, X_ i), i = 1,\ldots ,n$, define r as the nearest integer to $nf$ and h as the rth smallest among $|X_ i-x|, i=1,\ldots ,n$. Let

\[  K_ i(x) = K\left(\frac{X_ i-x}{h}\right) \]

where

\[  K(t) = \frac{70}{81}(1-|t|^3)^3I(-1\le t \le 1) \]

Define

\[ w_ i(x) = K_ i(x)[S_2(x)-(X_ i-x)S_1(x)]  \]

where

\[ S_1(x) = \sum _{i=1}^ n K_ i(x)(X_ i-x)  \]
\[ S_2(x) = \sum _{i=1}^ n K_ i(x)(X_ i-x)^2  \]

Then the loess estimate of Y at x is defined by

\[  \hat{Y}(x) = \sum _{i=1}^ n\frac{w_ i(x)}{\sum _{i=1}^ nw_ i(x)}Y_ i  \]

Loess smoothed residuals for checking the functional form of the jth covariate are defined by replacing $Y_ i$ with $e_ i$ and $X_ i$ with $x_{ij}$. To implement the graphical and numerical assessment methods, $I(x_{ij} \le x )$ is replaced with $ \frac{w_ i(x)}{\sum _{i=1}^ nw_ i(x)} $ in the formulas for $W_ j(x)$ and $\hat{W}_ j(x)$.

You can perform the model checking described earlier for marginal models for dependent responses fit by generalized estimating equations (GEEs). Let $y_{ik}$ denote the kth measurement on the ith cluster, $i=1,\ldots ,K$, $k=1,\ldots ,n_ i$, and let $\mb {x}_{ik}$ denote the corresponding vector of covariates. The marginal mean of the response $\mu _{ik}=\mr {E}(y_{ik})$ is assumed to depend on the covariate vector by

\[  g(\mu _{ik})=\mb {x}^\prime _{ik}\bbeta  \]

where g is the link function.

Define the vector of residuals for the ith cluster as

\[  \mb {e}_ i = (e_{i1},\ldots ,e_{in_ i})^\prime = (y_{i1}-\hat{\mu }_{i1},\ldots ,y_{in_ i}-\hat{\mu }_{in_ i})^\prime  \]

You use the following extension of $W_ j(x)$ defined earlier to check the functional form of the jth covariate:

\[ W_ j(x)=\frac{1}{\sqrt {K}}\sum _{i=1}^ K\sum _{k=1}^{n_ i}I( x_{ikj} \le x)e_{ik}  \]

where $x_{ikj}$ is the jth component of $\mb {x}_{ik}$.

The null distribution of $W_ j(x)$ can be approximated by the conditional distribution of

\[  \hat{W}_ j(x)= \frac{1}{\sqrt {K}}\sum _{i=1}^ K\left\{ \sum _{k=1}^{n_ i}I( x_{ikj} \le x)e_{ik} + \bm {\eta }^\prime (x,\hat{\bbeta }) \mb {I}_0^{-1} \hat{\mb {D}}^\prime _ i\hat{\mb {V}}_ i^{-1}\mb {e}_ i \right\} Z_ i  \]

where $\hat{\mb {D}}_ i$ and $\hat{\mb {V}}_ i$ are defined as in the section Generalized Estimating Equations with the unknown parameters replaced by their estimated values,

\[  \bm {\eta }(x,\bbeta ) = -\sum _{i=1}^ K\sum _{k=1}^{n_ i} I(x_{ikj} \le x)\frac{\partial \mu _{ik}}{\partial \bbeta }  \]
\[  \mb {I}_0 = \sum _{i=1}^ K\hat{\mb {D}}^\prime _ i\hat{\mb {V}}^{-1}_ i\hat{\mb {D}}_ i  \]

and $Z_ i, i=1,\ldots ,K$, are independent $N(0,1)$ random variables. You replace $x_{ikj}$ with the linear predictor $\mb {x}^\prime _{ik}\hat{\bbeta }$ in the preceding formulas to check the link function.