The SURVEYPHREG Procedure

Residuals

This section describes how the residuals that you can request in the OUTPUT statement (RESMART, RESDEV, RESSCH, and RESSCO) are computed. See the section Notation and Estimation for definitions of the notation used here. The computation depends on the TIES= option in the MODEL statement.

TIES=BRESLOW

This is the default option. Let

\[  S^{(r)}(\bbeta ,t) = \sum _{A} w_{hij}y_{hij}(t) \exp \left( \bbeta '\bZ _{hij}(t) \right) \bZ _{hij}^{\bigotimes r}(t)  \]
\[  \bar{\bZ }(\bbeta ,t) = \frac{S^{(1)}(\bbeta ,t)}{S^{(0)}(\bbeta ,t)}  \]

where $r = 0, 1$ and $A$ is the set of indices in the selected sample.

Further let

\[  d\Lambda _0(\bbeta ,t) = \sum _{A} \frac{ w_{hij} dn_{hij}(t)}{S^{(0)}(\bbeta ,t)}  \]
\[  dM_{hij}(\bbeta ,t) = dn_{hij}(t) - y_{hij}(t) \exp \left( \bbeta ' \bZ _{hij}(t) \right) d\Lambda _0(\bbeta ,t)  \]

The martingale residual at t is defined as

\[  \begin{aligned}
\hat{M}_{hij}(t) &= \int _0^ t dM_{hij}(\hat{\bbeta },\tau ) \\
&= n_{hij}(t) - \int _0^ t y_{hij}(\tau ) \exp \left( \hat{\bbeta }'\bZ _{hij}(\tau ) \right) d\Lambda _0(\hat{\bbeta }, \tau )
\end{aligned}  \]

Here $\hat{M}_{hij}(t)$ estimates the difference over $(0,t]$ between the observed number of events for the $(h,i,j)$ observation unit and a conditional expected number of events. The quantity $\hat{M}_{hij} \equiv \hat{M}_{hij}(\infty )$ is referred to as the martingale residual for the $(h, i, j)$ observation unit. For the Cox model with no time-dependent explanatory variables, the martingale residual for the $(h,i,j)$ unit with observation time $t_{(h,i,j)}$ and event status $\Delta _{(h,i,j)}$ is

\[  \hat{M}_{(h,i,j)} = \Delta _{(h,i,j)} - \mr {e}^{\hat{\bbeta }'\bZ _{(h,i,j)}} \int _0^{t_{(h,i,j)}}d\Lambda _0(\hat{\bbeta },s)  \]
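The formula for time-independent covariates can be illustrated with a small Python sketch. It is not the procedure's implementation: the function name `breslow_martingale`, the list-based argument layout, and the assumption of one right-censored record per observation unit are all hypothetical choices made for illustration.

```python
import math

def breslow_martingale(times, events, weights, covariates, beta):
    """Weighted Breslow martingale residuals, time-fixed covariates (sketch).

    times[i], events[i] (1 = event, 0 = censored), weights[i], and
    covariates[i] (a list) describe one observation unit. All names are
    hypothetical and chosen only for this illustration.
    """
    # exp(beta'Z) for every unit
    risk = [math.exp(sum(b * z for b, z in zip(beta, row))) for row in covariates]

    # Baseline hazard increment dLambda_0 at each distinct event time:
    # weighted number of events divided by S^(0) over the risk set.
    increments = {}
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        s0 = sum(w * r for w, r, u in zip(weights, risk, times) if u >= t)
        dn = sum(w for w, u, e in zip(weights, times, events) if u == t and e == 1)
        increments[t] = dn / s0

    # Residual = event indicator minus exp(beta'Z) times the cumulative
    # baseline hazard up to the unit's own observation time.
    residuals = []
    for t_i, e_i, r_i in zip(times, events, risk):
        cumulative = sum(inc for t, inc in increments.items() if t <= t_i)
        residuals.append(e_i - r_i * cumulative)
    return residuals

example = breslow_martingale(
    times=[1, 2, 3], events=[1, 0, 1], weights=[2.0, 1.0, 1.5],
    covariates=[[1.0], [0.0], [1.0]], beta=[0.5],
)
print(example)
```

Because $d\Lambda _0$ divides the weighted event count by $S^{(0)}$, the weighted residuals in this sketch sum to zero (up to rounding) for any value of the coefficient vector.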

The deviance residual $D_{hij}$ for the $(h,i,j)$ observation unit is a transformation of the corresponding martingale residual:

\[  D_{hij}= \text {sign}(\hat{M}_{hij})\sqrt {2 \biggl [ -\hat{M}_{hij} - n_{hij}(\infty ) \log \biggl ( \frac{n_{hij}(\infty ) - \hat{M}_{hij}}{n_{hij}(\infty )} \biggr ) \biggr ]}  \]

The square root shrinks large negative martingale residuals, while the logarithmic transformation expands martingale residuals that are close to unity. As such, the deviance residuals are more symmetrically distributed around zero than the martingale residuals. For the Cox model, the deviance residual reduces to the form

\[  D_{hij}= \text {sign}(\hat{M}_{hij})\sqrt {2 [ -\hat{M}_{hij} - \Delta _{hij} \log ( \Delta _{hij} - \hat{M}_{hij})]}  \]
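As a minimal sketch of the Cox-model form, the following Python function maps a martingale residual and an event indicator to a deviance residual. The function name `deviance_residual` is hypothetical. Two conventions are assumptions of this sketch and are not stated in the text: the term $\Delta \log (\Delta - \hat{M})$ is treated as zero when $\Delta = 0$, and the quantity under the square root is clamped at zero to guard against rounding error.

```python
import math

def deviance_residual(martingale, event):
    """Deviance residual for the Cox model from a martingale residual (sketch).

    event is 1 for an event and 0 for a censored unit; a valid martingale
    residual is below 1 when event == 1 and at most 0 when event == 0.
    """
    inner = -martingale
    if event == 1:
        # Delta * log(Delta - M) with Delta = 1
        inner -= math.log(1.0 - martingale)
    # Assumption: clamp tiny negative values caused by floating-point rounding.
    inner = max(inner, 0.0)
    sign = (martingale > 0) - (martingale < 0)
    return sign * math.sqrt(2.0 * inner)

# A censored unit and an event unit (hypothetical residual values)
print(deviance_residual(-0.5, 0))
print(deviance_residual(2 / 3, 1))
```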

The Schoenfeld (1982) residual vector is calculated on a per-event-time basis. At the kth event time $t_{hij,k}$ of the $(h,i,j)$ observation unit, the Schoenfeld residual

\[  \hat{\bU }_{hij}(t_{hij,k}) = \bZ _{hij}(t_{hij,k}) - \bar{\bZ }(\hat{\bbeta },t_{hij,k})  \]

is the difference between the observed covariate vector for the $(h,i,j)$ observation unit and the average of the covariate vectors over the risk set at $t_{hij,k}$. Under the proportional hazards assumption, the Schoenfeld residuals have the sample path of a random walk; therefore, they are useful in assessing time trend or lack of proportionality.
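The Schoenfeld residual for the Breslow case can be sketched in Python as follows. This is an illustration under assumptions, not the procedure's code: the name `schoenfeld_residuals` and the argument layout are hypothetical, covariates are time-fixed, and each observation unit contributes at most one event.

```python
import math

def schoenfeld_residuals(times, events, weights, covariates, beta):
    """Breslow-type Schoenfeld residuals, one vector per event (sketch).

    Returns a list of (unit index, residual vector) pairs for the units
    that have an event. All names are hypothetical.
    """
    n, p = len(times), len(beta)
    risk = [math.exp(sum(b * z for b, z in zip(beta, row))) for row in covariates]
    out = []
    for i in range(n):
        if events[i] != 1:
            continue  # residuals are defined only at event times
        at_risk = [k for k in range(n) if times[k] >= times[i]]
        s0 = sum(weights[k] * risk[k] for k in at_risk)
        # Weighted average covariate vector over the risk set: S^(1) / S^(0)
        zbar = [
            sum(weights[k] * risk[k] * covariates[k][q] for k in at_risk) / s0
            for q in range(p)
        ]
        out.append((i, [covariates[i][q] - zbar[q] for q in range(p)]))
    return out

print(schoenfeld_residuals(
    times=[1, 2, 3], events=[1, 1, 1], weights=[1, 1, 1],
    covariates=[[1.0], [2.0], [3.0]], beta=[0.0],
))
```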

The score process for the $(h,i,j)$ observation unit at time $t$ is

\[  \bL _{hij}(\bbeta ,t) = \int _{0}^{t} [\bZ _{hij}(\tau ) - \bar{\bZ }(\bbeta ,\tau )] dM_{hij}(\bbeta , \tau )  \]

The vector $\hat{\bL }_{hij} \equiv \bL _{hij}(\hat{\bbeta },\infty )$ is the score residual for the $(h,i,j)$ observation unit.

The score residuals are a decomposition of the first partial derivative of the log likelihood. They are useful in assessing the influence of each subject on individual parameter estimates. They also play an important role in the computation of the variance estimators.
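To make the decomposition concrete, here is a hedged Python sketch of the Breslow score residuals for time-fixed covariates. The function name `score_residuals` and the input layout are hypothetical, and the integral is evaluated as a sum over the distinct event times up to each unit's observation time.

```python
import math

def score_residuals(times, events, weights, covariates, beta):
    """Breslow score residuals L_hij(beta, infinity) (sketch, hypothetical names)."""
    n, p = len(times), len(beta)
    risk = [math.exp(sum(b * z for b, z in zip(beta, row))) for row in covariates]

    # For each distinct event time, store (t, Z-bar, dLambda_0).
    steps = []
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        at_risk = [k for k in range(n) if times[k] >= t]
        s0 = sum(weights[k] * risk[k] for k in at_risk)
        zbar = [
            sum(weights[k] * risk[k] * covariates[k][q] for k in at_risk) / s0
            for q in range(p)
        ]
        dn = sum(weights[k] for k in range(n) if times[k] == t and events[k] == 1)
        steps.append((t, zbar, dn / s0))

    residuals = []
    for i in range(n):
        row = [0.0] * p
        for t, zbar, dlam in steps:
            if t > times[i]:
                break  # the unit is no longer at risk
            dn_i = 1 if (events[i] == 1 and times[i] == t) else 0
            dm = dn_i - risk[i] * dlam  # martingale increment dM at t
            for q in range(p):
                row[q] += (covariates[i][q] - zbar[q]) * dm
        residuals.append(row)
    return residuals

print(score_residuals(
    times=[1, 2, 3], events=[1, 1, 1], weights=[1, 1, 1],
    covariates=[[1.0], [2.0], [3.0]], beta=[0.0],
))
```

In this sketch the weighted sum of the score residuals equals the weighted score function evaluated at the supplied coefficient vector, so it is zero when that vector solves the weighted score equations.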

TIES=EFRON

For TIES=EFRON, the preceding computation is modified to comply with the Efron partial likelihood. For a given uncensored time t, let $\delta _{hij}(t) = 1$ if t is an event time for the $(h,i,j)$ observation, and 0 otherwise. Let $d(t)=\sum _{hij \in A} \delta _{hij}(t)$, which is the number of observation units that have an event at t. For $1 \leq l \leq d(t)$, let

\[  S^{(r)}(\bbeta ,l,t) = \sum _{A} w_{hij} y_{hij}(t) \biggl \{  1- \frac{l-1}{d(t)} \delta _{hij}(t) \biggr \}  \exp \left( \bbeta '\bZ _{hij}(t) \right) \bZ _{hij}^{\bigotimes r}(t)  \]
\[  \bar{\bZ }(\bbeta ,l,t) = \frac{ S^{(1)}(\bbeta ,l,t) }{ S^{(0)}(\bbeta ,l,t) }  \]
\[  d\Lambda _0(\bbeta ,l,t) = \sum _{A} \frac{w_{hij} dn_{hij}(t)}{S^{(0)}(\bbeta ,l,t)}  \]
\[  dM_{hij}(\bbeta ,l,t) = dn_{hij}(t) - y_{hij}(t) \biggl ( 1- \delta _{hij}(t) \frac{l-1}{d(t)} \biggr ) \exp \left(\bbeta ' \bZ _{hij}(t) \right) d\Lambda _0(\bbeta ,l,t)  \]

where $r = 0, 1$ and $A$ is the set of indices in the selected sample.

The martingale residual at t for the $(h,i,j)$ observation unit is defined as

\[  \begin{aligned}
\hat{M}_{hij}(t) &= \int _0^ t \frac{1}{d(\tau )} \sum _{l=1}^{d(\tau )} dM_{hij}(\hat{\bbeta },l,\tau ) \\
&= n_{hij}(t) - \int _0^ t \frac{1}{d(\tau )} \sum _{l=1}^{d(\tau )} y_{hij}(\tau ) \biggl ( 1- \delta _{hij}(\tau ) \frac{l-1}{d(\tau )} \biggr ) \exp \left( \hat{\bbeta }' \bZ _{hij}(\tau ) \right) d\Lambda _0(\hat{\bbeta },l,\tau )
\end{aligned}  \]

Deviance residuals are computed by using the same transform on the corresponding martingale residuals as in TIES=BRESLOW.
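The Efron martingale residual defined above can be sketched in Python for time-fixed covariates. The function name `efron_martingale` and the list-based inputs are hypothetical, and the code is an illustration of the formulas rather than the procedure's implementation. When no event times are tied, $d(t) = 1$ and the result coincides with the Breslow residual.

```python
import math

def efron_martingale(times, events, weights, covariates, beta):
    """Efron-type weighted martingale residuals (sketch, hypothetical names)."""
    n = len(times)
    risk = [math.exp(sum(b * z for b, z in zip(beta, row))) for row in covariates]
    residuals = [float(e) for e in events]  # start from n_hij(infinity)

    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        tied = {k for k in range(n) if times[k] == t and events[k] == 1}
        d = len(tied)                          # d(t): number of events at t
        dn = sum(weights[k] for k in tied)     # weighted event count at t
        total = sum(weights[k] * risk[k] for k in range(n) if times[k] >= t)
        tied_total = sum(weights[k] * risk[k] for k in tied)
        for l in range(1, d + 1):
            frac = (l - 1) / d
            s0 = total - frac * tied_total     # S^(0)(beta, l, t)
            dlam = dn / s0                     # dLambda_0(beta, l, t)
            for i in range(n):
                if times[i] < t:
                    continue                   # unit i is not at risk at t
                delta = 1 if i in tied else 0
                # Average the compensator increment over l = 1, ..., d(t).
                residuals[i] -= (1 - delta * frac) * risk[i] * dlam / d
    return residuals

print(efron_martingale(
    times=[1, 1, 2], events=[1, 1, 1], weights=[1, 1, 1],
    covariates=[[0.0], [0.0], [0.0]], beta=[0.0],
))
```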

The Schoenfeld residual vector for the $(h,i,j)$ observation unit at event time $t_{hij,k}$ is

\[  \hat{\bU }_{hij}(t_{hij,k}) = \bZ _{hij}(t_{hij,k}) - \frac{1}{d(t_{hij,k})}\sum _{l=1}^{d(t_{hij,k})} \bar{\bZ }(\hat{\bbeta },l,t_{hij,k})  \]
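A minimal Python sketch of this Efron-type Schoenfeld residual follows. It assumes time-fixed covariates and at most one event per observation unit; the function name `efron_schoenfeld` and the argument layout are hypothetical and chosen only for illustration.

```python
import math

def efron_schoenfeld(times, events, weights, covariates, beta):
    """Efron-type Schoenfeld residuals, one vector per event (sketch)."""
    n, p = len(times), len(beta)
    risk = [math.exp(sum(b * z for b, z in zip(beta, row))) for row in covariates]
    out = []
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        tied = [k for k in range(n) if times[k] == t and events[k] == 1]
        at_risk = [k for k in range(n) if times[k] >= t]
        d = len(tied)
        zbar_avg = [0.0] * p
        for l in range(1, d + 1):
            frac = (l - 1) / d
            # Units with an event at t are down-weighted by 1 - (l-1)/d(t).
            factor = {k: 1.0 - (frac if k in tied else 0.0) for k in at_risk}
            s0 = sum(weights[k] * factor[k] * risk[k] for k in at_risk)
            for q in range(p):
                s1 = sum(
                    weights[k] * factor[k] * risk[k] * covariates[k][q]
                    for k in at_risk
                )
                zbar_avg[q] += s1 / s0 / d  # average Z-bar over l = 1..d(t)
        for i in tied:
            out.append((i, [covariates[i][q] - zbar_avg[q] for q in range(p)]))
    return out

print(efron_schoenfeld(
    times=[1, 1, 2], events=[1, 1, 1], weights=[1, 1, 1],
    covariates=[[1.0], [3.0], [5.0]], beta=[0.0],
))
```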

The score process for the $(h,i,j)$ observation unit at time t is

\[  \bL _{hij}(\bbeta ,t) = \int _0^ t \frac{1}{d(\tau )} \sum _{l=1}^{d(\tau )} \biggl (\bZ _{hij}(\tau ) - \bar{\bZ }(\bbeta ,l,\tau ) \biggr ) dM_{hij}(\bbeta ,l,\tau )  \]