The PHREG Procedure

Schemper-Henderson Predictive Measure

Measures of predictive accuracy of regression models quantify the extent to which covariates determine an individual outcome. The predictive accuracy measure proposed by Schemper and Henderson (2000) is defined as the difference between individual survival processes and the fitted survivor function.

For the ith individual ( $1\leq i \leq n$ ), let $l_ i, X_ i, \Delta _ i,$ and $\bZ _ i$ be the left-truncation time, observed time, event indicator (1 for death and 0 for censored), and covariate vector, respectively. If there is no delayed entry, then $l_ i=0$ . Let $t_{(1)} < \cdots <t_{(m)}$ be the m distinct event times, with $d_ j$ deaths at $t_{(j)}$ . The survival process $Y_ i(t)$ for the ith individual is

$\begin{eqnarray*} Y_ i(t) = \left\{ \begin{array}{ll} 1 & l_ i \leq t < X_ i \\ 0 & t \geq X_ i ~ ~ \mr{and}~ ~ \Delta _ i=1 \\ \mr{undefined} & t \geq X_ i ~ ~ \mr{and}~ ~ \Delta _ i=0 \end{array} \right. \end{eqnarray*}$
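The three cases of the survival process can be written out directly. The following Python sketch is illustrative only: the function name `survival_process` and the use of `None` for the undefined case are assumptions made for this example and are not part of PROC PHREG.

```python
from typing import Optional


def survival_process(t: float, l: float, x: float, delta: int) -> Optional[int]:
    """Return Y_i(t) for one individual, or None where it is undefined.

    t: evaluation time, l: left-truncation time,
    x: observed time, delta: event indicator (1 = death, 0 = censored).
    """
    if l <= t < x:
        return 1  # under observation and still alive
    if t >= x:
        # After the observed time the status is known only for deaths.
        return 0 if delta == 1 else None
    # t < l: the definition above gives no value before entry
    # (returning None here is an assumption of this sketch).
    return None
```

For example, an individual who enters at time 0 and is censored at time 2 has a process value of 1 at time 1 and an undefined value at time 3.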

Let $\hat{S}(t)$ be the Kaplan-Meier estimate of the survivor function (assuming no covariates), and let $\hat{S}(t|\bZ )$ be the fitted survivor function with covariates $\bZ$ . If you specify TIES=EFRON, then $\hat{S}(t|\bZ )$ is computed by the Efron method; otherwise, the Breslow estimate is used.
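As a reminder of what the covariate-free estimate looks like, here is a minimal Python sketch of the product-limit (Kaplan-Meier) calculation. It assumes no left truncation (all $l_ i=0$ ) and is not the PROC PHREG implementation; the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t_j, S(t_j))] at each distinct event time.

    times: observed times X_i; events: indicators Delta_i (1 = death).
    Assumes no delayed entry, so everyone is at risk from time 0.
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    surv = 1.0
    curve = []
    for tj in event_times:
        # Risk set: individuals whose observed time is at least t_j.
        at_risk = sum(1 for t in times if t >= tj)
        deaths = sum(1 for t, e in zip(times, events) if t == tj and e == 1)
        surv *= 1.0 - deaths / at_risk
        curve.append((tj, surv))
    return curve
```

With observed times 1, 2, 3 and event indicators 1, 0, 1, the estimate drops to 2/3 at time 1 and to 0 at time 3.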

The predictive accuracy is defined as the difference between the individual survival processes $Y_ i(t)$ and the fitted survivor functions with ( $\hat{S}(t|\bZ _ i)$ ) or without ( $\hat{S}(t)$ ) covariates over the interval from 0 to $\tau$ , the largest observed time. For each death time $t_{(j)}$ , define a mean absolute distance between the $Y_ i(t)$ and the $\hat{S}(t)$ as

$\begin{eqnarray*} \hat{M}(t_{(j)}) & =& \frac{1}{n_ j} \sum _{i=1}^ n I(l_ i \le t_{(j)}) \biggl \{ I(X_ i>t_{(j)} \ge l_ i)(1-\hat{S}(t_{(j)}))+ \Delta _ i I(X_ i \leq t_{(j)})\hat{S}(t_{(j)}) \\ & & + ~ (1-\Delta _ i)I(X_ i \le t_{(j)}) \left[ (1-\hat{S}(t_{(j)}) )\frac{\hat{S}(t_{(j)})}{\hat{S}(X_ i)} +\hat{S}(t_{(j)}) \left(1 - \frac{\hat{S}(t_{(j)})}{\hat{S}(X_ i)} \right) \right] \biggr \} \end{eqnarray*}$

where $n_ j = \sum _{i=1}^ nI(l_ i \le t_{(j)})$ . Let $\hat{M}(t_{(j)}|\bZ )$ be defined similarly to $\hat{M}(t_{(j)})$ , but with $\hat{S}(t_{(j)})$ replaced by $\hat{S}(t_{(j)}|\bZ _ i)$ and $\hat{S}(X_ i)$ replaced by $\hat{S}(X_ i|\bZ _ i)$ . Let $\hat{G}(t)$ be the Kaplan-Meier estimate of the censoring or potential follow-up distribution, and let
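The mean absolute distance at a single death time can be sketched in Python as follows. This is an illustration under stated assumptions, not the PROC PHREG code: the function name `mean_absolute_distance` and the callable `surv(i, s)`, which returns the survivor function of individual `i` at time `s`, are hypothetical. Passing a callable that ignores `i` gives $\hat{M}(t_{(j)})$ ; passing one that depends on the covariates of individual `i` gives $\hat{M}(t_{(j)}|\bZ )$ .

```python
def mean_absolute_distance(t, entry, obs, event, surv):
    """Mean absolute distance at death time t.

    entry, obs, event: sequences of l_i, X_i, Delta_i.
    surv(i, s): survivor function for individual i at time s (assumed > 0
    at censoring times, since it appears in a denominator).
    """
    total = 0.0
    n_j = 0
    for i, (l, x, d) in enumerate(zip(entry, obs, event)):
        if l > t:
            continue  # excluded by the indicator I(l_i <= t)
        n_j += 1
        s_t = surv(i, t)
        if x > t:
            total += 1.0 - s_t  # still alive at t: distance from 1
        elif d == 1:
            total += s_t  # died by t: distance from 0
        else:
            # Censored by t: weight the two possible states by the
            # conditional probability of surviving from X_i to t.
            p_alive = s_t / surv(i, x)
            total += (1.0 - s_t) * p_alive + s_t * (1.0 - p_alive)
    return total / n_j
```

The censored branch is the only place where the survivor function at $X_ i$ is needed; the other two branches use only its value at the death time.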

$w= \sum _{j=1}^ m \frac{d_ j}{\hat{G}(t_{(j)})}$

The overall estimators of the predictive accuracy with ( $\hat{D}_ z$ ) and without ( $\hat{D}$ ) covariates are weighted averages of $\hat{M}(t_{(j)}|\bZ )$ and $\hat{M}(t_{(j)})$ , respectively, given by

$\begin{eqnarray*} \hat{D}_ z & =& \frac{1}{w} \sum _{j=1}^ m \frac{d_ j}{\hat{G}(t_{(j)})}\hat{M}(t_{(j)}|\bZ ) \\ \hat{D} & =& \frac{1}{w} \sum _{j=1}^ m \frac{d_ j}{\hat{G}(t_{(j)})}\hat{M}(t_{(j)}) \end{eqnarray*}$
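Both estimators apply the same weights $d_ j/\hat{G}(t_{(j)})$ to a different sequence of mean absolute distances. A minimal Python sketch of that weighted average follows; the function name `weighted_inaccuracy` and its argument names are assumptions for illustration, and the inputs are taken to be already computed at the distinct death times.

```python
def weighted_inaccuracy(deaths, g_hat, m_hat):
    """Weighted average of mean absolute distances over the death times.

    deaths: d_j, number of deaths at each distinct death time.
    g_hat:  estimated censoring distribution at each death time.
    m_hat:  mean absolute distance at each death time.
    """
    weights = [d / g for d, g in zip(deaths, g_hat)]
    w = sum(weights)  # normalizing constant w
    return sum(wt * m for wt, m in zip(weights, m_hat)) / w
```

For instance, with two death times, `deaths = [1, 2]`, `g_hat = [1.0, 0.5]`, and `m_hat = [0.2, 0.4]`, the weights are 1 and 4, so the second death time dominates the average.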

The variation explained by the Cox regression is

$V = 100 \left(1- \frac{\hat{D}_ z}{\hat{D}}\right) \%$
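The explained variation is the relative reduction in predictive inaccuracy, expressed as a percentage. As a small illustrative sketch (the function name `explained_variation` is hypothetical):

```python
def explained_variation(d_z, d):
    """Percentage reduction in predictive inaccuracy due to the covariates.

    d_z: inaccuracy with covariates; d: inaccuracy without covariates (> 0).
    """
    return 100.0 * (1.0 - d_z / d)
```

If the covariates reduce the inaccuracy from 0.4 to 0.3, the explained variation is about 25%; if they do not reduce it at all, it is 0%.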

Because the predictive accuracy measures $\hat{D}_ z$ and $\hat{D}$ are based on differences between individual survival processes and fitted survivor functions, a smaller value indicates a better prediction. For this reason, $\hat{D}_ z$ and $\hat{D}$ are also referred to as predictive inaccuracy measures.