The PHREG Procedure

The Penalized Partial Likelihood Approach for Fitting Frailty Models

Let $\bgamma =(\gamma _1,\dots ,\gamma _ s)’$ be the vector of random components for the s clusters.

Gamma Frailty Model

Assume each $\mr{e}^{\gamma _ i}$ has an independent and identically distributed gamma distribution with mean 1 and a common unknown variance $\theta $; that is, $\mr{e}^{\gamma _ i}$ is iid $G\left(\frac{1}{\theta },\frac{1}{\theta } \right)$. The penalty function is

\[ - \frac{1}{\theta } \sum _{i=1}^ s \biggl ( \gamma _ i - \mr{e}^{\gamma _ i} \biggr ) \]

plus a function of $\theta $. The penalized partial log likelihood is given by

\[ l_{p}(\bbeta , \bgamma ,\theta ) =l_{\mr{partial}}(\bbeta , \bgamma ) + \frac{1}{\theta } \sum _{i=1}^ s \biggl ( \gamma _ i - \mr{e}^{\gamma _ i} \biggr ) \]

where $l_{\textrm{partial}}(\bbeta , \bgamma )$ is the log of any of the partial likelihoods in the sections Partial Likelihood Function for the Cox Model and The Multiplicative Hazards Model.

The profile marginal log-likelihood of this shared frailty model (Therneau and Grambsch 2000, pp. 257–258) is

\[ l_ m(\theta ) = l_{p}(\hat{\bbeta }(\theta ), \hat{\bgamma }(\theta ),\theta ) + \sum _{i=1}^ s \left\{ \theta ^{-1} - ( \theta ^{-1} + d_ i)\log \left[ \theta ^{-1} + d_ i \right] + \theta ^{-1}\log (\theta ^{-1}) + \log \left[ \frac{\Gamma (\theta ^{-1}+d_ i)}{\Gamma (\theta ^{-1})}\right] \right\} \]

where $d_ i$ is the number of events in the ith cluster.

The maximization of this approximate likelihood is a doubly iterative process that alternates between the following two steps:

  • For a provisional value of $\theta $, the best linear unbiased predictors (BLUP) of $\bbeta $ and $\bgamma $ are computed by maximizing the penalized partial log-likelihood $l_ p(\bbeta ,\bgamma ,\theta )$. The marginal likelihood is evaluated. This constitutes the inner loop.

  • A new value of $\theta $ is obtained by the golden section search based on the marginal likelihood of all the previous iterations. This constitutes the outer loop.

The outer loop is iterated until the bracketing interval of $\theta $ is small.

Lognormal Frailty Model

With each $\gamma _ i$ having a zero-mean normal distribution and a common variance $\theta $, the penalty function is

\[ \frac{1}{2\theta }\bgamma ’\bgamma \]

plus a function of $\theta $. The penalized partial log likelihood is given by

\[ l_{p}(\bbeta , \bgamma ,\theta ) =l_{\mr{partial}}(\bbeta , \bgamma ) - \frac{1}{2\theta }\bgamma ’\bgamma \]

where $l_{\mr{partial}}(\bbeta , \bgamma )$ is the log of any of the partial likelihoods in the sections Partial Likelihood Function for the Cox Model and The Multiplicative Hazards Model.

For a given $\theta $, let $\bH $ be the negative Hessian of the penalized partial log likelihood $l_ p(\bbeta ,\bgamma ,\theta )$; that is,

\[ \bH = \bH (\bbeta ,\bgamma )=\left[ \begin{array}{cc} \mc{\bH }_{11} & \mc{\bH }_{12} \\ \bH _{21} & \bH _{22} \\ \end{array} \right] \]

where $\bH _{11}=-\frac{\partial ^2 l_ p(\bbeta ,\bgamma ,\theta )}{\partial \bbeta ^2}, \bH _{12}= \bH ’_{21}=-\frac{\partial ^2 l_ p(\bbeta ,\bgamma ,\theta )}{\partial \bbeta \partial \bgamma }$, and $\bH _{22}=-\frac{\partial ^2 l_ p(\bbeta ,\bgamma ,\theta )}{\partial \bgamma ^2}$.

The marginal log likelihood of this shared frailty model is

\[ l_ m(\bbeta ,\theta ) = -\frac{1}{2} \log (\theta ^ s) + \log \biggr [\int \mr{e}^{l_ p(\bbeta ,\bgamma ,\theta )}d\bgamma \biggr ] \]

Using a Laplace approximation to the integral as in Breslow and Clayton (1993), an approximate marginal log likelihood (Ripatti and Palmgren 2000; Therneau and Grambsch 2000) is given by

\[ l_ m(\bbeta ,\theta ) \approx -\frac{1}{2} \log (\theta ^ s) -\frac{1}{2} \log (|\bH _{22}(\bbeta ,\tilde{\bgamma },\theta )|) -l_ p(\bbeta ,\tilde{\bgamma },\theta ) \]

The maximization of this approximate likelihood is a doubly iterative process that alternates between the following two steps:

  • For a provisional value of $\theta $, PROC PHREG computes the best linear unbiased predictors (BLUP) of $\bbeta $ and $\bgamma $ by maximizing the penalized partial log likelihood $l_ p(\bbeta ,\bgamma ,\theta )$. This constitutes the inner loop.

  • For $\bbeta $ and $\bgamma $ fixed at the BLUP values, PROC PHREG estimates $\theta $ by maximizing the approximate marginal likelihood $l_ m(\bbeta ,\theta )$. This constitutes the outer loop.

The outer loop is iterated until the difference between two successive estimates of $\theta $ is small.

The ML estimate of $\theta $ is

\[ \hat{\theta }= \frac{\hat{\bgamma }'\hat{\bgamma } + \mr{trace}(\bH _{22}^{-1})}{s} \]

The variance for $\hat{\theta }$ is

\[ \mr{var}(\hat{\theta }) = 2 \hat{\theta } \biggl [s + \frac{1}{\hat{\theta }^2}\mr{trace}(\bH _{22}^{-1}\bH _{22}^{-1}) -\frac{2}{\hat{\theta }}\mr{trace}(\bH _{22}^{-1}) \biggr ]^{-1} \]

The REML estimation of $\theta $ is obtained by replacing $(\bH _{22})^{-1}$ by $(\bH ^{-1})_{22}$.

The inverse of the final $\bH $ matrix is used as the variance estimate of $(\hat{\bbeta },\hat{\bgamma })’$.

The final BLUP estimates of the random components $\gamma _1,\ldots ,\gamma _ s$ can be displayed by using the SOLUTION option in the RANDOM statement. Also displayed are estimates of the lognormal frailties, which are the exponentiated estimates of the BLUP estimates.

Wald-Type Tests for Penalized Models

Let $\bI $ be the negative Hessian of the partial log likelihood $l_{\mr{partial}}(\bbeta ,\bgamma )$:

\[ \bI = \left[ \begin{array}{cc} \bI _{11} & \bI _{12} \\ \bI _{21} & \bI _{22} \\ \end{array} \right] \]

where $\bI _{11}=-\frac{\partial ^2 l_{\mr{partial}}(\bbeta ,\bgamma )}{\partial \bbeta ^2},$ $\bI _{12}= \bI ’_{21}=-\frac{\partial ^2 l_{\mr{partial}}(\bbeta ,\bgamma )}{\partial \bbeta \partial \bgamma }$, and $\bI _{22}=-\frac{\partial ^2 l_{\mr{partial}}(\bbeta ,\bgamma )}{\partial \bgamma ^2}$. Write $\btau ’=(\bbeta ’,\bgamma ’)’$. The Wald-type chi-square statistic for testing $H_0: \bC \btau =\mb{0}$ is

\[ (\bC \hat{\btau })’ (\bC \bH ^{-1} \bC ’)^{-1} (\bC \hat{\btau }) \]

Let $\bH $ be the negative Hessian of the penalized partial log likelihood $l_ p(\bbeta ,\bgamma ,\theta )$ at the ML estimate $\theta $; that is, $\bH = \frac{\partial ^2}{\partial \bbeta \partial \bgamma } l_ p(\bbeta ,\bgamma ,\hat{\theta })$. Let $\bV = \bH ^{-1} \bI \bH ^{-1}$. Gray (1992) recommends the following generalized degrees of freedom for the Wald test:

\[ \mr{DF}=\mr{trace}[(\bC \bH ^{-1}\bC ’)^{-1}\bC \bV \bC ’)] \]

See Therneau and Grambsch (2000, Section 5.8) for a discussion of this Wald-type test.

PROC PHREG uses the label "Adjusted DF" to represent this generalized degrees of freedom in the output.