# The PHREG Procedure

### Survivor Function Estimators

Three estimators of the survivor function are available: the Breslow (1972) estimator, which is based on the empirical cumulative hazard function, the Fleming and Harrington (1984) estimator, which is a tie-breaking modification of the Breslow estimator, and the product-limit estimator (Kalbfleisch and Prentice 1980, pp. 84–86).

Let $t_1 < t_2 < \cdots < t_k$ be the distinct uncensored times of the survival data.

#### Breslow Estimator

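To select this estimator, specify the METHOD=BRESLOW option in the BASELINE statement or OUTPUT statement. For the jth subject, let $(X_j, \Delta_j, \mathbf{Z}_j)$ represent the failure time, the event indicator, and the vector of covariate values, respectively. For $t \ge 0$, let

$$Y_j(t) = I(X_j \ge t), \qquad d(t) = \sum_{j} \Delta_j \, I(X_j = t)$$

Note that $d(t)$ is the number of subjects that have an event at t. Let

$$S^{(0)}(\boldsymbol{\beta}, t) = \sum_{j} Y_j(t)\, e^{\boldsymbol{\beta}'\mathbf{Z}_j}$$

$$S^{(1)}(\boldsymbol{\beta}, t) = \sum_{j} Y_j(t)\, e^{\boldsymbol{\beta}'\mathbf{Z}_j}\, \mathbf{Z}_j$$

$$\bar{\mathbf{Z}}(\boldsymbol{\beta}, t) = \frac{S^{(1)}(\boldsymbol{\beta}, t)}{S^{(0)}(\boldsymbol{\beta}, t)}$$

For a given realization of the explanatory variables $\boldsymbol{\xi}$, the cumulative hazard function estimator at $\boldsymbol{\xi}$ is

$$\hat{\Lambda}_B(t, \boldsymbol{\xi}) = e^{\hat{\boldsymbol{\beta}}'\boldsymbol{\xi}} \sum_{t_i \le t} \frac{d(t_i)}{S^{(0)}(\hat{\boldsymbol{\beta}}, t_i)}$$

where the sum is over the distinct uncensored times $t_i \le t$, with variance estimated by

$$\hat{\sigma}^2\big(\hat{\Lambda}_B(t, \boldsymbol{\xi})\big) = e^{2\hat{\boldsymbol{\beta}}'\boldsymbol{\xi}} \sum_{t_i \le t} \frac{d(t_i)}{\big[S^{(0)}(\hat{\boldsymbol{\beta}}, t_i)\big]^2} + \mathbf{H}'(t, \boldsymbol{\xi})\, \hat{\mathbf{V}}(\hat{\boldsymbol{\beta}})\, \mathbf{H}(t, \boldsymbol{\xi})$$

where $\hat{\mathbf{V}}(\hat{\boldsymbol{\beta}})$ is the estimated covariance matrix of $\hat{\boldsymbol{\beta}}$ and

$$\mathbf{H}(t, \boldsymbol{\xi}) = e^{\hat{\boldsymbol{\beta}}'\boldsymbol{\xi}} \sum_{t_i \le t} \frac{d(t_i)\,\big[\boldsymbol{\xi} - \bar{\mathbf{Z}}(\hat{\boldsymbol{\beta}}, t_i)\big]}{S^{(0)}(\hat{\boldsymbol{\beta}}, t_i)}$$

For the marginal model, the variance estimator computation follows Spiekerman and Lin (1998).

The Breslow estimate of the survivor function for $\boldsymbol{\xi}$ is

$$\hat{S}_B(t, \boldsymbol{\xi}) = \exp\big(-\hat{\Lambda}_B(t, \boldsymbol{\xi})\big)$$

By the delta method, the standard error of $\hat{S}_B(t, \boldsymbol{\xi})$ is approximated by

$$\hat{\sigma}\big(\hat{S}_B(t, \boldsymbol{\xi})\big) = \hat{S}_B(t, \boldsymbol{\xi})\, \hat{\sigma}\big(\hat{\Lambda}_B(t, \boldsymbol{\xi})\big)$$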
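The point estimate described in this section can be sketched numerically. The following is a minimal illustration in Python, not the PROC PHREG implementation: it assumes time-fixed covariates and a coefficient vector that has already been estimated, and it omits the variance computation. The function name `breslow_survival` and its argument layout are hypothetical.

```python
import math


def breslow_survival(times, events, covariates, beta, xi, t):
    """Breslow cumulative hazard and survivor estimates at time t for covariates xi.

    Assumptions (not taken from PROC PHREG): covariates are time-fixed,
    beta is already estimated, and events[j] is 1 for an event, 0 for censoring.
    """
    # Risk score exp(beta'Z_j) for every subject.
    risk = [math.exp(sum(b * z for b, z in zip(beta, zj))) for zj in covariates]
    # Distinct uncensored times that do not exceed t.
    event_times = sorted({x for x, d in zip(times, events) if d == 1 and x <= t})

    base_cumhaz = 0.0
    for ti in event_times:
        # Number of events at ti.
        d_ti = sum(1 for x, d in zip(times, events) if d == 1 and x == ti)
        # Sum of risk scores over subjects still at risk at ti.
        s0 = sum(r for x, r in zip(times, risk) if x >= ti)
        base_cumhaz += d_ti / s0

    # Scale the baseline cumulative hazard by exp(beta'xi), then exponentiate.
    cumhaz = math.exp(sum(b * v for b, v in zip(beta, xi))) * base_cumhaz
    return cumhaz, math.exp(-cumhaz)
```

With all coefficients equal to zero, the cumulative hazard returned by this sketch reduces to the Nelson-Aalen estimate, which is a convenient sanity check.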

#### Fleming-Harrington Estimator

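To select this estimator, specify the METHOD=FH option in the BASELINE statement or OUTPUT statement. With $Y_j(t) = I(X_j \ge t)$ and $d(t)$, the number of events at t, as defined in the section Breslow Estimator and for $k = 1, \ldots, d(t)$, let

$$S^{(0)}(\boldsymbol{\beta}, k, t) = \sum_{j} Y_j(t) \left[ 1 - \frac{k-1}{d(t)}\, \Delta_j I(X_j = t) \right] e^{\boldsymbol{\beta}'\mathbf{Z}_j}$$

$$S^{(1)}(\boldsymbol{\beta}, k, t) = \sum_{j} Y_j(t) \left[ 1 - \frac{k-1}{d(t)}\, \Delta_j I(X_j = t) \right] e^{\boldsymbol{\beta}'\mathbf{Z}_j}\, \mathbf{Z}_j$$

$$\bar{\mathbf{Z}}(\boldsymbol{\beta}, k, t) = \frac{S^{(1)}(\boldsymbol{\beta}, k, t)}{S^{(0)}(\boldsymbol{\beta}, k, t)}$$

For a given realization of the explanatory variables $\boldsymbol{\xi}$, the Fleming-Harrington adjustment of the cumulative hazard function is

$$\hat{\Lambda}_F(t, \boldsymbol{\xi}) = e^{\hat{\boldsymbol{\beta}}'\boldsymbol{\xi}} \sum_{t_i \le t} \sum_{k=1}^{d(t_i)} \frac{1}{S^{(0)}(\hat{\boldsymbol{\beta}}, k, t_i)}$$

with variance estimated by

$$\hat{\sigma}^2\big(\hat{\Lambda}_F(t, \boldsymbol{\xi})\big) = e^{2\hat{\boldsymbol{\beta}}'\boldsymbol{\xi}} \sum_{t_i \le t} \sum_{k=1}^{d(t_i)} \frac{1}{\big[S^{(0)}(\hat{\boldsymbol{\beta}}, k, t_i)\big]^2} + \mathbf{H}'(t, \boldsymbol{\xi})\, \hat{\mathbf{V}}(\hat{\boldsymbol{\beta}})\, \mathbf{H}(t, \boldsymbol{\xi})$$

where $\hat{\mathbf{V}}(\hat{\boldsymbol{\beta}})$ is the estimated covariance matrix of $\hat{\boldsymbol{\beta}}$ and

$$\mathbf{H}(t, \boldsymbol{\xi}) = e^{\hat{\boldsymbol{\beta}}'\boldsymbol{\xi}} \sum_{t_i \le t} \sum_{k=1}^{d(t_i)} \frac{\boldsymbol{\xi} - \bar{\mathbf{Z}}(\hat{\boldsymbol{\beta}}, k, t_i)}{S^{(0)}(\hat{\boldsymbol{\beta}}, k, t_i)}$$

The Fleming-Harrington estimate of the survivor function for $\boldsymbol{\xi}$ is

$$\hat{S}_F(t, \boldsymbol{\xi}) = \exp\big(-\hat{\Lambda}_F(t, \boldsymbol{\xi})\big)$$

By the delta method, the standard error of $\hat{S}_F(t, \boldsymbol{\xi})$ is approximated by

$$\hat{\sigma}\big(\hat{S}_F(t, \boldsymbol{\xi})\big) = \hat{S}_F(t, \boldsymbol{\xi})\, \hat{\sigma}\big(\hat{\Lambda}_F(t, \boldsymbol{\xi})\big)$$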
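The tie-breaking idea can be illustrated as follows. This is a rough Python sketch under stated assumptions (time-fixed covariates, coefficients already estimated, no variance), not the procedure's own code; the name `fleming_harrington_cumhaz` is hypothetical. At an event time with d tied events, the risk-set sum is reduced in d steps, each step removing a further 1/d share of the tied subjects' risk scores.

```python
import math


def fleming_harrington_cumhaz(times, events, covariates, beta, xi, t):
    """Fleming-Harrington (tie-adjusted) cumulative hazard at time t for covariates xi.

    Assumptions (not taken from PROC PHREG): time-fixed covariates,
    beta already estimated, events[j] is 1 for an event and 0 for censoring.
    """
    risk = [math.exp(sum(b * z for b, z in zip(beta, zj))) for zj in covariates]
    event_times = sorted({x for x, d in zip(times, events) if d == 1 and x <= t})

    base_cumhaz = 0.0
    for ti in event_times:
        # Risk-score sum over everyone at risk at ti.
        at_risk = sum(r for x, r in zip(times, risk) if x >= ti)
        # Risk-score sum and count for the subjects with an event at ti.
        tied = sum(r for x, d, r in zip(times, events, risk) if d == 1 and x == ti)
        d_ti = sum(1 for x, d in zip(times, events) if d == 1 and x == ti)
        # The k-th tied event sees the risk set with (k-1)/d of the tied mass removed.
        for k in range(1, d_ti + 1):
            base_cumhaz += 1.0 / (at_risk - (k - 1) / d_ti * tied)

    return math.exp(sum(b * v for b, v in zip(beta, xi))) * base_cumhaz
```

When no event times are tied, every inner loop runs once and the result coincides with the Breslow cumulative hazard.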

#### Product-Limit Estimator

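To select this estimator, specify the METHOD=PL option in the BASELINE statement or OUTPUT statement. Let $D_i$ denote the set of individuals that fail at $t_i$. Let $C_i$ denote the set of individuals that are censored in the half-open interval $[t_i, t_{i+1})$, where $t_0 = 0$ and $t_{k+1} = \infty$. Let $\gamma_l$ denote the censoring times in $[t_i, t_{i+1})$, where l ranges over $C_i$.

The likelihood function for all individuals is given by

$$L = \prod_{i=0}^{k} \left\{ \prod_{l \in D_i} \left( \big[S_0(t_i)\big]^{\exp(\mathbf{Z}_l'\boldsymbol{\beta})} - \big[S_0(t_i + 0)\big]^{\exp(\mathbf{Z}_l'\boldsymbol{\beta})} \right) \prod_{l \in C_i} \big[S_0(\gamma_l + 0)\big]^{\exp(\mathbf{Z}_l'\boldsymbol{\beta})} \right\}$$

where $D_0$ is empty. The likelihood $L$ is maximized by taking $S_0(t) = S_0(t_i + 0)$ for $t_i < t \le t_{i+1}$ and allowing the probability mass to fall only on the observed event times $t_1$, $\ldots$, $t_k$. By considering a discrete model with hazard contribution $1 - \alpha_i$ at $t_i$, you take $S_0(t_i) = S_0(t_{i-1} + 0) = \prod_{j=0}^{i-1} \alpha_j$, where $\alpha_0 = 1$. Substitution into the likelihood function produces

$$L = \prod_{i=1}^{k} \left\{ \prod_{j \in D_i} \left( 1 - \alpha_i^{\exp(\mathbf{Z}_j'\boldsymbol{\beta})} \right) \prod_{l \in R_i - D_i} \alpha_i^{\exp(\mathbf{Z}_l'\boldsymbol{\beta})} \right\}$$

where $R_i$ denotes the set of individuals at risk just before $t_i$. If you replace $\boldsymbol{\beta}$ with $\hat{\boldsymbol{\beta}}$ estimated from the partial likelihood function and then maximize with respect to $\alpha_1$, $\ldots$, $\alpha_k$, the maximum likelihood estimate $\hat{\alpha}_i$ of $\alpha_i$ becomes a solution of

$$\sum_{j \in D_i} \frac{\exp(\mathbf{Z}_j'\hat{\boldsymbol{\beta}})}{1 - \hat{\alpha}_i^{\exp(\mathbf{Z}_j'\hat{\boldsymbol{\beta}})}} = \sum_{l \in R_i} \exp(\mathbf{Z}_l'\hat{\boldsymbol{\beta}})$$

When only a single failure occurs at $t_i$, $\hat{\alpha}_i$ can be found explicitly. Otherwise, an iterative solution is obtained by the Newton method.

The baseline survival function is estimated by

$$\hat{S}_0(t) = \hat{S}_0(t_{i-1} + 0) = \prod_{j=0}^{i-1} \hat{\alpha}_j, \qquad t_{i-1} < t \le t_i$$

For a given realization of the explanatory variables $\boldsymbol{\xi}$, the product-limit estimate of the survival function at $\boldsymbol{\xi}$ is

$$\hat{S}_P(t, \boldsymbol{\xi}) = \big[\hat{S}_0(t)\big]^{\exp(\hat{\boldsymbol{\beta}}'\boldsymbol{\xi})}$$

Approximating the variance of $-\log \hat{S}_P(t, \boldsymbol{\xi})$ by the variance estimate $\hat{\sigma}^2\big(\hat{\Lambda}_B(t, \boldsymbol{\xi})\big)$ of the Breslow estimator of the cumulative hazard function, the variance of the product-limit estimator at $\boldsymbol{\xi}$ is given by

$$\hat{\sigma}^2\big(\hat{S}_P(t, \boldsymbol{\xi})\big) = \big[\hat{S}_P(t, \boldsymbol{\xi})\big]^2\, \hat{\sigma}^2\big(\hat{\Lambda}_B(t, \boldsymbol{\xi})\big)$$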
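The per-event-time estimating equation for the product-limit estimator can be solved as sketched below. This Python fragment is an illustration under assumptions, not the procedure's algorithm: the function names are hypothetical, the inputs are the risk scores exp(Z'β) of the subjects failing at one event time and the total risk score of that time's risk set, and the Newton iteration is safeguarded by a bracketing step (a choice made here for robustness). The starting value is also an assumption of this sketch.

```python
import math


def solve_alpha(tied_scores, risk_total, tol=1e-12, max_iter=100):
    """Solve sum_j w_j / (1 - a**w_j) = risk_total for a in (0, 1).

    tied_scores: risk scores w_j of the subjects that fail at this event time.
    risk_total:  sum of risk scores over the whole risk set at this time.
    """
    if len(tied_scores) == 1:
        # Single failure: the equation has a closed-form solution.
        w = tied_scores[0]
        return (1.0 - w / risk_total) ** (1.0 / w)

    def f(a):
        return sum(w / (1.0 - a ** w) for w in tied_scores) - risk_total

    def fprime(a):
        return sum(w * w * a ** (w - 1.0) / (1.0 - a ** w) ** 2 for w in tied_scores)

    lo, hi = 0.0, 1.0  # f is increasing on (0, 1), so the root is bracketed
    a = math.exp(-len(tied_scores) / risk_total)  # assumed starting value
    for _ in range(max_iter):
        fa = f(a)
        if fa == 0.0:
            return a
        if fa > 0.0:
            hi = a
        else:
            lo = a
        step = a - fa / fprime(a)  # Newton update
        if not (lo < step < hi):
            step = 0.5 * (lo + hi)  # fall back to bisection if Newton overshoots
        if abs(step - a) < tol:
            return step
        a = step
    return a


def baseline_survival_steps(alphas):
    """Running products of the alpha estimates: baseline survival after each event time."""
    steps, current = [], 1.0
    for a in alphas:
        current *= a
        steps.append(current)
    return steps
```

Raising a baseline value from `baseline_survival_steps` to the power exp(β'ξ) gives the survival estimate for a covariate pattern ξ.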

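#### Direct Adjusted Survival Curves

Consider the Breslow estimator of the survival function. For $j = 1, \ldots, n$, let $\mathbf{Z}_j$ represent the covariate set of the jth patient. The direct adjusted survival curve averages the estimated survival curves for each patient:

$$\bar{S}(t) = \frac{1}{n} \sum_{j=1}^{n} \hat{S}_B(t, \mathbf{Z}_j)$$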

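The variance of the direct adjusted curve $\bar{S}(t)$ can be estimated by

$$\hat{V}\big(\bar{S}(t)\big) = \left[ \frac{1}{n} \sum_{j=1}^{n} e^{\hat{\boldsymbol{\beta}}'\mathbf{Z}_j}\, \hat{S}_B(t, \mathbf{Z}_j) \right]^2 \sum_{t_i \le t} \frac{d(t_i)}{\big[S^{(0)}(\hat{\boldsymbol{\beta}}, t_i)\big]^2} + \mathbf{U}'(t)\, \hat{\mathbf{V}}(\hat{\boldsymbol{\beta}})\, \mathbf{U}(t)$$

where

$$\mathbf{U}(t) = \frac{1}{n} \sum_{j=1}^{n} \hat{S}_B(t, \mathbf{Z}_j)\, \mathbf{H}(t, \mathbf{Z}_j)$$

Here $\hat{S}_B(t, \mathbf{Z}_j)$ is the Breslow survivor estimate for the covariates of the jth patient, and $d(t_i)$, $S^{(0)}$, $\mathbf{H}$, and $\hat{\mathbf{V}}(\hat{\boldsymbol{\beta}})$ are as defined in the section Breslow Estimator.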
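The averaging step that defines the direct adjusted curve can be sketched in a few lines. This Python fragment is illustrative only: it assumes that a baseline cumulative hazard value at the time of interest and the coefficient estimates are already available, and the function name `direct_adjusted_survival` is hypothetical. It computes the point estimate and no variance.

```python
import math


def direct_adjusted_survival(base_cumhaz, beta, covariates):
    """Average of the per-patient survival estimates exp(-Lambda0(t) * exp(beta'Z_j)).

    base_cumhaz: baseline cumulative hazard estimate at the time of interest
                 (assumed to be computed elsewhere, for example by a Breslow-type sum).
    covariates:  one covariate vector per patient in the sample.
    """
    curves = [
        math.exp(-base_cumhaz * math.exp(sum(b * z for b, z in zip(beta, zj))))
        for zj in covariates
    ]
    return sum(curves) / len(curves)
```

Because the survival function is nonlinear in the covariates, this average generally differs from the survival estimate evaluated at the average covariate vector.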

##### Comparison of Direct Adjusted Probabilities of Two Strata

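For a stratified Cox model, let k index the strata. For the jth patient, let $\hat{S}_k(t, \mathbf{Z}_j)$ and $\mathbf{H}_k(t, \mathbf{Z}_j)$ be the estimated survival function and the vector $\mathbf{H}$ for the kth stratum. The direct adjusted survival curve for the kth stratum is

$$\bar{S}_k(t) = \frac{1}{n} \sum_{j=1}^{n} \hat{S}_k(t, \mathbf{Z}_j)$$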

The variance of can be estimated by

where

##### Comparison of Direct Adjusted Survival Probabilities of Two Treatments

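For $j = 1, \ldots, n$, let $\mathbf{Z}_{jk}$ represent the covariate set of the jth patient with the kth treatment, $k = 1, 2$. The direct adjusted survival curve for the kth treatment is

$$\bar{S}_k(t) = \frac{1}{n} \sum_{j=1}^{n} \hat{S}(t, \mathbf{Z}_{jk})$$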

The variance of can be estimated by

where

#### Confidence Intervals for the Survivor Function

When the computation of confidence limits for the survivor function $S(t, \boldsymbol{\xi})$ is based on the asymptotic normality of the survival estimator $\hat{S}(t, \boldsymbol{\xi})$ (which can be the Breslow estimator $\hat{S}_B(t, \boldsymbol{\xi})$, the Fleming-Harrington estimator $\hat{S}_F(t, \boldsymbol{\xi})$, or the product-limit estimator $\hat{S}_P(t, \boldsymbol{\xi})$), the approximate confidence interval might include impossible values outside the range [0,1] at extreme values of t. This problem can be avoided by applying the asymptotic normality to a transformation of $S(t, \boldsymbol{\xi})$ for which the range is unrestricted. In addition, certain transformed confidence intervals for $S(t, \boldsymbol{\xi})$ perform better than the usual linear confidence intervals (Borgan and Liestøl 1990). The CLTYPE= option in the BASELINE statement enables you to choose one of the following transformations: the log-log function, the log function, and the linear function.

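Let g be the transformation that is being applied to the survivor function $S(t, \boldsymbol{\xi})$. By the delta method, the standard error of $g(\hat{S}(t, \boldsymbol{\xi}))$ is estimated by

$$\hat{\tau}(t) = \hat{\sigma}\big[g(\hat{S}(t, \boldsymbol{\xi}))\big] = \big|g'(\hat{S}(t, \boldsymbol{\xi}))\big|\; \hat{\sigma}\big[\hat{S}(t, \boldsymbol{\xi})\big]$$

where $g'$ is the first derivative of the function g. The $100(1-\alpha)\%$ confidence interval for $S(t, \boldsymbol{\xi})$ is given by

$$g^{-1}\Big\{ g\big[\hat{S}(t, \boldsymbol{\xi})\big] \pm z_{\alpha/2}\, \hat{\tau}(t) \Big\}$$

where $g^{-1}$ is the inverse function of g and $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution. The choices for the transformation g are as follows:

• CLTYPE=NORMAL specifies the linear transformation, in which g is the identity, so that no transformation is applied. The $100(1-\alpha)\%$ confidence interval for $S(t, \boldsymbol{\xi})$ is given by

$$\hat{S}(t, \boldsymbol{\xi}) \pm z_{\alpha/2}\, \hat{\sigma}\big[\hat{S}(t, \boldsymbol{\xi})\big]$$

• CLTYPE=LOG specifies the log transformation. The estimated variance of $\log(\hat{S}(t, \boldsymbol{\xi}))$ is $\hat{\tau}^2(t) = \hat{\sigma}^2\big[\hat{S}(t, \boldsymbol{\xi})\big] / \hat{S}^2(t, \boldsymbol{\xi})$. The $100(1-\alpha)\%$ confidence interval for $S(t, \boldsymbol{\xi})$ is given by

$$\hat{S}(t, \boldsymbol{\xi}) \exp\big( \pm z_{\alpha/2}\, \hat{\tau}(t) \big)$$

• CLTYPE=LOGLOG specifies the log-log transformation. The estimated variance of $\log(-\log(\hat{S}(t, \boldsymbol{\xi})))$ is $\hat{\tau}^2(t) = \hat{\sigma}^2\big[\hat{S}(t, \boldsymbol{\xi})\big] / \big[\hat{S}(t, \boldsymbol{\xi}) \log(\hat{S}(t, \boldsymbol{\xi}))\big]^2$. The $100(1-\alpha)\%$ confidence interval for $S(t, \boldsymbol{\xi})$ is given by

$$\big[\hat{S}(t, \boldsymbol{\xi})\big]^{\exp\left( \pm z_{\alpha/2}\, \hat{\tau}(t) \right)}$$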
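The effect of the three transformations can be compared numerically. The sketch below is an illustration in Python, not PROC PHREG output: it takes a survival estimate and its standard error as given, and the function name `survivor_confidence_limits` and the lowercase `cltype` strings are hypothetical stand-ins for the CLTYPE= values. The log-log branch assumes the estimate lies strictly between 0 and 1.

```python
import math
from statistics import NormalDist


def survivor_confidence_limits(surv, se, cltype="loglog", alpha=0.05):
    """Return (lower, upper) limits for a survival estimate under a given transformation.

    surv: survival estimate at some time point.
    se:   its estimated standard error (assumed to be supplied by the caller).
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 normal quantile
    if cltype == "normal":
        # Identity transformation: limits can fall outside [0, 1].
        return surv - z * se, surv + z * se
    if cltype == "log":
        # Delta-method standard error of log(surv).
        tau = se / surv
        return surv * math.exp(-z * tau), surv * math.exp(z * tau)
    if cltype == "loglog":
        # Delta-method standard error of log(-log(surv)); requires 0 < surv < 1.
        tau = se / abs(surv * math.log(surv))
        # Because surv < 1, the larger exponent yields the lower limit.
        return surv ** math.exp(z * tau), surv ** math.exp(-z * tau)
    raise ValueError("cltype must be 'normal', 'log', or 'loglog'")
```

For an estimate close to 1 with a sizable standard error, the linear limits can exceed 1, whereas the log-log limits always remain inside the unit interval.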