PROC LIFETEST: Kernel-Smoothed Hazard Estimate :: SAS/STAT(R) 9.2 User's Guide, Second Edition

The LIFETEST Procedure

Kernel-Smoothed Hazard Estimate

Kernel-smoothed estimators of the hazard function $\text{[math]}$ are based on the Nelson-Aalen estimator $\text{[math]}$ and its variance $\text{[math]}$ . Consider the jumps of $\text{[math]}$ and $\text{[math]}$ at the event times $\text{[math]}$ as follows:

	$\text{[math]}$	$\text{[math]}$	$\text{[math]}$
	$\text{[math]}$	$\text{[math]}$	$\text{[math]}$

where $\text{[math]}$ =0.
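As a rough illustration of these jumps, the sketch below assumes the usual Nelson-Aalen forms: a jump of d/Y in the cumulative hazard and d/Y² in its variance, where d is the number of events at an event time and Y is the number of subjects at risk just before it. The function name `nelson_aalen_jumps` and the toy data are hypothetical and are not part of PROC LIFETEST.

```python
from collections import Counter

def nelson_aalen_jumps(times, events):
    """Return (event_times, dH, dV) for right-censored data.

    times  : observed times (event or censoring)
    events : 1 if the time is an event, 0 if it is censored
    Assumes jumps d/Y for the cumulative hazard and d/Y**2 for its variance.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    event_times = sorted(deaths)
    dH, dV = [], []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)  # Y: still under observation at t
        d = deaths[t]                              # d: events exactly at t
        dH.append(d / at_risk)
        dV.append(d / at_risk ** 2)
    return event_times, dH, dV

# Toy data: six subjects, two of them censored (at times 3 and 7).
times = [2, 3, 3, 5, 7, 8]
events = [1, 1, 0, 1, 0, 1]
event_times, dH, dV = nelson_aalen_jumps(times, events)
```

The running sums of `dH` and `dV` would then give the Nelson-Aalen estimate and its variance at each event time.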

The kernel-smoothed estimator of $\text{[math]}$ is a weighted average of $\text{[math]}$ over event times that are within a bandwidth distance $\text{[math]}$ of $\text{[math]}$ . The weights are controlled by the choice of kernel function, $\text{[math]}$ , defined on the interval [–1,1]. The choices of $\text{[math]}$ are as follows:

uniform kernel

$\text{[math]}$

Epanechnikov kernel

$\text{[math]}$

biweight kernel

$\text{[math]}$
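For orientation, the three kernels can be sketched in code using their standard textbook forms on [–1,1]: 1/2 for the uniform kernel, (3/4)(1 – x²) for the Epanechnikov kernel, and (15/16)(1 – x²)² for the biweight kernel. That these are exactly the forms used by the procedure is an assumption here, and the helper `integrate` is a hypothetical name used only to check that each kernel integrates to 1.

```python
def uniform(x):
    """Uniform kernel: constant weight 1/2 on [-1, 1]."""
    return 0.5 if -1.0 <= x <= 1.0 else 0.0

def epanechnikov(x):
    """Epanechnikov kernel: (3/4)(1 - x^2) on [-1, 1]."""
    return 0.75 * (1.0 - x * x) if -1.0 <= x <= 1.0 else 0.0

def biweight(x):
    """Biweight kernel: (15/16)(1 - x^2)^2 on [-1, 1]."""
    return 0.9375 * (1.0 - x * x) ** 2 if -1.0 <= x <= 1.0 else 0.0

def integrate(f, lo=-1.0, hi=1.0, n=2000):
    """Midpoint-rule integral, used here only as a normalization check."""
    step = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * step) for k in range(n)) * step

# Each area should be close to 1 if the kernel is a proper weight function.
areas = {k.__name__: integrate(k) for k in (uniform, epanechnikov, biweight)}
```

The uniform kernel weights all event times in the window equally, while the other two give more weight to event times close to the point of estimation.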

The kernel-smoothed hazard rate estimator is defined for all time points on $\text{[math]}$. For time points $\text{[math]}$ for which $\text{[math]}$, the kernel-smoothed estimate of $\text{[math]}$ based on the kernel $\text{[math]}$ is given by

$\text{[math]}$

The variance of $\text{[math]}$ is estimated by

$\text{[math]}$
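A minimal sketch of the interior-point computation follows. It assumes the usual form of the estimator: the hazard estimate is 1/b times the kernel-weighted sum of the cumulative hazard jumps, and the variance estimate is 1/b² times the sum of squared kernel weights times the variance jumps. The function name `smoothed_hazard`, the Epanechnikov default, and the toy jumps are illustrative assumptions.

```python
import math

def epanechnikov(x):
    """Epanechnikov kernel on [-1, 1] (standard textbook form)."""
    return 0.75 * (1.0 - x * x) if -1.0 <= x <= 1.0 else 0.0

def smoothed_hazard(t, b, event_times, dH, dV, kernel=epanechnikov):
    """Return (hazard estimate, standard error) at an interior time t.

    Only event times within distance b of t receive nonzero weight.
    No boundary correction is applied, so t should be at least b away
    from both ends of the time range.
    """
    h = sum(kernel((t - ti) / b) * dh for ti, dh in zip(event_times, dH)) / b
    var = sum(kernel((t - ti) / b) ** 2 * dv
              for ti, dv in zip(event_times, dV)) / b ** 2
    return h, math.sqrt(var)

# Toy jumps of the cumulative hazard and of its variance at four event times.
event_times = [2.0, 3.0, 5.0, 8.0]
dH = [1 / 6, 1 / 5, 1 / 3, 1.0]
dV = [1 / 36, 1 / 25, 1 / 9, 1.0]

h, se = smoothed_hazard(4.0, 2.0, event_times, dH, dV)
```

With bandwidth 2 and t = 4, only the event times 3 and 5 contribute: the event time 2 sits exactly on the edge of the window, where the Epanechnikov weight is zero.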

For $\text{[math]}$ , the symmetric kernels $\text{[math]}$ are replaced by the corresponding asymmetric kernels of Gasser and Müller (1979). Let $\text{[math]}$ . The modified kernels are as follows:

uniform kernel

$\text{[math]}$

Epanechnikov kernel

$\text{[math]}$

biweight kernel

$\text{[math]}$

For $\text{[math]}$ , let $\text{[math]}$ . The asymmetric kernels for $\text{[math]}$ are used with $\text{[math]}$ replaced by $\text{[math]}$ .
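One way to see where such boundary kernels come from is sketched below. It assumes that each modified kernel has the form K(x)(α + βx) on [–1, q], with α and β chosen so that the kernel integrates to 1 and has zero first moment over that interval. Under this assumption the coefficients follow from a 2×2 linear system instead of being typed in as closed-form expressions. The names `KERNELS`, `moment`, `boundary_coefficients`, and `boundary_kernel` are hypothetical.

```python
# Symmetric kernels stored as polynomial coefficients c[k] of x**k on [-1, 1].
KERNELS = {
    "uniform": [0.5],
    "epanechnikov": [0.75, 0.0, -0.75],
    "biweight": [15 / 16, 0.0, -30 / 16, 0.0, 15 / 16],
}

def moment(coeffs, j, q):
    """Exact integral of x**j * K(x) over [-1, q] for a polynomial kernel."""
    return sum(c * (q ** (k + j + 1) - (-1.0) ** (k + j + 1)) / (k + j + 1)
               for k, c in enumerate(coeffs))

def boundary_coefficients(name, q):
    """Solve for (alpha, beta) such that K(x) * (alpha + beta * x)
    integrates to 1 and has zero first moment over [-1, q]."""
    m0, m1, m2 = (moment(KERNELS[name], j, q) for j in range(3))
    det = m0 * m2 - m1 * m1
    return m2 / det, -m1 / det

def boundary_kernel(name, x, q, right_edge=False):
    """Evaluate the modified kernel at x.

    Near the right end of the time range the same kernel is used with
    x replaced by -x (assumed convention).
    """
    if right_edge:
        x = -x
    if not -1.0 <= x <= q:
        return 0.0
    alpha, beta = boundary_coefficients(name, q)
    k = sum(c * x ** i for i, c in enumerate(KERNELS[name]))
    return k * (alpha + beta * x)

alpha_e, beta_e = boundary_coefficients("epanechnikov", 0.0)
alpha_full, beta_full = boundary_coefficients("epanechnikov", 1.0)
```

When q = 1 the coefficients reduce to α = 1 and β = 0, so the modified kernel coincides with the symmetric one, which is the behavior expected away from the boundary.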

Using the log transform on the smoothed hazard rate, the 100(1– $\text{[math]}$ )% pointwise confidence interval for the smoothed hazard rate $\text{[math]}$ is given by

$\text{[math]}$

where $\text{[math]}$ is the 100(1– $\text{[math]}$ )th percentile of the standard normal distribution.
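The log-transformed interval can be sketched as follows, assuming the common form in which the estimate is multiplied and divided by exp(z · se / estimate), with z taken as the upper α/2 standard normal quantile for a two-sided interval. The function name `log_transformed_ci` and the input values are illustrative only.

```python
import math
from statistics import NormalDist

def log_transformed_ci(h, se, alpha=0.05):
    """Pointwise confidence limits for a smoothed hazard estimate h > 0.

    Builds a symmetric interval for log(h) using the delta-method
    standard error se / h, then transforms back to the hazard scale.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # assumed two-sided quantile
    factor = math.exp(z * se / h)
    return h / factor, h * factor

lower, upper = log_transformed_ci(0.15, 0.1093)
```

Because the limits are formed on the log scale, the lower limit stays positive, and the estimate is the geometric mean of the two limits.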

Optimal Bandwidth

The following mean integrated squared error (MISE) over the range between $\text{[math]}$ and $\text{[math]}$ is used as a measure of the global performance of the kernel function estimator:

	$\text{[math]}$	$\text{[math]}$	$\text{[math]}$
	$\text{[math]}$	$\text{[math]}$	$\text{[math]}$

The last term is independent of the choice of the kernel and bandwidth and can be ignored when you are looking for the best value of $\text{[math]}$. The first integral can be approximated with the trapezoid rule by evaluating $\text{[math]}$ at a grid of points $\text{[math]}$. You can specify $\text{[math]}$, and $\text{[math]}$ by using the options MISEMIN=, MISEMAX=, and MISENUM=, respectively, of the HAZARD plot. The second integral can be estimated by the Ramlau-Hansen (1983a, 1983b) cross-validation estimate:

$\text{[math]}$

Therefore, for a fixed kernel, the optimal bandwidth is the quantity $\text{[math]}$ that minimizes

$\text{[math]}$
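The criterion to be minimized can be sketched as below, under two simplifying assumptions: the cross-validation term is taken to be –2/b times the sum, over pairs of distinct event times, of the kernel evaluated at their scaled difference times the product of the two cumulative hazard jumps; and the symmetric Epanechnikov kernel is used everywhere, with no boundary correction. The names `hazard` and `mise_criterion` and the toy jumps are hypothetical.

```python
def epanechnikov(x):
    """Epanechnikov kernel on [-1, 1] (standard textbook form)."""
    return 0.75 * (1.0 - x * x) if -1.0 <= x <= 1.0 else 0.0

def hazard(t, b, event_times, dH):
    """Kernel-smoothed hazard at t, without boundary correction."""
    return sum(epanechnikov((t - ti) / b) * dh
               for ti, dh in zip(event_times, dH)) / b

def mise_criterion(b, event_times, dH, lo, hi, n_grid=101):
    """Trapezoid-rule integral of the squared hazard estimate over [lo, hi]
    plus the (assumed) cross-validation estimate of the cross term."""
    step = (hi - lo) / (n_grid - 1)
    values = [hazard(lo + k * step, b, event_times, dH) ** 2
              for k in range(n_grid)]
    integral = step * (sum(values) - 0.5 * (values[0] + values[-1]))
    pairs = list(zip(event_times, dH))
    cross = sum(epanechnikov((ti - tj) / b) * dhi * dhj
                for i, (ti, dhi) in enumerate(pairs)
                for j, (tj, dhj) in enumerate(pairs) if i != j)
    return integral - 2.0 * cross / b

event_times = [2.0, 3.0, 5.0, 8.0]
dH = [1 / 6, 1 / 5, 1 / 3, 1.0]

g_narrow = mise_criterion(0.5, event_times, dH, lo=2.0, hi=8.0)
g_wide = mise_criterion(2.0, event_times, dH, lo=2.0, hi=8.0)
```

With a bandwidth of 0.5 no two event times in the toy data fall within one bandwidth of each other, so the cross term vanishes and only the integral of the squared estimate remains.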

The minimization is carried out by the golden section search algorithm.
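A generic golden section search looks roughly like the sketch below. It is a textbook version of the algorithm and not the procedure's internal code, and the quadratic objective is only a stand-in for the bandwidth criterion. The method assumes that the objective has a single minimum on the search interval.

```python
import math

def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Return an approximate minimizer of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618
    c = hi - inv_phi * (hi - lo)            # left interior point
    d = lo + inv_phi * (hi - lo)            # right interior point
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; the old c becomes the new right point.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; the old d becomes the new left point.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2.0

# Stand-in objective with its minimum at b = 1.3.
best = golden_section_minimize(lambda b: (b - 1.3) ** 2 + 0.2, 0.1, 5.0)
```

Each iteration reuses one of the two previous function evaluations, so only one new evaluation of the objective is needed per step. This matters when each evaluation involves a numerical integral.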
