This example reproduces analysis done by Portnoy (2003), which demonstrates how to use quantile regression to analyze survival times. The example uses drug abuse data from Hosmer and Lemeshow (1999). The goal of this study is to compare treatment effects on reducing drug abuse.
The data set contains the following variables:
Time
, time to return to drug use, in days
Status
, event indicator, with value 1 for return to drug use and value 0 for censored time
Age
, age at enrollment, in years
Treatment
, with value 1 for six-month treatment and value 0 for three-month treatment
Beck
, Beck Depression Inventory score at admission to the program
IV3
, indicator of recent intravenous drug use
NDT
, number of prior drug treatments
Race
, race indicator, with value 1 for white and value 0 for nonwhite
Site
, treatment sites (A and B)
LOT
, length of treatment, in days
The following statements create the data set UIS
:
data uis; input ID Age Becktota Hercoc Ivhx Ndrugtx Race Treat Site Lot Time Censor; Iv3 = (Ivhx = 3); Nd1 = 1/((Ndrugtx+1)/10); Nd2 = (1/((Ndrugtx+1)/10))*log((Ndrugtx+1)/10); if (Treat =1 ) then Frac = Lot/180; else Frac = Lot/90; datalines; 1 39 9.0000 4 3 1 0 1 0 123 188 1 2 33 34.0000 4 2 8 0 1 0 25 26 1 3 33 10.0000 2 3 3 0 1 0 7 207 1 4 32 20.0000 4 3 1 0 0 0 66 144 1 ... more lines ... 626 28 10.0 4 2 3 0 1 1 21 35 1 627 35 17.0 1 3 2 0 0 1 184 379 1 628 46 31.5 1 3 15 1 1 1 9 377 1 ;
The following statements replicate the analysis of Portnoy (2003):
ods graphics on; proc quantlife data=uis log seed=999 plots=(quantplot survival); class Race Site Treat; model Time*Censor(0)=Nd1 Nd2 Iv3 Becktota Treat Frac Race Age|Site / quantile=0.05 to 0.85 by 0.05 ; baseline out=Predsurvf; run;
Output 94.2.1 displays the model information. Out of 628 subjects, 53 contain missing values and are not included in the analysis. The censoring rate is 20.87%.
Output 94.2.1: Model Information
Output 94.2.2, Output 94.2.3, and Output 94.2.4 display regression quantile process plots for each covariate.
Output 94.2.2: Quantile Processes with 95% Confidence Bands
Output 94.2.3: Quantile Processes with 95% Confidence Bands
Output 94.2.4: Quantile Processes with 95% Confidence Bands
You can see the varying effects for Nd
and Frac
, whereas the treatment effect is fairly constant. For more information about the covariate effects that can be discovered
using quantile regression, see Portnoy (2003).
In survival analysis, a plot of the estimated survival function is often of interest. There is a one-to-one relationship between the quantile function and the survival function. When you specify the PLOTS= SURVIVAL option, the QUANTLIFE procedure estimates the survival function by fitting a quantile regression model for a grid of equally spaced quantile levels. You can specify the grid points by using the INITTAU=option and the step between adjacent grid points by using the GRIDSIZE=option. For more information, see the section Kaplan-Meier-Type Estimator for Censored Quantile Regression.
FigureĀ 94.4 shows the estimated survival function at the reference set of covariate values that consist of reference levels for the CLASS variables and average values for the continuous variables. You can output the predicted survival function by specifying the SURVIVAL= option in the BASELINE statement.
Output 94.2.5: Survival Function