Example 89.5 A Test of the Proportional Hazards Assumption by Using the Programming Statements

You can use programming statements in PROC SURVEYPHREG to create time-dependent covariates to test the proportional hazards assumption for complex survey data. Consider the data set mortality from Example 89.3. The data set contains 1,891 observations from the 1992 NHANES I Epidemiologic Followup Study (NHEFS) vital and tracing status.

Suppose you want to fit a proportional hazards model to these data and construct a test for the proportional hazards assumption on gender. The following statements request a proportional hazards regression of age on gender and x, where the time-dependent covariate x is created by using programming statements. The explanatory variable x takes the value of the time variable age for the male subgroup (gender=1) and the value 0 otherwise. The variable vitalstatus is the censor indicator, and a value of 1, 4, 5, or 6 indicates a censored observation. The STRATA and CLUSTER statements identify the design strata and clusters, the WEIGHT statement specifies the sampling weight, and the CLASS statement specifies that gender is a classification variable.

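proc surveyphreg data = mortality nomcar;
   class gender;                               /* gender is a classification variable */
   strata varstrata;                           /* design strata */
   cluster varpsu;                             /* design clusters */
   weight sweight;                             /* sampling weights */
   model age*vitalstatus(1 4 5 6) = gender x;  /* 1, 4, 5, 6 are censored */
   x = age*(gender=1);                         /* age if gender=1, else 0 */
run;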
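
The model that these statements fit can be sketched as follows. The notation is an assumption introduced here for illustration and does not appear in the original example: t denotes the time variable age, g is an indicator for gender=1, h_0 is the baseline hazard, and beta_1 and beta_2 are the coefficients of gender and x.

```latex
\begin{align*}
h(t \mid g) &= h_0(t)\,\exp\{\beta_1 g + \beta_2\, g\, t\},
  \qquad g = \begin{cases} 1 & \text{if gender} = 1 \\ 0 & \text{otherwise} \end{cases} \\
\frac{h(t \mid g = 1)}{h(t \mid g = 0)} &= \exp\{\beta_1 + \beta_2 t\} \\
H_0\colon\ \beta_2 &= 0
  \quad \text{(hazard ratio constant in } t \text{, that is, proportional hazards)}
\end{align*}
```

Under this parameterization, the hazard ratio for gender is constant over time only when beta_2 is zero, so the t test for x serves as the test of the proportional hazards assumption.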

Output 89.5.1 displays some summary information. The "Number of Observations," "Censored Summary," and "Weighted Censored Summary" tables are exactly the same as in the example discussed in Domain Analysis.

Output 89.5.1 Data Summary, Censored Summary, and Information about Variance Estimation
The SURVEYPHREG Procedure

Number of Observations Read 1891
Number of Observations Used 1891
Sum of Weights Read 1.0298E8
Sum of Weights Used 1.0298E8

Summary of the Number of Event and Censored Values

Total      Event      Censored   Percent Censored
1891       717        1174       62.08

Summary of the Weighted Number of Event and Censored Values

Total      Event      Censored   Percent Censored
1.0298E8   27650348   75328323   73.15

Variance Estimation
Method Taylor Series
Missing Values NOMCAR
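
The censoring percentages in Output 89.5.1 can be reproduced from the displayed counts. The following Python sketch is only an arithmetic check outside SAS; the function name percent_censored is hypothetical, and the sketch assumes that the weighted total is the sum of the weighted event and weighted censored values.

```python
# Counts copied from Output 89.5.1.
event, censored = 717, 1174                  # unweighted counts
w_event, w_censored = 27650348, 75328323     # weighted counts (sums of weights)


def percent_censored(n_event, n_censored):
    """Percentage of censored values among events plus censored values."""
    return 100.0 * n_censored / (n_event + n_censored)


unweighted_pct = percent_censored(event, censored)
weighted_pct = percent_censored(w_event, w_censored)

print(f"Unweighted percent censored: {unweighted_pct:.2f}")
print(f"Weighted percent censored:   {weighted_pct:.2f}")
```

The weighted percentage is larger than the unweighted percentage, which indicates that censored observations carry larger sampling weights on average than events do in this sample.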

Output 89.5.2 displays the estimated regression coefficients and their standard errors. The variable gender has two levels, and only one level is estimable. By default, PROC SURVEYPHREG estimates the coefficient for the first level (GENDER 1) and sets the coefficient for the second level (GENDER 2) to zero. The estimated regression coefficient for GENDER 1 is 1.61 with a standard error of 5.86. The estimated regression coefficient for x is –0.02 with a standard error of 0.08. The t statistic for x is –0.19 with a p-value of 0.85 on 33 degrees of freedom. This test suggests that an interaction between the time variable age and gender is not significant. Therefore, there is little evidence of an exponential trend over time in the hazard ratio for gender, and this test does not contradict the proportional hazards assumption for gender.

Output 89.5.2 Parameter Estimates
Analysis of Maximum Likelihood Estimates
Parameter   DF   Estimate    Standard Error   t Value   Pr > |t|   Hazard Ratio
GENDER 1    33   1.605505    5.859600         0.27      0.7858     4.980
GENDER 2    33   0           .                .         .          1.000
x           33   -0.015648   0.082101         -0.19     0.8500     0.984
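
The t values and hazard ratios in Output 89.5.2 follow directly from the estimates and standard errors. The following Python sketch recomputes them as an arithmetic check outside SAS. The function gender_hazard_ratio is a hypothetical helper, not part of the original example: it evaluates the fitted hazard ratio for gender at a given age as the exponential of the GENDER 1 coefficient plus the x coefficient times age. Because both coefficients have large standard errors, the age-specific values are illustrative only.

```python
import math

# (estimate, standard error) pairs copied from Output 89.5.2.
estimates = {
    "GENDER 1": (1.605505, 5.859600),
    "x": (-0.015648, 0.082101),
}


def t_value(estimate, std_err):
    """t statistic: estimate divided by its standard error."""
    return estimate / std_err


def hazard_ratio(estimate):
    """Hazard ratio for a one-unit increase: exp(estimate)."""
    return math.exp(estimate)


# Rounded (t value, hazard ratio) for each estimated parameter.
summary = {
    name: (round(t_value(est, se), 2), round(hazard_ratio(est), 3))
    for name, (est, se) in estimates.items()
}


def gender_hazard_ratio(age,
                        b_gender=estimates["GENDER 1"][0],
                        b_x=estimates["x"][0]):
    """Hypothetical helper: fitted hazard ratio (gender 1 vs. 2) at a given age."""
    return math.exp(b_gender + b_x * age)


for name, (t, hr) in summary.items():
    print(f"{name}: t = {t}, hazard ratio = {hr}")
```

Note that the hazard ratio of 4.980 reported for GENDER 1 corresponds to x equal to zero. With x in the model, the fitted hazard ratio for gender at a particular age also depends on the coefficient of x.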