Example 93.5 A Test of the Proportional Hazards Assumption by Using the Programming Statements

You can use programming statements in PROC SURVEYPHREG to create time-dependent covariates to test the proportional hazards assumption for complex survey data. Consider the data set mortality from Example 93.3. The data set contains 1,891 observations from the 1992 NHANES I Epidemiologic Followup Study (NHEFS) vital and tracing status.

Suppose you want to fit a proportional hazards model to these data and construct a test of the proportional hazards assumption for gender. The following statements request a proportional hazards regression of age on gender and x, where the time-dependent covariate x is created by using programming statements. The explanatory variable x takes the value of the time variable age for the male subgroup (gender=1) and the value 0 otherwise. The variable vitalstatus is the censoring indicator, and a value of 1, 4, 5, or 6 indicates a censored observation. The WEIGHT statement specifies the sampling weight, and the CLASS statement specifies that gender is a classification variable.

proc surveyphreg data = mortality nomcar;
   class gender;
   strata varstrata;
   cluster varpsu;
   weight sweight;
   model age*vitalstatus(1 4 5 6) = gender x;
   x = age*(gender=1);
run;
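
Written out, the MODEL statement and the programming statement above correspond to a hazard of roughly the following form. This is a sketch for orientation only: the symbols h_0, \beta_1, and \beta_2 are notation assumed here and do not appear in the procedure output, and t stands for the time variable age.

```latex
\begin{align*}
h(t \mid \text{gender}) &= h_0(t)\,\exp\bigl(\beta_1\,I(\text{gender}=1) + \beta_2\,t\,I(\text{gender}=1)\bigr) \\
\frac{h(t \mid \text{gender}=1)}{h(t \mid \text{gender}=2)} &= \exp(\beta_1 + \beta_2\,t)
\end{align*}
```

Under proportional hazards, the ratio in the second line does not depend on t, which corresponds to \beta_2 = 0. A test of the coefficient of x is therefore a test of the assumption against this particular alternative, in which the log hazard ratio for gender changes linearly with age.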

Output 93.5.1 displays some summary information. The Number of Observations, Censored Summary, and Weighted Censored Summary tables are exactly the same as in the example discussed in Domain Analysis.

Output 93.5.1: Data Summary, Censored Summary, and Information about Variance Estimation


Number of Observations Read    1891
Number of Observations Used    1891
Sum of Weights Read            1.0298E8
Sum of Weights Used            1.0298E8

Summary of the Number of Event and Censored
Total    Event    Censored    Percent Censored
1891     717      1174        62.08

Summary of the Weighted Number of Event and Censored Values
Total       Event       Censored    Percent Censored
1.0298E8    27650348    75328323    73.15

Variance Estimation
Method            Taylor Series
Missing Values    NOMCAR

Output 93.5.2 displays the estimated regression coefficients and their standard errors. The variable gender has two levels, and only one level is estimable. By default, PROC SURVEYPHREG estimates the first level (GENDER 1) and assigns a zero value to the second level. The estimated regression coefficient for GENDER 1 is 1.61 with a standard error of 0.71. The estimated regression coefficient for x is –0.02 with a standard error of 0.01. The t statistic for x is –1.55 with a p-value of 0.13 on 33 degrees of freedom. This test suggests that the interaction between the time variable age and gender is not significant. Therefore, there is little evidence of an exponential trend over time in the hazard ratio for gender, and the test gives no indication that the proportional hazards assumption for gender is violated.
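
The reported quantities can be checked by hand from the values in Output 93.5.2. In the sketch below, \hat\beta_1 and \hat\beta_2 are notation assumed here for the estimated coefficients of GENDER 1 and x, and t stands for the time variable age.

```latex
\begin{align*}
\text{t Value for } x &= \frac{\hat\beta_2}{\mathrm{SE}(\hat\beta_2)} = \frac{-0.015648}{0.010082} \approx -1.55 \\
\exp(\hat\beta_2) &= \exp(-0.015648) \approx 0.984 \\
\widehat{\mathrm{HR}}(t) &= \exp(\hat\beta_1 + \hat\beta_2\,t) = \exp(1.605505 - 0.015648\,t)
\end{align*}
```

Because x is in the model, the hazard ratio 4.980 = exp(1.605505) reported for GENDER 1 corresponds to t = 0 in this fitted form. The value 0.984 is the estimated multiplicative change in the gender hazard ratio for each one-unit increase in age.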

Output 93.5.2: Parameter Estimates

Analysis of Maximum Likelihood Estimates
Parameter   DF   Estimate    Standard Error   t Value   Pr > |t|   Hazard Ratio
GENDER 1    33    1.605505   0.709269          2.26     0.0303     4.980
GENDER 2    33    0          .                 .        .          1.000
x           33   -0.015648   0.010082         -1.55     0.1302     0.984