You can use programming statements in PROC SURVEYPHREG to create time-dependent covariates and use them to test the proportional hazards assumption for complex survey data. Consider the data set mortality from Example 89.3. The data set contains 1,891 observations from the vital and tracing status data of the 1992 NHANES I Epidemiologic Followup Study (NHEFS).
Suppose you want to fit a proportional hazards model to these data and construct a test of the proportional hazards assumption for gender. The following statements request a proportional hazards regression of age on gender and x, where the time-dependent covariate x is created by using programming statements. The explanatory variable x takes the value of the time variable age for the male subgroup (gender=1) and 0 otherwise. The variable vitalstatus is the censoring indicator; a value of 1, 4, 5, or 6 indicates a censored observation. The WEIGHT statement specifies the sampling weight, and the CLASS statement specifies that gender is a classification variable.
    proc surveyphreg data = mortality nomcar;
       class gender;
       strata varstrata;
       cluster varpsu;
       weight sweight;
       model age*vitalstatus(1 4 5 6) = gender x;
       x = age*(gender=1);
    run;
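Written out, the model that these statements specify can be sketched as follows. The symbols $h_0$, $\beta_1$, $\beta_2$, and the indicator function $I(\cdot)$ are notation introduced here for illustration; they do not appear in the procedure's output.

```latex
\begin{align*}
h(t \mid \text{gender})
  &= h_0(t)\,\exp\{\beta_1\, I(\text{gender}=1) + \beta_2\, t\, I(\text{gender}=1)\} \\[4pt]
\mathrm{HR}(t)
  &= \frac{h(t \mid \text{gender}=1)}{h(t \mid \text{gender}=2)}
   = \exp(\beta_1 + \beta_2\, t)
\end{align*}
```

Here $t$ is the time variable age, $\beta_1$ is the coefficient of GENDER 1, and $\beta_2$ is the coefficient of x. If $\beta_2 = 0$, the hazard ratio does not depend on $t$, which is what the proportional hazards assumption requires. A test of $\beta_2 = 0$ therefore serves as a test of that assumption for gender.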
Output 89.5.1 displays some summary information. The "Number of Observations," "Censored Summary," and "Weighted Censored Summary" tables are exactly the same as in the example discussed in Domain Analysis.
Number of Observations | |
---|---|
Number of Observations Read | 1891 |
Number of Observations Used | 1891 |
Sum of Weights Read | 1.0298E8 |
Sum of Weights Used | 1.0298E8 |
Summary of the Number of Event and Censored Values

Total | Event | Censored | Percent Censored |
---|---|---|---|
1891 | 717 | 1174 | 62.08 |
Summary of the Weighted Number of Event and Censored Values

Total | Event | Censored | Percent Censored |
---|---|---|---|
1.0298E8 | 27650348 | 75328323 | 73.15 |
Variance Estimation | |
---|---|
Method | Taylor Series |
Missing Values | NOMCAR |
Output 89.5.2 displays the estimated regression coefficients and their standard errors. The variable gender has two levels, and only one level is estimable. By default, PROC SURVEYPHREG estimates the coefficient for the first level (GENDER 1) and sets the coefficient for the second level (GENDER 2) to zero. The estimated regression coefficient for GENDER 1 is 1.61 with a standard error of 5.86. The estimated regression coefficient for x is –0.02 with a standard error of 0.08. The t statistic for x is –0.19, with a p-value of 0.85 on 33 degrees of freedom. This test suggests that the interaction between the time variable age and gender is not significant. Therefore, there is little evidence of an exponential trend over time in the hazard ratio for gender, and so little evidence against the proportional hazards assumption for gender.
Analysis of Maximum Likelihood Estimates

Parameter | DF | Estimate | Standard Error | t Value | Pr > \|t\| | Hazard Ratio |
---|---|---|---|---|---|---|
GENDER 1 | 33 | 1.605505 | 5.859600 | 0.27 | 0.7858 | 4.980 |
GENDER 2 | 33 | 0 | . | . | . | 1.000 |
x | 33 | -0.015648 | 0.082101 | -0.19 | 0.8500 | 0.984 |
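As a quick check of how the columns in this table relate to one another, the following arithmetic uses the usual definitions: the t value is the estimate divided by its standard error, and the hazard ratio is the exponentiated estimate. These definitions are an assumption here (the text does not state them), but they reproduce the displayed values.

```latex
\begin{align*}
t_{\text{x}} &= \frac{-0.015648}{0.082101} \approx -0.19,
  & \exp(-0.015648) &\approx 0.984, \\[4pt]
t_{\text{GENDER 1}} &= \frac{1.605505}{5.859600} \approx 0.27,
  & \exp(1.605505) &\approx 4.980 .
\end{align*}
```

Because x is also in the model, the hazard ratio of 4.980 shown for GENDER 1 is the ratio at age 0 rather than a single ratio that applies at every age. The very large standard error for GENDER 1 (5.86) indicates that this value is estimated imprecisely.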