### Example 93.5 A Test of the Proportional Hazards Assumption by Using the Programming Statements

You can use programming statements in PROC SURVEYPHREG to create time-dependent covariates to test the proportional hazards assumption for complex survey data. Consider the data set `mortality` from Example 93.3. The data set contains 1,891 observations from the 1992 NHANES I Epidemiologic Followup Study (NHEFS) vital and tracing status data.

Suppose you want to fit a proportional hazards model to these data and construct a test of the proportional hazards assumption for gender. The following statements request a proportional hazards regression of `age` on `gender` and `x`, where the time-dependent covariate `x` is created by using programming statements. The explanatory variable `x` takes the value of the time variable `age` for the male subgroup (`gender` = 1) and the value 0 otherwise. The variable `vitalstatus` is the censoring indicator; a value of 1, 4, 5, or 6 indicates a censored observation. The STRATA and CLUSTER statements name the stratification variable `varstrata` and the clustering variable `varpsu`, the WEIGHT statement specifies the sampling weight, and the CLASS statement specifies that `gender` is a classification variable.

```
proc surveyphreg data = mortality nomcar;
   class gender;
   strata varstrata;
   cluster varpsu;
   weight sweight;
   model age*vitalstatus(1 4 5 6) = gender x;
   /* time-dependent covariate: equals age when gender=1, 0 otherwise */
   x = age*(gender=1);
run;
```
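As a sketch of what these statements fit, the model can be written as follows. The symbols $h_0$, $\beta_1$, $\beta_2$, and the indicator $I(\cdot)$ are notation introduced here for illustration and do not appear in the procedure output; $t$ denotes the time variable `age`.

$$
h(t \mid \text{gender}) = h_0(t)\,\exp\bigl\{\beta_1\, I(\text{gender}=1) + \beta_2\, t\, I(\text{gender}=1)\bigr\}
$$

Under this parameterization, the hazard ratio for `gender` level 1 relative to level 2 at time $t$ is

$$
\mathrm{HR}(t) = \exp(\beta_1 + \beta_2\, t)
$$

If $\beta_2 = 0$, the ratio reduces to the constant $\exp(\beta_1)$, which is the proportional hazards assumption for `gender`. A nonzero $\beta_2$ would mean the ratio changes exponentially with $t$, so testing $\beta_2 = 0$ (the coefficient of `x`) serves as a test of that assumption against this particular form of departure.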

Output 93.5.1 displays some summary information. The Number of Observations, Censored Summary, and Weighted Censored Summary tables are exactly the same as in the example discussed in Domain Analysis.

Output 93.5.1: Data Summary, Censored Summary, and Information about Variance Estimation

The SURVEYPHREG Procedure

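| | |
|---|---|
| Number of Observations Read | 1891 |
| Number of Observations Used | 1891 |
| Sum of Weights Read | 1.0298E8 |
| Sum of Weights Used | 1.0298E8 |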

Summary of the Number of Event and Censored Values

| Total | Event | Censored | Percent Censored |
|---|---|---|---|
| 1891 | 717 | 1174 | 62.08 |

Summary of the Weighted Number of Event and Censored Values

| Total | Event | Censored | Percent Censored |
|---|---|---|---|
| 1.0298E8 | 27650348 | 75328323 | 73.15 |

Variance Estimation

| | |
|---|---|
| Method | Taylor Series |
| Missing Values | NOMCAR |

Output 93.5.2 displays the estimated regression coefficients and their standard errors. The variable `gender` has two levels, and only one level is estimable. By default, PROC SURVEYPHREG estimates the first level (`GENDER 1`) and assigns a zero value for the second level. The estimated regression coefficient for `GENDER 1` is 1.61 with a standard error of 0.71. The estimated regression coefficient for `x` is –0.02 with a standard error of 0.01. The t statistic for `x` is –1.55 with a p-value of 0.13 on 33 degrees of freedom. At conventional significance levels, this test does not indicate a significant interaction between the time variable `age` and `gender`. Therefore, there is little evidence of an exponential trend over time in the hazard ratio for `gender`, and so little evidence against the proportional hazards assumption for `gender`.
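The reported statistics can be checked by hand from the estimates in Output 93.5.2. The following worked arithmetic is only an illustration; $\hat\beta_1$ and $\hat\beta_2$ are notation assumed here for the `GENDER 1` and `x` coefficients, and $t$ in the last line denotes the time variable `age`, not the t statistic.

$$
\begin{aligned}
t_{\texttt{GENDER 1}} &= \frac{1.605505}{0.709269} \approx 2.26, &
t_{\texttt{x}} &= \frac{-0.015648}{0.010082} \approx -1.55, \\
\exp(\hat\beta_1) &= \exp(1.605505) \approx 4.980, &
\exp(\hat\beta_2) &= \exp(-0.015648) \approx 0.984, \\
\widehat{\mathrm{HR}}(t) &= \exp(1.605505 - 0.015648\, t)
\end{aligned}
$$

Because the model includes `x`, the hazard ratio of 4.980 reported for `GENDER 1` corresponds to $t = 0$, and the value 0.984 is the multiplicative change in the `gender` hazard ratio per one-unit increase in `age`. For example, at `age` = 60 the fitted ratio would be about $\exp(0.667) \approx 1.95$. Since the coefficient of `x` is not significantly different from zero, this apparent decline should be read with caution.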

Output 93.5.2: Parameter Estimates

Analysis of Maximum Likelihood Estimates

| Parameter | DF | Estimate | Standard Error | t Value | Pr > \|t\| | Hazard Ratio |
|---|---|---|---|---|---|---|
| GENDER 1 | 33 | 1.605505 | 0.709269 | 2.26 | 0.0303 | 4.980 |
| GENDER 2 | 33 | 0 | . | . | . | 1.000 |
| x | 33 | -0.015648 | 0.010082 | -1.55 | 0.1302 | 0.984 |