The SURVEYPHREG Procedure

Time and CLASS Variables Usage

The following DATA step creates an artificial data set, Test, to be used in this section. There are six variables in Test: the variable T contains the failure times; the variable Status is the censoring indicator variable with the value 1 for an uncensored failure time and the value 0 for a censored time; the variable A is a categorical variable with values 1, 2, and 3 representing three different categories; the variable MirrorT is an exact copy of T; the variable W is the observation weight; and the variable S is the strata indicator.

data Test;
   input T Status A W S @@;
   MirrorT = T;
   datalines;
 23    1    1   10   1    7    0   1   20   2
 23    1    1   10   1   10    1   1   20   2
 20    0    1   10   1   13    0   1   20   2
 24    1    1   10   1   10    1   1   20   2
 18    1    2   10   1    6    1   2   20   2
 18    0    2   10   1    6    1   2   20   2
 13    0    2   10   1   13    1   2   20   2
  9    0    2   10   1   15    1   2   20   2
  8    1    3   10   1    6    1   3   20   2
 12    0    3   10   1    4    1   3   20   2
 11    1    3   10   1    8    1   1   20   2
  6    1    3   10   1    7    1   3   20   2
  7    1    3   10   1   12    1   3   20   2
  9    1    2   10   1   15    1   2   20   2
  3    1    2   10   1   14    0   3   20   2
  6    1    1   10   1   13    1   2   20   2
;
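
Before fitting any models, it can help to confirm that the data were read as described. The following is a minimal sketch, not part of the original example, that cross-tabulates the stratum indicator S against the weight W and against the category A. Each of the 16 data lines contributes one observation to each stratum, so each stratum should contain 16 observations, with W=10 in stratum 1 and W=20 in stratum 2.

proc freq data=Test;
   /* counts only; suppress row, column, and overall percentages */
   tables S*W S*A / norow nocol nopercent;
run;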

Time Variable on the Right Side of the MODEL Statement

The time variable cannot be used explicitly as an explanatory effect in the MODEL statement. The following statements produce an error message:

proc surveyphreg data=Test;
   weight W;
   strata S;                     
   class A;
   model T*Status(0)=T*A;        /* error: T is the time variable */
run;

To use the time variable as an explanatory effect, replace T in the effect with MirrorT, which is an exact copy of T, as in the following statements:

proc surveyphreg data=Test;
   weight W;
   strata S;
   class A;
   model T*Status(0)=A*MirrorT;
run;

Note that neither T*A nor MirrorT*A in the MODEL statement is time-dependent. The results of fitting this model are shown in Figure 89.3.

Figure 89.3 T*A Effect

The SURVEYPHREG Procedure

Analysis of Maximum Likelihood Estimates
Parameter	DF	Estimate	Standard Error	t Value	Pr > |t|	Hazard Ratio
MirrorT*A 1	30	-17.560700	57689160	-0.00	1.0000	0.000
MirrorT*A 2	30	-17.424235	57689159	-0.00	1.0000	0.000
MirrorT*A 3	30	-17.448673	57689160	-0.00	1.0000	0.000
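
When an analysis data set does not already contain a copy of the time variable, one option is to add the copy in a DATA step before calling the procedure. The following is a minimal sketch under that assumption. The data set name Test2 and the variable name TimeCopy are hypothetical, and MirrorT is dropped only to mimic a data set that lacks the copy.

data Test2;
   set Test(drop=MirrorT);   /* mimic a data set without a copy of T */
   TimeCopy = T;             /* hypothetical name for the copy */
run;

proc surveyphreg data=Test2;
   weight W;
   strata S;
   class A;
   model T*Status(0)=A*TimeCopy;
run;

Because TimeCopy holds the same values as MirrorT, this fit should match the one in Figure 89.3 apart from the parameter labels.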

CLASS Variables and Programming Statements

In PROC SURVEYPHREG, the levels of CLASS variables are determined by the CLASS statement and the input data and are not affected by user-supplied programming statements. Consider the following statements, which produce the results in Figure 89.4. Variable A is declared as a CLASS variable in the CLASS statement.

proc surveyphreg data=Test;
   weight W;
   strata S;   
   class A;
   model T*Status(0)=A;
run;

Figure 89.4 shows the parameters that correspond to A and their respective regression coefficient estimates.

Figure 89.4 Design Variable and Regression Coefficient Estimates

The SURVEYPHREG Procedure

Class Level Information
Class	Levels	Values
A	3	1 2 3

Analysis of Maximum Likelihood Estimates
Parameter	DF	Estimate	Standard Error	t Value	Pr > |t|	Hazard Ratio
A 1	30	-1.162184	0.655136	-1.77	0.0862	0.313
A 2	30	-0.616962	0.521841	-1.18	0.2464	0.540
A 3	30	0	.	.	.	1.000

Now consider the programming statement that attempts to change the value of the CLASS variable A as in the following specification:

proc surveyphreg data=Test;
   weight W;
   strata S;   
   class A;
   model T*Status(0)=A;
   if A=3 then A=2;   /* does not change the CLASS levels of A */
run;

Results of this analysis are shown in Figure 89.5 and are identical to those in Figure 89.4. The if A=3 then A=2 programming statement has no effect on the design variables for A, which have already been determined.

Figure 89.5 Design Variable and Regression Coefficient Estimates

The SURVEYPHREG Procedure

Class Level Information
Class	Levels	Values
A	3	1 2 3

Analysis of Maximum Likelihood Estimates
Parameter	DF	Estimate	Standard Error	t Value	Pr > |t|	Hazard Ratio
A 1	30	-1.162184	0.655136	-1.77	0.0862	0.313
A 2	30	-0.616962	0.521841	-1.18	0.2464	0.540
A 3	30	0	.	.	.	1.000
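
Because the CLASS levels are taken from the input data, one way to actually combine categories 2 and 3 is to recode A in a DATA step before the procedure runs. The following is a minimal sketch of that approach. The data set name TestRecode is hypothetical and is not part of the original example.

data TestRecode;
   set Test;
   if A=3 then A=2;   /* recode before the procedure reads the data */
run;

proc surveyphreg data=TestRecode;
   weight W;
   strata S;
   class A;
   model T*Status(0)=A;
run;

With this input, the Class Level Information table should list only the values 1 and 2 for A, so the fit differs from the one in Figure 89.4.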

Additionally, a variable that is declared in the CLASS statement and then used in a programming statement is not treated as a collection of the corresponding design variables. Consider the following statements:

proc surveyphreg data=Test;
   class A;
   model T*Status(0)=A X;
   X=T*A;   /* uses the raw values of A, not its design variables */
run;

The CLASS variable A generates two design variables as explanatory variables. The variable X created by the X=T*A programming statement is a single time-dependent covariate whose values are evaluated using the exact values of A given in the data, not the dummy-coded values that represent A. In the data set Test, A has the values 1, 2, and 3, and these values are multiplied by the values of T to produce X. If A were a character variable with values 'Bird', 'Cat', and 'Dog', the programming statement X=T*A would have produced an error in the attempt to multiply a number by a character value. The results are shown in Figure 89.6.

Figure 89.6 Single Time-Dependent Variable X*A

The SURVEYPHREG Procedure

Analysis of Maximum Likelihood Estimates
Parameter	DF	Estimate	Standard Error	t Value	Pr > |t|	Hazard Ratio
A 1	31	0.158010	95.546316	0.00	0.9987	1.171
A 2	31	0.008993	43.630439	0.00	0.9998	1.009
A 3	31	0	.	.	.	1.000
X	31	0.092679	5.905522	0.02	0.9876	1.097

The preceding program does not create a separate time-dependent covariate for each level of A. If you want to create time-dependent covariates from the values of a CLASS variable, you could use syntax like the following:

proc surveyphreg data=Test;
   class A;
   model T*Status(0)=A X1 X2;
   X1= T*(A=1); 
   X2= T*(A=2);
run;

The Boolean parenthetical expressions (A=1) and (A=2) resolve to a value of 1 or 0, depending on whether the expression is true or false, respectively.

Results of fitting this model, which serves as a simple test of the proportional hazards assumption, are shown in Figure 89.7.

Figure 89.7 Simple Test of Proportional Hazards Assumption

The SURVEYPHREG Procedure

Analysis of Maximum Likelihood Estimates
Parameter	DF	Estimate	Standard Error	t Value	Pr > |t|	Hazard Ratio
A 1	31	-0.007655	5.411713	-0.00	0.9989	0.992
A 2	31	-0.881383	4.263923	-0.21	0.8376	0.414
A 3	31	0	.	.	.	1.000
X1	31	-0.155220	0.602329	-0.26	0.7983	0.856
X2	31	0.011554	0.454220	0.03	0.9799	1.012
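
The difference between X=T*A and the Boolean forms X1=T*(A=1) and X2=T*(A=2) can be seen with plain DATA step arithmetic. The following is a minimal sketch that evaluates the three expressions for each value of A. The time value T=10 is an arbitrary choice for illustration, and the step only mimics the expressions; it does not reproduce how the procedure evaluates them.

data _null_;
   T = 10;                /* arbitrary illustrative time value */
   do A = 1 to 3;
      X  = T*A;           /* raw value of A scales T */
      X1 = T*(A=1);       /* (A=1) is 1 when true, 0 when false */
      X2 = T*(A=2);       /* (A=2) is 1 when true, 0 when false */
      put A= X= X1= X2=;
   end;
run;

The log shows A=1 X=10 X1=10 X2=0, then A=2 X=20 X1=0 X2=10, then A=3 X=30 X1=0 X2=0. X grows with the numeric code of A, whereas X1 and X2 equal T only for their own category and are 0 otherwise.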

In general, when your model contains a categorical explanatory variable that is time-dependent, it might be necessary to use hardcoded dummy variables to represent the categories of the categorical variable.