Krall, Uthoff, and Harley (1975) analyzed data from a study on multiple myeloma in which researchers treated 65 patients with alkylating agents. Of those patients, 48 died during the study and 17 survived. The following DATA step creates the data set Myeloma. The variable Time represents the survival time in months from diagnosis. The variable VStatus consists of two values, 0 and 1, indicating whether the patient was alive or dead, respectively, at the end of the study. If the value of VStatus is 0, the corresponding value of Time is censored. The variables thought to be related to survival are LogBUN (log(BUN) at diagnosis), HGB (hemoglobin at diagnosis), Platelet (platelets at diagnosis: 0=abnormal, 1=normal), Age (age at diagnosis, in years), LogWBC (log(WBC) at diagnosis), Frac (fractures at diagnosis: 0=none, 1=present), LogPBM (log percentage of plasma cells in bone marrow), Protein (proteinuria at diagnosis), and SCalc (serum calcium at diagnosis). Interest lies in identifying important prognostic factors from these nine explanatory variables.
data Myeloma;
   input Time VStatus LogBUN HGB Platelet Age LogWBC Frac
         LogPBM Protein SCalc;
   label Time='Survival Time'
         VStatus='0=Alive 1=Dead';
   datalines;
 1.25  1  2.2175   9.4  1  67  3.6628  1  1.9542  12  10
 1.25  1  1.9395  12.0  1  38  3.9868  1  1.9542  20  18
 2.00  1  1.5185   9.8  1  81  3.8751  1  2.0000   2  15
 2.00  1  1.7482  11.3  0  75  3.8062  1  1.2553   0  12
 2.00  1  1.3010   5.1  0  57  3.7243  1  2.0000   3   9
 3.00  1  1.5441   6.7  1  46  4.4757  0  1.9345  12  10
 5.00  1  2.2355  10.1  1  50  4.9542  1  1.6628   4   9
 5.00  1  1.6812   6.5  1  74  3.7324  0  1.7324   5   9
 6.00  1  1.3617   9.0  1  77  3.5441  0  1.4624   1   8
 6.00  1  2.1139  10.2  0  70  3.5441  1  1.3617   1   8
 6.00  1  1.1139   9.7  1  60  3.5185  1  1.3979   0  10
 6.00  1  1.4150  10.4  1  67  3.9294  1  1.6902   0   8
 7.00  1  1.9777   9.5  1  48  3.3617  1  1.5682   5  10
 7.00  1  1.0414   5.1  0  61  3.7324  1  2.0000   1  10
 7.00  1  1.1761  11.4  1  53  3.7243  1  1.5185   1  13
 9.00  1  1.7243   8.2  1  55  3.7993  1  1.7404   0  12
11.00  1  1.1139  14.0  1  61  3.8808  1  1.2788   0  10
11.00  1  1.2304  12.0  1  43  3.7709  1  1.1761   1   9
11.00  1  1.3010  13.2  1  65  3.7993  1  1.8195   1  10
11.00  1  1.5682   7.5  1  70  3.8865  0  1.6721   0  12
11.00  1  1.0792   9.6  1  51  3.5051  1  1.9031   0   9
13.00  1  0.7782   5.5  0  60  3.5798  1  1.3979   2  10
14.00  1  1.3979  14.6  1  66  3.7243  1  1.2553   2  10
15.00  1  1.6021  10.6  1  70  3.6902  1  1.4314   0  11
16.00  1  1.3424   9.0  1  48  3.9345  1  2.0000   0  10
16.00  1  1.3222   8.8  1  62  3.6990  1  0.6990  17  10
17.00  1  1.2304  10.0  1  53  3.8808  1  1.4472   4   9
17.00  1  1.5911  11.2  1  68  3.4314  0  1.6128   1  10
18.00  1  1.4472   7.5  1  65  3.5682  0  0.9031   7   8
19.00  1  1.0792  14.4  1  51  3.9191  1  2.0000   6  15
19.00  1  1.2553   7.5  0  60  3.7924  1  1.9294   5   9
24.00  1  1.3010  14.6  1  56  4.0899  1  0.4771   0   9
25.00  1  1.0000  12.4  1  67  3.8195  1  1.6435   0  10
26.00  1  1.2304  11.2  1  49  3.6021  1  2.0000  27  11
32.00  1  1.3222  10.6  1  46  3.6990  1  1.6335   1   9
35.00  1  1.1139   7.0  0  48  3.6532  1  1.1761   4  10
37.00  1  1.6021  11.0  1  63  3.9542  0  1.2041   7   9
41.00  1  1.0000  10.2  1  69  3.4771  1  1.4771   6  10
41.00  1  1.1461   5.0  1  70  3.5185  1  1.3424   0   9
51.00  1  1.5682   7.7  0  74  3.4150  1  1.0414   4  13
52.00  1  1.0000  10.1  1  60  3.8573  1  1.6532   4  10
54.00  1  1.2553   9.0  1  49  3.7243  1  1.6990   2  10
58.00  1  1.2041  12.1  1  42  3.6990  1  1.5798  22  10
66.00  1  1.4472   6.6  1  59  3.7853  1  1.8195   0   9
67.00  1  1.3222  12.8  1  52  3.6435  1  1.0414   1  10
88.00  1  1.1761  10.6  1  47  3.5563  0  1.7559  21   9
89.00  1  1.3222  14.0  1  63  3.6532  1  1.6232   1   9
92.00  1  1.4314  11.0  1  58  4.0755  1  1.4150   4  11
 4.00  0  1.9542  10.2  1  59  4.0453  0  0.7782  12  10
 4.00  0  1.9243  10.0  1  49  3.9590  0  1.6232   0  13
 7.00  0  1.1139  12.4  1  48  3.7993  1  1.8573   0  10
 7.00  0  1.5315  10.2  1  81  3.5911  0  1.8808   0  11
 8.00  0  1.0792   9.9  1  57  3.8325  1  1.6532   0   8
12.00  0  1.1461  11.6  1  46  3.6435  0  1.1461   0   7
11.00  0  1.6128  14.0  1  60  3.7324  1  1.8451   3   9
12.00  0  1.3979   8.8  1  66  3.8388  1  1.3617   0   9
13.00  0  1.6628   4.9  0  71  3.6435  0  1.7924   0   9
16.00  0  1.1461  13.0  1  55  3.8573  0  0.9031   0   9
19.00  0  1.3222  13.0  1  59  3.7709  1  2.0000   1  10
19.00  0  1.3222  10.8  1  69  3.8808  1  1.5185   0  10
28.00  0  1.2304   7.3  1  82  3.7482  1  1.6721   0   9
41.00  0  1.7559  12.8  1  72  3.7243  1  1.4472   1   9
53.00  0  1.1139  12.0  1  66  3.6128  1  2.0000   1  11
57.00  0  1.2553  12.5  1  66  3.9685  0  1.9542   0  11
77.00  0  1.0792  14.0  1  60  3.6812  0  0.9542   0  12
;
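Before fitting any model, it can be worth confirming that the data were read as intended. The following statements are a minimal sketch of such a check and are not part of the original example. They tabulate the censoring indicator, and the resulting frequencies should agree with the counts cited above (48 deaths and 17 censored observations).

```sas
/* Sanity check (an addition for illustration, not part of the     */
/* original example): tabulate the censoring indicator VStatus.    */
proc freq data=Myeloma;
   tables VStatus;
run;
```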
The stepwise selection process consists of a series of alternating forward selection and backward elimination steps. The former adds variables to the model, while the latter removes variables from the model.
The following statements use PROC PHREG to produce a stepwise regression analysis. Stepwise selection is requested by specifying the SELECTION=STEPWISE option in the MODEL statement. The option SLENTRY=0.25 specifies that a variable has to be significant at the 0.25 level before it can be entered into the model, while the option SLSTAY=0.15 specifies that a variable in the model has to be significant at the 0.15 level for it to remain in the model. The DETAILS option requests detailed results for the variable selection process.
proc phreg data=Myeloma;
   model Time*VStatus(0)=LogBUN HGB Platelet Age LogWBC
                         Frac LogPBM Protein SCalc
                         / selection=stepwise slentry=0.25
                           slstay=0.15 details;
run;
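As an aside, the two component strategies of stepwise selection can also be requested on their own. The following statements are a sketch only: they reuse the same thresholds purely for illustration, and their results are not shown or discussed in this example. Forward selection uses only the SLENTRY= threshold, and backward elimination uses only the SLSTAY= threshold.

```sas
/* Illustrative variants (not part of the original example). */

/* Forward selection: variables are only added. */
proc phreg data=Myeloma;
   model Time*VStatus(0)=LogBUN HGB Platelet Age LogWBC
                         Frac LogPBM Protein SCalc
                         / selection=forward slentry=0.25;
run;

/* Backward elimination: start from the full model and only remove. */
proc phreg data=Myeloma;
   model Time*VStatus(0)=LogBUN HGB Platelet Age LogWBC
                         Frac LogPBM Protein SCalc
                         / selection=backward slstay=0.15;
run;
```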
Results of the stepwise regression analysis are displayed in Output 66.1.1 through Output 66.1.7.
Individual score tests are used to determine which of the nine explanatory variables is first selected into the model. In this case, the score test for each variable is the global score test for the model containing that variable as the only explanatory variable. Output 66.1.1 displays the chi-square statistics and the corresponding p-values. The variable LogBUN has the largest chi-square value (8.5164), and it is significant (p=0.0035) at the SLENTRY=0.25 level. The variable LogBUN is thus entered into the model.
Model Information | ||
---|---|---|
Data Set | WORK.MYELOMA | |
Dependent Variable | Time | Survival Time |
Censoring Variable | VStatus | 0=Alive 1=Dead |
Censoring Value(s) | 0 | |
Ties Handling | BRESLOW |
Summary of the Number of Event and Censored Values | |||
---|---|---|---|
Total | Event | Censored | Percent Censored |
65 | 48 | 17 | 26.15 |
Analysis of Effects Eligible for Entry | |||
---|---|---|---|
Effect | DF | Score Chi-Square | Pr > ChiSq |
LogBUN | 1 | 8.5164 | 0.0035 |
HGB | 1 | 5.0664 | 0.0244 |
Platelet | 1 | 3.1816 | 0.0745 |
Age | 1 | 0.0183 | 0.8924 |
LogWBC | 1 | 0.5658 | 0.4519 |
Frac | 1 | 0.9151 | 0.3388 |
LogPBM | 1 | 0.5846 | 0.4445 |
Protein | 1 | 0.1466 | 0.7018 |
SCalc | 1 | 1.1109 | 0.2919 |
Residual Chi-Square Test | ||
---|---|---|
Chi-Square | DF | Pr > ChiSq |
18.4550 | 9 | 0.0302 |
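The equivalence noted above, between a variable's entry score test and the global score test of the model that contains only that variable, can be checked directly. The following statements are a minimal sketch that fits the one-variable model for LogBUN. The Score row of the "Testing Global Null Hypothesis" table in its output should match the LogBUN entry in the table of effects eligible for entry.

```sas
/* Illustrative check (not part of the original example):          */
/* fit the model with LogBUN as the only explanatory variable and  */
/* compare its global score test with the entry score test.        */
proc phreg data=Myeloma;
   model Time*VStatus(0)=LogBUN;
run;
```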
Output 66.1.2 displays the results of the first model. Since the Wald chi-square statistic is significant (p=0.0039) at the SLSTAY=0.15 level, LogBUN stays in the model.
Convergence Status |
---|
Convergence criterion (GCONV=1E-8) satisfied. |
Model Fit Statistics | ||
---|---|---|
Criterion | Without Covariates | With Covariates |
-2 LOG L | 309.716 | 301.959 |
AIC | 309.716 | 303.959 |
SBC | 309.716 | 305.830 |
Testing Global Null Hypothesis: BETA=0 | |||
---|---|---|---|
Test | Chi-Square | DF | Pr > ChiSq |
Likelihood Ratio | 7.7572 | 1 | 0.0053 |
Score | 8.5164 | 1 | 0.0035 |
Wald | 8.3392 | 1 | 0.0039 |
Analysis of Maximum Likelihood Estimates | ||||||
---|---|---|---|---|---|---|
Parameter | DF | Parameter Estimate | Standard Error | Chi-Square | Pr > ChiSq | Hazard Ratio |
LogBUN | 1 | 1.74595 | 0.60460 | 8.3392 | 0.0039 | 5.731 |
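Several of the quantities in Output 66.1.2 can be reproduced by hand from the other printed values, which may help in reading the tables. The arithmetic below is a sketch based only on the numbers shown. The SBC line uses the penalty that reproduces the printed value, namely the logarithm of the number of events (48), with one parameter in the model.

```latex
\begin{align*}
\text{Likelihood ratio } \chi^2 &= 309.716 - 301.959 = 7.757 \\
\text{AIC} &= 301.959 + 2(1) = 303.959 \\
\text{SBC} &= 301.959 + 1 \cdot \ln(48) \approx 301.959 + 3.871 = 305.830 \\
\text{Wald } \chi^2 &= \left(\frac{1.74595}{0.60460}\right)^{2} \approx 8.339 \\
\text{Hazard ratio} &= e^{1.74595} \approx 5.731
\end{align*}
```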
The next step consists of selecting another variable to add to the model. Output 66.1.3 displays the chi-square statistics and p-values of individual score tests (adjusted for LogBUN) for the remaining eight variables. The score chi-square for a given variable is the value of the likelihood score test for testing the significance of the variable in the presence of LogBUN. The variable HGB is selected because it has the highest chi-square value (4.3468), and it is significant (p=0.0371) at the SLENTRY=0.25 level.
Analysis of Effects Eligible for Entry | |||
---|---|---|---|
Effect | DF | Score Chi-Square | Pr > ChiSq |
HGB | 1 | 4.3468 | 0.0371 |
Platelet | 1 | 2.0183 | 0.1554 |
Age | 1 | 0.7159 | 0.3975 |
LogWBC | 1 | 0.0704 | 0.7908 |
Frac | 1 | 1.0354 | 0.3089 |
LogPBM | 1 | 1.0334 | 0.3094 |
Protein | 1 | 0.5214 | 0.4703 |
SCalc | 1 | 1.4150 | 0.2342 |
Residual Chi-Square Test | ||
---|---|---|
Chi-Square | DF | Pr > ChiSq |
9.3164 | 8 | 0.3163 |
Output 66.1.4 displays the fitted model containing both LogBUN and HGB. Based on the Wald statistics (p=0.0062 for LogBUN and p=0.0385 for HGB, both below the SLSTAY=0.15 level), neither LogBUN nor HGB is removed from the model.
Convergence Status |
---|
Convergence criterion (GCONV=1E-8) satisfied. |
Model Fit Statistics | ||
---|---|---|
Criterion | Without Covariates | With Covariates |
-2 LOG L | 309.716 | 297.767 |
AIC | 309.716 | 301.767 |
SBC | 309.716 | 305.509 |
Testing Global Null Hypothesis: BETA=0 | |||
---|---|---|---|
Test | Chi-Square | DF | Pr > ChiSq |
Likelihood Ratio | 11.9493 | 2 | 0.0025 |
Score | 12.7252 | 2 | 0.0017 |
Wald | 12.1900 | 2 | 0.0023 |
Analysis of Maximum Likelihood Estimates | ||||||
---|---|---|---|---|---|---|
Parameter | DF | Parameter Estimate | Standard Error | Chi-Square | Pr > ChiSq | Hazard Ratio |
LogBUN | 1 | 1.67440 | 0.61209 | 7.4833 | 0.0062 | 5.336 |
HGB | 1 | -0.11899 | 0.05751 | 4.2811 | 0.0385 | 0.888 |
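The hazard ratios in Output 66.1.4 are the exponentiated parameter estimates, which gives a direct reading of each effect. The short calculation below is a sketch from the printed estimates. The percentage interpretation for HGB is per one-unit increase, with the other variable held fixed.

```latex
\begin{align*}
\widehat{\mathrm{HR}}_{\mathrm{LogBUN}} &= e^{1.67440} \approx 5.336 \\
\widehat{\mathrm{HR}}_{\mathrm{HGB}} &= e^{-0.11899} \approx 0.888 \\
1 - 0.888 &= 0.112 \quad \text{(about an 11\% lower hazard per one-unit increase in HGB)}
\end{align*}
```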
Output 66.1.5 shows Step 3 of the selection process, in which the variable SCalc is added, resulting in the model with LogBUN, HGB, and SCalc as the explanatory variables. Note that SCalc has the smallest Wald chi-square statistic, and it is not significant (p=0.1782) at the SLSTAY=0.15 level.
Convergence Status |
---|
Convergence criterion (GCONV=1E-8) satisfied. |
Model Fit Statistics | ||
---|---|---|
Criterion | Without Covariates | With Covariates |
-2 LOG L | 309.716 | 296.078 |
AIC | 309.716 | 302.078 |
SBC | 309.716 | 307.692 |
Testing Global Null Hypothesis: BETA=0 | |||
---|---|---|---|
Test | Chi-Square | DF | Pr > ChiSq |
Likelihood Ratio | 13.6377 | 3 | 0.0034 |
Score | 15.3053 | 3 | 0.0016 |
Wald | 14.4542 | 3 | 0.0023 |
Analysis of Maximum Likelihood Estimates | ||||||
---|---|---|---|---|---|---|
Parameter | DF | Parameter Estimate | Standard Error | Chi-Square | Pr > ChiSq | Hazard Ratio |
LogBUN | 1 | 1.63593 | 0.62359 | 6.8822 | 0.0087 | 5.134 |
HGB | 1 | -0.12643 | 0.05868 | 4.6419 | 0.0312 | 0.881 |
SCalc | 1 | 0.13286 | 0.09868 | 1.8127 | 0.1782 | 1.142 |
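The case against keeping SCalc can be read off the printed values. The arithmetic below is a sketch using only numbers from Output 66.1.4 and Output 66.1.5: the Wald statistic for SCalc, its p-value relative to the SLSTAY threshold, and the small improvement in -2 LOG L that SCalc buys. AIC also rises from 301.767 to 302.078 when SCalc is added.

```latex
\begin{align*}
\text{Wald } \chi^2_{\mathrm{SCalc}} &= \left(\frac{0.13286}{0.09868}\right)^{2} \approx 1.813 \\
p &= 0.1782 > 0.15 = \text{SLSTAY} \\
\text{Change in } {-2}\log L &= 297.767 - 296.078 = 1.689
\end{align*}
```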
The variable SCalc is then removed from the model in a step-down phase in Step 4 (Output 66.1.6). The removal of SCalc brings the stepwise selection process to a stop in order to avoid repeatedly entering and removing the same variable.
Convergence Status |
---|
Convergence criterion (GCONV=1E-8) satisfied. |
Model Fit Statistics | ||
---|---|---|
Criterion | Without Covariates | With Covariates |
-2 LOG L | 309.716 | 297.767 |
AIC | 309.716 | 301.767 |
SBC | 309.716 | 305.509 |
Testing Global Null Hypothesis: BETA=0 | |||
---|---|---|---|
Test | Chi-Square | DF | Pr > ChiSq |
Likelihood Ratio | 11.9493 | 2 | 0.0025 |
Score | 12.7252 | 2 | 0.0017 |
Wald | 12.1900 | 2 | 0.0023 |
Analysis of Maximum Likelihood Estimates | ||||||
---|---|---|---|---|---|---|
Parameter | DF | Parameter Estimate | Standard Error | Chi-Square | Pr > ChiSq | Hazard Ratio |
LogBUN | 1 | 1.67440 | 0.61209 | 7.4833 | 0.0062 | 5.336 |
HGB | 1 | -0.11899 | 0.05751 | 4.2811 | 0.0385 | 0.888 |
Note: | Model building terminates because the effect to be entered is the effect that was removed in the last step. |
The procedure also displays a summary table of the steps in the stepwise selection process, as shown in Output 66.1.7.
Summary of Stepwise Selection | |||||||
---|---|---|---|---|---|---|---|
Step | Effect Entered | Effect Removed | DF | Number In | Score Chi-Square | Wald Chi-Square | Pr > ChiSq |
1 | LogBUN | | 1 | 1 | 8.5164 | | 0.0035 |
2 | HGB | | 1 | 2 | 4.3468 | | 0.0371 |
3 | SCalc | | 1 | 3 | 1.8225 | | 0.1770 |
4 | | SCalc | 1 | 2 | | 1.8127 | 0.1782 |
The stepwise selection process results in a model with two explanatory variables, LogBUN and HGB.
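Once the selection has settled on LogBUN and HGB, the final model can be refit on its own to obtain additional summaries. The following statements are a sketch rather than part of the original example. The RISKLIMITS option requests confidence limits for the hazard ratios, and the HAZARDRATIO statement, with a label and a UNITS= value chosen here purely for illustration, reports the hazard ratio for a two-unit increase in HGB.

```sas
/* Illustrative refit of the selected model (not part of the       */
/* original example). The label and UNITS=2 are arbitrary choices. */
proc phreg data=Myeloma;
   model Time*VStatus(0)=LogBUN HGB / risklimits;
   hazardratio 'HGB, 2-unit increase' HGB / units=2;
run;
```

The parameter estimates from this fit should agree with those shown for the two-variable model in Output 66.1.4 and Output 66.1.6.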