The HPLOGISTIC Procedure

SELECTION Statement

  • SELECTION <options>;

The SELECTION statement performs model selection by examining whether effects should be added to or removed from the model according to rules that are defined by model selection methods. The statement is fully documented in the section SELECTION Statement in Chapter 4: Shared Statistical Concepts.

The HPLOGISTIC procedure supports the following effect-selection methods in the SELECTION statement:

METHOD=NONE

results in no model selection. This method fits the full model.

METHOD=FORWARD

performs forward selection. This method starts with no effects in the model and adds effects.

METHOD=BACKWARD

performs backward elimination. This method starts with all effects in the model and deletes effects.

METHOD=BACKWARD(FAST)

performs fast backward elimination when SELECT=SL. This method starts with all effects in the model and deletes effects without refitting the model.

METHOD=STEPWISE

performs stepwise regression. This method is similar to the FORWARD method, except that effects already in the model do not necessarily stay there: an effect that entered at an earlier step can be removed at a later step.
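To make the idea of forward selection concrete, the following Python sketch shows a generic greedy loop. It is an illustration only, not the HPLOGISTIC implementation: the function name forward_select, the toy_scores lookup table, and the effect names are all hypothetical, and the stopping rule (stop when no single addition lowers the criterion) is a simplifying assumption rather than the procedure's documented behavior.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def forward_select(
    candidates: Iterable[str],
    criterion: Callable[[FrozenSet[str]], float],
) -> Tuple[List[str], float]:
    """Greedy forward selection; smaller criterion values are better.

    `criterion` is a hypothetical callback that scores a set of effects.
    """
    selected: List[str] = []
    remaining = list(candidates)
    best = criterion(frozenset())  # score of the model with no effects
    while remaining:
        # Score every model obtained by adding exactly one remaining effect.
        scored = [(criterion(frozenset(selected + [e])), e) for e in remaining]
        score, effect = min(scored)
        if score >= best:
            break  # assumed stopping rule: no addition improves the criterion
        selected.append(effect)
        remaining.remove(effect)
        best = score
    return selected, best


# Made-up criterion values for illustration (not real fit statistics).
toy_scores = {
    frozenset(): 100.0,
    frozenset({"age"}): 90.0,
    frozenset({"dose"}): 95.0,
    frozenset({"sex"}): 99.0,
    frozenset({"age", "dose"}): 85.0,
    frozenset({"age", "sex"}): 91.0,
    frozenset({"age", "dose", "sex"}): 86.0,
}

selected, best = forward_select(["age", "dose", "sex"], toy_scores.__getitem__)
```

In this toy run, the loop adds age, then dose, and stops because adding sex would raise the criterion. Backward elimination follows the same pattern in reverse: it starts from the full set and removes one effect per step.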

The default criterion for the SELECT=, CHOOSE=, and STOP= options in the SELECTION statement is the significance level (SL), where effects enter and leave the model based on the significance level of an approximate chi-square test statistic. You can specify the following criteria in the SELECT=, CHOOSE=, and STOP= options:

AIC

uses Akaike’s information criterion (Akaike, 1974)

AICC

uses a small-sample bias-corrected version of Akaike’s information criterion, as promoted in Hurvich and Tsai (1989) and Burnham and Anderson (1998), for example

BIC | SBC

uses Schwarz’s Bayesian criterion (Schwarz, 1978)

SL

uses the significance level of the score test as the criterion (not available for a CHOOSE= option)

VALIDATE

uses the average square error (ASE) that is computed on the VALIDATE partition as the criterion (not available for a SELECT= option)

For more information, see the section Information Criteria. If you specify the PARTITION statement, then the AIC, AICC, BIC, and SL statistics are computed on the training data set; otherwise, they are computed on the full data set.
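As a rough guide to what these criteria measure, the following Python sketch computes them in their common textbook forms. These formulas are assumptions stated for illustration; the exact definitions that the procedure uses, including any small-sample special cases, are given in the section Information Criteria. The function names and the sample numbers are hypothetical.

```python
import math
from typing import Sequence


def aic(neg2_loglik: float, p: int) -> float:
    """Akaike's information criterion: -2 log L + 2p."""
    return neg2_loglik + 2 * p


def aicc(neg2_loglik: float, p: int, n: int) -> float:
    """Small-sample corrected AIC: -2 log L + 2pn / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("textbook AICC requires n > p + 1")
    return neg2_loglik + 2 * p * n / (n - p - 1)


def bic(neg2_loglik: float, p: int, n: int) -> float:
    """Schwarz Bayesian criterion: -2 log L + p log(n)."""
    return neg2_loglik + p * math.log(n)


def average_square_error(y: Sequence[float], p_hat: Sequence[float]) -> float:
    """Mean squared difference between 0/1 responses and predicted probabilities."""
    if len(y) != len(p_hat) or not y:
        raise ValueError("y and p_hat must be non-empty and of equal length")
    return sum((yi - pi) ** 2 for yi, pi in zip(y, p_hat)) / len(y)


# Hypothetical values: -2 log L = 120, 3 parameters, 50 observations.
example_aic = aic(120.0, 3)
example_aicc = aicc(120.0, 3, 50)
example_bic = bic(120.0, 3, 50)
# Hypothetical validation responses and predicted event probabilities.
example_ase = average_square_error([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4])
```

Smaller values are better for all four quantities. The three information criteria differ only in how heavily they penalize the number of parameters p relative to the sample size n, whereas the average square error is computed on held-out data.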

Note: If you use the fast backward elimination method, the –2 log likelihood, AIC, AICC, and BIC statistics are approximated at each step at which the model is not refit, and therefore do not match the values that are computed when that model is fit outside of the selection routine. Similarly, if you specify SELECT=AIC, SELECT=AICC, or SELECT=BIC, the selection criteria are estimated (Lawless and Singhal, 1978) and therefore do not match the values that are computed when that model is fit outside of the selection routine.

When you specify the DETAILS= option in the SELECTION statement, the HPLOGISTIC procedure produces the following:

DETAILS=SUMMARY

produces a summary table that shows the effect that is added or removed at each step along with the p-value and the SELECT=, CHOOSE=, and STOP= criteria. The summary table is produced by default if the DETAILS= option is not specified.

DETAILS=STEPS

produces a detailed listing of all candidates at each step and their ranking in terms of the selection criterion for entry into or removal from the model.

DETAILS=ALL

produces the preceding two tables and a table of selection details, which displays fit statistics for the model at each step of the selection process and an approximate chi-square score or likelihood ratio statistic.