The HPLOGISTIC Procedure

SELECTION Statement

SELECTION <options> ;

The SELECTION statement performs model selection by examining whether effects should be added to or removed from the model according to rules defined by model selection methods. The statement is fully documented in the section SELECTION Statement in Chapter 4: Shared Statistical Concepts.

The HPLOGISTIC procedure supports the following effect-selection methods in the SELECTION statement:

METHOD=NONE

results in no model selection. This method fits the full model.

METHOD=FORWARD

performs forward selection. This method starts with no effects in the model and adds effects.

METHOD=BACKWARD

performs backward elimination. This method starts with all effects in the model and deletes effects.

METHOD=BACKWARD(FAST)

performs fast backward elimination. This method starts with all effects in the model and deletes effects without refitting the model.

METHOD=STEPWISE

performs stepwise regression. This method is similar to the FORWARD method except that effects already in the model do not necessarily stay there.

The only effect-selection criterion supported by the HPLOGISTIC procedure is SELECT=SL, where effects enter and leave the model based on an evaluation of the significance level. To determine this level of significance for each candidate effect, the HPLOGISTIC procedure calculates an approximate chi-square score test statistic.
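As a rough illustration of how a chi-square score statistic maps to a significance level, the following Python sketch computes the upper-tail probability for the special case of an effect with one degree of freedom. It is not the procedure's implementation. The function names and the entry threshold `sle` are hypothetical and are introduced only for this example. Effects with more degrees of freedom would need a general chi-square tail function.

```python
import math


def score_test_p_value_1df(chi_square):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    Assumption: the candidate effect contributes a single parameter, so the
    tail probability reduces to erfc(sqrt(x / 2)).
    """
    if chi_square < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(chi_square / 2.0))


def passes_entry(chi_square, sle):
    """Return True when the p-value falls below the (hypothetical) entry threshold."""
    return score_test_p_value_1df(chi_square) < sle
```

For example, a score statistic of about 3.84 corresponds to a significance level of about 0.05, so a one-parameter effect with a larger statistic would fall below a 0.05 threshold in this sketch.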

The default criterion for the CHOOSE= and STOP= options in the SELECTION statement is the significance level of the score test. The following criteria can be specified:

AIC

Akaike’s information criterion (Akaike, 1974)

AICC

a small-sample bias-corrected version of Akaike’s information criterion as promoted in, for example, Hurvich and Tsai (1989) and Burnham and Anderson (1998)

BIC | SBC

Schwarz’ Bayesian criterion (Schwarz, 1978)

SL

the significance level of the score test (STOP= only)

The calculation of the information criteria uses the following formulas, where $p$ denotes the number of effective parameters in the candidate model, $f$ denotes the number of frequencies used, and $l$ is the log likelihood evaluated at the converged estimates:

\begin{align*}
\mathrm{AIC} &= -2l + 2p \\
\mathrm{AICC} &= \begin{cases} -2l + 2pf/(f-p-1) & \text{when } f > p+2 \\ -2l + 2p(p+2) & \text{otherwise} \end{cases} \\
\mathrm{BIC} &= -2l + p\log(f)
\end{align*}
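The formulas above can be sketched in Python as follows. This is a minimal illustration, not the procedure's code. The function name `information_criteria` and its argument names are hypothetical, and the sketch assumes that the logarithm in the BIC formula is the natural logarithm.

```python
import math


def information_criteria(loglik, p, f):
    """Compute AIC, AICC, and BIC from the formulas in this section.

    loglik: log likelihood l at the converged estimates
    p:      number of effective parameters in the candidate model
    f:      number of frequencies used
    """
    minus2l = -2.0 * loglik
    aic = minus2l + 2.0 * p
    if f > p + 2:
        # Small-sample correction for the usual case
        aicc = minus2l + 2.0 * p * f / (f - p - 1)
    else:
        # Fallback branch when too few frequencies are available
        aicc = minus2l + 2.0 * p * (p + 2)
    bic = minus2l + p * math.log(f)  # natural log assumed
    return {"AIC": aic, "AICC": aicc, "BIC": bic}
```

For instance, with a log likelihood of -50, 3 parameters, and 100 frequencies, the sketch gives AIC = 106 and AICC = 106.25. With only 5 frequencies, the condition f > p + 2 fails and the second AICC branch applies.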

Note: If you use the fast backward elimination method, the –2 log likelihood, AIC, AICC, and BIC statistics are approximated at each step where the model is not refit. These approximations therefore do not match the values that are computed when the same model is fit outside of the selection routine.

When you specify the DETAILS= option in the SELECTION statement, the HPLOGISTIC procedure produces the following:

DETAILS=SUMMARY

produces a summary table that shows the effect added or removed at each step along with the p-value. The summary table is produced by default if the DETAILS= option is not specified.

DETAILS=STEPS

produces a detailed listing of all candidates at each step and their ranking in terms of the significance level for entry into or removal from the model.

DETAILS=ALL

produces the preceding two tables and a table of selection details that displays, for the model at each step of the selection process, the fit statistics and the approximate chi-square score statistic.