Model Selection: The GLMSELECT Procedure

PROC GLMSELECT performs model selection in the framework of general linear models. A variety of model selection methods are available, including forward selection, backward elimination, stepwise selection, the LASSO method of Tibshirani (1996), and the related least angle regression (LAR) method of Efron et al. (2004). The GLMSELECT procedure lets you customize the selection through a wide variety of selection and stopping criteria, including significance level–based and validation-based criteria. The procedure also provides graphical summaries of the selection process.
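
As a rough sketch of how a selection method and its criteria are specified together, the following call requests stepwise selection. The data set (Sashelp.Baseball), the response logSalary, the list of regressors, and the particular SELECT= and STOP= settings are assumptions chosen for illustration, not recommendations.

```sas
/* Illustrative sketch only: assumes the Sashelp.Baseball sample data set */
/* is installed and that logSalary is the response of interest.           */
ods graphics on;                          /* needed for the PLOTS= output */

proc glmselect data=sashelp.baseball plots=all;
   class league division;                 /* categorical effects          */
   model logSalary = nAtBat nHits nHome nRuns nRBI nBB
                     yrMajor crAtBat crHits crHome crRuns crRbi crBB
                     league division nOuts nAssts nError
         / selection=stepwise(select=sl   /* enter/remove by significance */
                              stop=sbc);  /* stop when SBC stops improving */
run;
```

Replacing the method keyword with SELECTION=LASSO or SELECTION=LAR requests the corresponding method on the same MODEL statement.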

PROC GLMSELECT compares most closely with PROC REG and PROC GLM. PROC REG supports a variety of model selection methods but does not provide a CLASS statement. PROC GLM provides a CLASS statement but does not provide model selection methods. PROC GLMSELECT fills this gap by offering both. It focuses on the standard general linear model for univariate responses with independently and identically distributed errors. PROC GLMSELECT also produces results (tables, output data sets, and macro variables) that make it easy to explore the selected model in more detail in a subsequent procedure such as PROC REG or PROC GLM.
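
The hand-off to a subsequent procedure might look like the following sketch. It assumes that PROC GLMSELECT stores the list of selected effects in the macro variable _GLSIND. The data set, response, and candidate effects are again illustrative assumptions.

```sas
/* Step 1 (illustrative): select a model. Assumes Sashelp.Baseball exists. */
proc glmselect data=sashelp.baseball;
   class league division;
   model logSalary = nHits nRuns nRBI nBB yrMajor crHits league division
         / selection=stepwise(select=sl stop=sbc);
run;

/* Show which effects were selected (assumes _GLSIND was set in step 1).  */
%put Selected effects: &_GLSIND;

/* Step 2: refit the selected model in PROC GLM for further analysis.     */
/* The CLASS statement is repeated so that any selected class effects are */
/* treated as categorical here as well.                                   */
proc glm data=sashelp.baseball;
   class league division;
   model logSalary = &_GLSIND / solution;
run;
quit;
```

PROC GLM is used for the refit because it accepts a CLASS statement. Handing the model to PROC REG would instead require numeric design columns, since PROC REG has no CLASS statement.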