The main features of the HPREG procedure are as follows:
Model specification
supports GLM and reference parameterization for classification effects
supports any degree of interaction (crossed effects) and nested effects
supports hierarchy among effects
supports partitioning of data into training, validation, and testing roles
supports a FREQ statement for grouped analysis
supports a WEIGHT statement for weighted analysis
Selection control
provides multiple effect-selection methods
enables selection from a very large number of effects (tens of thousands)
offers selection of individual levels of classification effects
provides effect selection based on a variety of selection criteria
provides stopping rules based on a variety of model evaluation criteria
supports stopping and selection rules based on external validation and leave-one-out cross validation
Display and output
produces output data sets that contain predicted values, residuals, studentized residuals, confidence limits, and influence statistics
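To make the output quantities listed above concrete, the following Python sketch computes predicted values, residuals, leverages, and studentized residuals for a simple one-regressor fit. It assumes the usual textbook definitions (residual divided by its estimated standard error). It is not PROC HPREG code, and the function name `simple_regression_diagnostics` is hypothetical.

```python
import math

def simple_regression_diagnostics(x, y):
    """Fit y = b0 + b1*x by least squares and return per-observation diagnostics.

    Illustrative only: assumes at least three observations, x values that are
    not all equal, and a fit that is not exact (mean squared error > 0).
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = ybar - slope * xbar

    predicted = [intercept + slope * xi for xi in x]
    residual = [yi - pi for yi, pi in zip(y, predicted)]

    # Error variance estimate: residual sum of squares over n - 2 degrees of freedom.
    mse = sum(e * e for e in residual) / (n - 2)

    # Leverage (diagonal of the hat matrix) for a model with an intercept and one regressor.
    leverage = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]

    # Studentized residual: residual divided by its estimated standard error.
    student = [e / math.sqrt(mse * (1.0 - h)) for e, h in zip(residual, leverage)]

    return {
        "predicted": predicted,
        "residual": residual,
        "leverage": leverage,
        "student": student,
    }
```

Calling `simple_regression_diagnostics([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])` returns one value per observation for each statistic. The leverages sum to the number of fitted parameters (two here), which is a quick sanity check on the computation.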
The HPREG procedure supports the following effect selection methods. For a more detailed description of these methods, see the section Methods in Chapter 3: Shared Statistical Concepts.
The forward selection method starts with no effects in the model and adds effects.
The backward elimination method starts with all effects in the model and deletes effects.
The stepwise regression method is similar to the forward selection method except that effects already in the model do not necessarily stay there; an effect that was added earlier can be removed at a later step.
The forward swap selection method is a modification of forward selection in which, before any addition step, all pairwise swaps of effects in and out of the current model that improve the selection criterion are made. When the selection criterion is R-square, this method coincides with the MAXR method in the REG procedure in SAS/STAT software.
The least angle regression method, like forward selection, starts with no effects in the model and adds effects. The parameter estimates at any step are “shrunk” when compared to the corresponding least squares estimates.
The lasso method adds and deletes parameters based on a version of ordinary least squares in which the sum of the absolute regression coefficients is constrained. PROC HPREG also supports adaptive lasso selection, in which weights are applied to each of the parameters in forming the lasso constraint.
Hybrid versions of the LAR and LASSO methods are also supported. They use LAR or LASSO to select the model, but then estimate the regression coefficients by ordinary weighted least squares.
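As an illustration of the simplest of these methods, the following Python sketch implements a greedy forward selection: it starts with an intercept-only model and, at each step, adds the effect that most reduces the residual sum of squares. This is a sketch under stated assumptions, not the HPREG algorithm. The helper names (`solve`, `residual_ss`, `forward_select`) and the `min_gain` stopping threshold are hypothetical, and the procedure itself offers a variety of selection and stopping criteria rather than this fixed rule.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting.

    Assumes a is square and non-singular.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def residual_ss(columns, y):
    """Residual sum of squares of the least squares fit of y on an intercept plus columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(rows[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(r[a] * beta[a] for a in range(p)) for r in rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def forward_select(effects, y, min_gain=1e-8):
    """Greedy forward selection.

    effects maps an effect name to its list of values (continuous effects only).
    min_gain is a hypothetical stopping threshold on the reduction in the
    residual sum of squares; it stands in for a real stopping criterion.
    """
    selected = []
    current = residual_ss([], y)  # intercept-only model
    while len(selected) < len(effects):
        candidates = [name for name in effects if name not in selected]
        scored = [
            (residual_ss([effects[k] for k in selected + [name]], y), name)
            for name in candidates
        ]
        best_ss, best_name = min(scored)
        if current - best_ss <= min_gain:
            break  # no candidate improves the fit enough
        selected.append(best_name)
        current = best_ss
    return selected
```

For example, with `effects = {"x1": [1, 2, 3, 4, 5, 6], "x2": [1, 0, 1, 0, 1, 0], "x3": [0, 1, 0, 0, 1, 1]}` and `y = [5, 2, 9, 11, 8, 10]` (constructed as 3 + 2*x1 - 5*x3), the sketch adds `x1`, then `x3`, and stops without adding `x2`. Backward elimination reverses the loop (start with every effect and drop the least useful one), and stepwise selection interleaves the two kinds of step.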
Because the HPREG procedure is a high-performance analytical procedure, it also does the following:
enables you to run in distributed mode on a cluster of machines that distribute the data and the computations
enables you to run in single-machine mode on the server where SAS is installed
exploits all the available cores and concurrent threads, regardless of execution mode
For more information, see the section Processing Modes in Chapter 2: Shared Concepts and Topics.