The HPREG Procedure

PROC HPREG Features

The main features of the HPREG procedure are as follows:

  • Model specification

    • supports GLM and reference parameterization for classification effects

    • supports any degree of interaction (crossed effects) and nested effects

    • supports hierarchy among effects

    • supports partitioning of data into training, validation, and testing roles

    • supports a FREQ statement for grouped analysis

    • supports a WEIGHT statement for weighted analysis

  • Selection control

    • provides multiple effect-selection methods

    • enables selection from a very large number of effects (tens of thousands)

    • offers selection of individual levels of classification effects

    • provides effect selection based on a variety of selection criteria

    • provides stopping rules based on a variety of model evaluation criteria

    • supports stopping and selection rules based on external validation and leave-one-out cross validation

  • Display and output

    • produces output data sets that contain predicted values, residuals, studentized residuals, confidence limits, and influence statistics
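Several of the features listed above can be combined in a single step. The following is a minimal sketch, not a definitive program: the data set `work.sales`, its variables (`y`, `x1`-`x3`, `region`, `product`, `w`), and the output names are hypothetical, and the option spellings are written from memory of the PROC HPREG syntax, so check them against the syntax section before relying on them.

```sas
/* Hypothetical training data: simulated only so that the step can run. */
data work.sales;
   call streaminit(2718);
   do i = 1 to 1000;
      region  = ceil(3 * rand('uniform'));   /* 3-level classification variable */
      product = ceil(4 * rand('uniform'));   /* 4-level classification variable */
      x1 = rand('normal');
      x2 = rand('normal');
      x3 = rand('normal');
      w  = 1 + rand('uniform');              /* positive observation weight     */
      y  = 2 + 1.5*x1 - x2 + 0.5*region + rand('normal');
      output;
   end;
   drop i;
run;

proc hpreg data=work.sales;
   class region product / param=glm;            /* GLM parameterization          */
   model y = x1 x2 x3 region product
             region*product                     /* crossed (interaction) effect  */
             x1*x2;
   weight w;                                    /* weighted analysis             */
   partition fraction(validate=0.3 test=0.2);   /* remaining rows are training   */
   selection method=forward(choose=validate);   /* choose step by validation fit */
   output out=work.sales_pred predicted=p_y residual=r_y;
run;
```

In this sketch the PARTITION statement holds out roughly 30% of the rows for validation and 20% for testing, and the CHOOSE=VALIDATE suboption is assumed to pick the step of the forward sequence that performs best on the validation rows.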

The HPREG procedure supports the following effect-selection methods. For a more detailed description of these methods, see the section Methods in Chapter 4: Shared Statistical Concepts.

  • Forward selection starts with no effects in the model and adds effects.

  • Backward elimination starts with all effects in the model and deletes effects.

  • Stepwise regression is similar to forward selection except that effects already in the model do not necessarily stay there.

  • Forward-swap selection is a modification of forward selection. Before any addition step, PROC HPREG makes all pairwise swaps of effects in and out of the current model that improve the selection criterion. When the selection criterion is R-square, this method coincides with the MAXR method in the REG procedure in SAS/STAT software.

  • Least angle regression, like forward selection, starts with no effects in the model and adds effects. The parameter estimates at any step are shrunk toward zero when compared to the corresponding least squares estimates.

  • Lasso adds and deletes parameters based on a version of ordinary least squares in which the sum of the absolute regression coefficients is constrained. PROC HPREG also supports adaptive lasso selection, in which weights are applied to each of the parameters in forming the lasso constraint.
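Each of the methods in the preceding list is requested through the METHOD= option of the SELECTION statement. The sketch below assumes a hypothetical data set `work.sales` that contains a numeric response `y`, numeric regressors `x1`-`x3`, and a classification variable `region`. The macro variable `method` is an illustration device, not part of the procedure, and the keyword spellings should be confirmed in the SELECTION statement syntax.

```sas
/* Set to one of: forward | backward | stepwise | forwardswap | lar | lasso */
%let method = stepwise;

proc hpreg data=work.sales;         /* work.sales is a hypothetical data set */
   class region;
   model y = x1 x2 x3 region x1*x2;
   selection method=&method;
   /* Adaptive lasso is assumed to be requested as a suboption instead:     */
   /*    selection method=lasso(adaptive);                                  */
run;
```

Changing only the value of `method` and rerunning the step makes it easy to compare which effects each method retains for the same candidate model.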

Hybrid versions of the LAR and LASSO methods are also supported. They use LAR or LASSO to select the model but then estimate the regression coefficients by ordinary weighted least squares.
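As an illustration of the hybrid approach, the following sketch assumes that the hybrid version is requested with the LSCOEFFS suboption of METHOD=LAR or METHOD=LASSO, as in other SAS selection procedures. Verify that spelling in the SELECTION statement syntax. The data set `work.sales` and its variables are hypothetical.

```sas
proc hpreg data=work.sales;          /* hypothetical data set and variables  */
   model y = x1 x2 x3;
   /* LAR determines the sequence of models; the coefficients at each step  */
   /* are then re-estimated by least squares rather than left shrunk.       */
   selection method=lar(lscoeffs);
run;
```

The hybrid fit keeps the effects that LAR selects but reports coefficients that are not shrunk, which can be useful when the selected model is meant to be interpreted as an ordinary regression.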

Because the HPREG procedure is a high-performance analytical procedure, it also does the following:

  • enables you to run in distributed mode on a cluster of machines that distribute the data and the computations

  • enables you to run in single-machine mode on the server where SAS is installed

  • exploits all the available cores and concurrent threads, regardless of execution mode

For more information, see the section Processing Modes in Chapter 3: Shared Concepts and Topics.
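The execution mode is controlled through the PERFORMANCE statement. The following is a minimal sketch for single-machine mode that assumes a hypothetical data set `work.sales`; the thread count is an arbitrary illustrative value.

```sas
proc hpreg data=work.sales;          /* hypothetical data set and variables  */
   model y = x1 x2 x3;
   /* Single-machine mode with an explicit thread count. DETAILS requests   */
   /* a timing breakdown in the displayed output.                           */
   performance nthreads=4 details;
   /* For distributed mode, a NODES= value would be supplied instead, for   */
   /* example: performance nodes=4 details;                                 */
   /* That form also requires a configured grid host, not shown here.       */
run;
```

When the PERFORMANCE statement is omitted, the procedure chooses the mode and the number of threads from the environment in which it runs.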