HPPLS


The HPPLS procedure is a high-performance version of the PLS procedure in SAS/STAT software, which fits models by using any one of a number of linear predictive methods, including partial least squares (PLS). Ordinary least squares regression, as implemented in SAS/STAT procedures such as the GLM and REG procedures, has the single goal of minimizing sample response prediction error, and it seeks linear functions of the predictors (factors) that explain as much variation in each response as possible. The HPPLS procedure implements techniques that have the additional goal of accounting for variation in the predictors, under the assumption that directions in the predictor space that are well sampled should provide better prediction for new observations when the predictors are highly correlated. PROC HPPLS runs in either single-machine or distributed mode. The procedure does the following:

  • performs principal components regression, which extracts factors to explain as much predictor sample variation as possible
  • performs reduced rank regression, which extracts factors to explain as much response variation as possible
  • performs partial least squares regression, which balances the two objectives of explaining response variation and explaining predictor variation, by using either of two methods:
    • the original predictive method of Wold (1966)
    • the SIMPLS method of de Jong (1993)
  • allows the choice of the number of extracted factors by cross validation
  • offers the general linear modeling approach of the GLM procedure to specify a model for a design, allowing for general polynomial effects as well as classification or ANOVA effects
  • partitions observations in the input data set into disjoint subsets for model training and testing
  • supports BY group processing, which allows separate analyses on grouped observations
  • creates an output data set to receive quantities that can be computed for every input observation, such as extracted factors and predicted values
  • lets you specify performance options for multithreaded and distributed computing
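
The difference between the factor extraction goals listed above can be illustrated with a small sketch. It is written in plain Python rather than SAS, it is not how PROC HPPLS is implemented, and every function name and the four-row toy data set are assumptions made for illustration. It compares the first principal components regression factor (the direction of greatest predictor variation, found here by power iteration) with the first partial least squares factor for a single response (the direction proportional to X'y, which maximizes covariance with the response).

```python
import math

def center_columns(X):
    """Subtract each column mean so that variation is measured about zero."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in X]

def normalize(v):
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def x_times(X, w):
    """Scores t = X w, one value per observation."""
    return [sum(a * b for a, b in zip(row, w)) for row in X]

def xt_times(X, u):
    """X' u, one value per predictor."""
    return [sum(row[j] * u_i for row, u_i in zip(X, u)) for j in range(len(X[0]))]

def pcr_direction(X, iterations=200):
    """First principal component: leading eigenvector of X'X by power iteration."""
    w = normalize([1.0] * len(X[0]))
    for _ in range(iterations):
        w = normalize(xt_times(X, x_times(X, w)))
    return w

def pls_direction(X, y):
    """First PLS weight vector for a single response: proportional to X'y."""
    return normalize(xt_times(X, y))

def summarize(X, y, w):
    """Return (sum of squares of the scores, squared correlation of scores with y)."""
    t = x_times(X, w)
    tt = sum(v * v for v in t)
    yy = sum(v * v for v in y)
    ty = sum(a * b for a, b in zip(t, y))
    return tt, (ty * ty) / (tt * yy)

# Hypothetical toy data: x1 varies widely, x2 varies little,
# and the response depends mostly on x2.
X_raw = [[-3.0, -1.0], [-3.0, 1.0], [3.0, -1.0], [3.0, 1.0]]
y_raw = [0.2 * x1 + x2 for x1, x2 in X_raw]

X = center_columns(X_raw)
y_mean = sum(y_raw) / len(y_raw)
y = [v - y_mean for v in y_raw]

pcr_w = pcr_direction(X)
pls_w = pls_direction(X, y)
pcr_ss, pcr_r2 = summarize(X, y, pcr_w)
pls_ss, pls_r2 = summarize(X, y, pls_w)

print("PCR factor: predictor sum of squares", pcr_ss, "squared correlation with y", pcr_r2)
print("PLS factor: predictor sum of squares", pls_ss, "squared correlation with y", pls_r2)
```

On this toy data the principal components factor captures more predictor variation, while the partial least squares factor gives up some of that variation in exchange for a stronger relationship with the response. That is the trade-off the list describes as balancing the two objectives.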

For further details, see the SAS/STAT User's Guide: The HPPLS Procedure.

Examples