SAS/STAT Software

PLS Procedure

The PLS procedure fits models by using any one of a number of linear predictive methods including partial least squares (PLS). Ordinary least squares regression, as implemented in SAS/STAT procedures such as PROC GLM and PROC REG, has the single goal of minimizing sample response prediction error, seeking linear functions of the predictors that explain as much variation in each response as possible. The techniques implemented in the PLS procedure have the additional goal of accounting for variation in the predictors, under the assumption that directions in the predictor space that are well sampled should provide better prediction for new observations when the predictors are highly correlated.

All of the techniques implemented in the PLS procedure work by extracting successive linear combinations of the predictors, called factors (also called components, latent vectors, or latent variables), which optimally address one or both of these two goals: explaining response variation and explaining predictor variation. In particular, the method of partial least squares balances the two objectives, seeking factors that explain both response and predictor variation.

The following are highlights of the PLS procedure's features:

  • implements the following techniques:
    • principal components regression, which extracts factors to explain as much predictor sample variation as possible
    • reduced rank regression, which extracts factors to explain as much response variation as possible. This technique, also known as (maximum) redundancy analysis, differs from multivariate linear regression only when there are multiple responses.
    • partial least squares regression, which balances the two objectives of explaining response variation and explaining predictor variation. Two different formulations for partial least squares are available: the original predictive method of Wold (1966) and the SIMPLS method of de Jong (1993).
  • enables you to choose the number of extracted factors by cross validation
  • enables you to use the general linear modeling approach of the GLM procedure to specify a model for your design, allowing for general polynomial effects as well as classification or ANOVA effects
  • enables you to save the fitted model in a data set and apply it to new data by using the SCORE procedure
  • performs BY group processing, which enables you to obtain separate analyses on grouped observations
  • creates an output data set to receive quantities that can be computed for every input observation, such as extracted factors and predicted values
  • automatically creates graphs by using ODS Graphics
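
The idea of extracting successive factors can be illustrated outside SAS. The following is a minimal Python sketch, not PROC PLS code and not the procedure's implementation: it assumes a single response and uses a textbook NIPALS-style iteration in which each factor is the linear combination of the (residual) predictors that has the largest covariance with the (residual) response. The function name pls1_fit and the small data set are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pls1_fit(X, y, n_factors):
    """Illustrative single-response PLS fit; returns (fitted values, factor scores)."""
    n, p = len(X), len(X[0])

    # Center the predictors and the response.
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_means[j] for j in range(p)] for row in X]  # predictor residuals
    f = [v - y_mean for v in y]                                  # response residuals

    fitted = [y_mean] * n
    scores = []
    for _ in range(n_factors):
        # Weights: direction in predictor space with maximal covariance with the response.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # no remaining covariance to explain
        w = [v / norm for v in w]

        # The factor (score vector) is a linear combination of the predictors.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)

        # Loadings: regress the predictors and the response on the factor.
        x_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        y_load = sum(f[i] * t[i] for i in range(n)) / tt

        # Deflate: remove what this factor explains before extracting the next one.
        E = [[E[i][j] - t[i] * x_load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - y_load * t[i] for i in range(n)]

        fitted = [fitted[i] + y_load * t[i] for i in range(n)]
        scores.append(t)
    return fitted, scores


if __name__ == "__main__":
    # Hypothetical data: two correlated predictors, y = 1 + 2*x1 - x2 exactly.
    X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
    y = [1, 4, 3, 6, 5]
    for k in (1, 2):
        fitted, _ = pls1_fit(X, y, k)
        sse = sum((a - b) ** 2 for a, b in zip(fitted, y))
        print(k, "factor(s): residual sum of squares =", sse)
```

In this sketch, one factor leaves part of the response unexplained, while using as many factors as there are linearly independent predictors reproduces the ordinary least squares fit. Choosing a number of factors between these extremes is the decision that cross validation is meant to support.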

For further details, see the PLS Procedure chapter in the SAS/STAT User's Guide.