SAS/STAT Software

Robust Regression

The main purpose of robust regression is to detect outliers and provide resistant (stable) results in the presence of outliers. In order to achieve this stability, robust regression limits the influence of outliers. Historically, robust regression techniques have addressed three classes of problems:

• problems with outliers in the Y direction (response direction)
• problems with multivariate outliers in the X space (that is, outliers in the covariate space, which are also referred to as leverage points)
• problems with outliers in both the Y direction and the X space

To address problems with outliers, SAS/STAT software provides the QUANTREG and QUANTSELECT procedures for quantile regression. Quantile regression is robust to extreme points in the response direction (outliers), but it is not robust to extreme points in the covariate space (leverage points). When both types of robustness are of concern, consider using the ROBUSTREG procedure, which provides the following four methods:

• M estimation, introduced by Huber (1973), which is the simplest approach both computationally and theoretically. Although it is not robust with respect to leverage points, it is still used extensively in data analysis when contamination can be assumed to be mainly in the response direction.
• Least trimmed squares (LTS) estimation, which is a high breakdown value method that was introduced by Rousseeuw (1984). The breakdown value is a measure of the proportion of contamination that an estimation method can withstand and still maintain its robustness.
• S estimation, which is a high breakdown value method that was introduced by Rousseeuw and Yohai (1984). Given the same breakdown value, S estimation has a higher statistical efficiency than LTS estimation.
• MM estimation, introduced by Yohai (1987), which combines high breakdown value estimation and M estimation. It has the same high breakdown property as S estimation but a higher statistical efficiency.
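As a rough illustration of how M estimation limits the influence of outliers in the response direction, the following Python sketch fits a straight line by iteratively reweighted least squares with Huber weights. It is not SAS code and does not describe how PROC ROBUSTREG is implemented. The function names, the tuning constant c = 1.345, the median-based residual scale, and the toy data are all assumptions chosen for illustration.

```python
from statistics import median


def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def huber_m_fit(x, y, c=1.345, iters=50):
    """Huber M fit by iteratively reweighted least squares (illustrative)."""
    a, b = weighted_line_fit(x, y, [1.0] * len(x))  # start from ordinary least squares
    for _ in range(iters):
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust residual scale: median absolute residual, rescaled (assumed choice).
        scale = median(abs(r) for r in resid) / 0.6745
        if scale == 0:
            break
        cutoff = c * scale
        # Huber weights: full weight for small residuals, reduced weight for large ones.
        w = [1.0 if abs(r) <= cutoff else cutoff / abs(r) for r in resid]
        a, b = weighted_line_fit(x, y, w)
    return a, b


# Toy data (assumed): true line y = 1 + 2x with small deterministic noise.
x = [float(i) for i in range(10)]
noise = [0.1, -0.2, 0.15, -0.05, 0.2, -0.1, 0.05, -0.15, 0.1, -0.1]
y = [1.0 + 2.0 * xi + e for xi, e in zip(x, noise)]
y[7] += 40.0  # one gross outlier in the response direction

ols_intercept, ols_slope = weighted_line_fit(x, y, [1.0] * len(x))
huber_intercept, huber_slope = huber_m_fit(x, y)
print("least squares slope:", ols_slope)
print("Huber M slope:      ", huber_slope)
```

In this toy setting the single response outlier pulls the least squares slope well away from 2, while the Huber fit stays close to it. The sketch offers no protection against leverage points, which is the situation the high breakdown value methods (LTS, S, and MM) are meant to handle.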

QUANTREG Procedure

The QUANTREG procedure uses quantile regression to model the effects of covariates on the conditional quantiles of a response variable. The following are highlights of the QUANTREG procedure's features:

• offers simplex, interior point, and smoothing algorithms for estimation
• provides sparsity, rank, and resampling methods for confidence intervals
• provides asymptotic and bootstrap methods for covariance and correlation matrices of the estimated parameters
• provides the Wald and likelihood ratio tests for the regression parameter estimates
• performs hypothesis tests for the estimable functions, constructs confidence limits, and obtains specific nonlinear transformations
• enables you to construct special collections of columns for design matrices
• provides outlier and leverage-point diagnostics
• supports parallel computing when multiple processors are available
• provides row-wise or column-wise output data sets with multiple quantiles
• provides regression quantile spline fits
• automatically produces fit plots, diagnostic plots, and quantile process plots by using ODS Graphics
• performs BY group processing, which enables you to obtain separate analyses on grouped observations
• performs weighted estimation
• creates an output data set that contains predicted values, residuals, estimated standard errors, and other statistics
• creates an output data set that contains the parameter estimates for all quantiles
• creates a SAS data set that corresponds to any output table
For further details, see QUANTREG Procedure
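
To make the idea of a quantile fit concrete, here is a minimal Python sketch of the check loss that quantile regression minimizes, restricted to an intercept-only model. It is not SAS code, and it does not use the simplex, interior point, or smoothing algorithms that PROC QUANTREG offers. The function names and the toy data are assumptions for illustration.

```python
def check_loss(u, tau):
    """Check (pinball) loss: tau*u if u >= 0, otherwise (tau - 1)*u."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def best_constant(y, tau):
    """Return the observed value that minimizes the total check loss.

    For an intercept-only model the objective is piecewise linear, so a
    minimizer can always be found among the observed values.
    """
    return min(y, key=lambda q: sum(check_loss(yi - q, tau) for yi in y))


# Toy data (assumed): the second sample replaces the largest value with an extreme one.
clean = [2.0, 3.0, 5.0, 7.0, 11.0]
contaminated = [2.0, 3.0, 5.0, 7.0, 1100.0]

median_clean = best_constant(clean, 0.5)
median_contaminated = best_constant(contaminated, 0.5)
mean_clean = sum(clean) / len(clean)
mean_contaminated = sum(contaminated) / len(contaminated)

print("median fit:", median_clean, "->", median_contaminated)
print("mean fit:  ", mean_clean, "->", mean_contaminated)
```

The median fit is unchanged by the extreme response value, whereas the mean moves substantially. This mirrors the robustness of quantile regression in the response direction. The sketch has no covariates, so it says nothing about leverage points.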

QUANTSELECT Procedure

The QUANTSELECT procedure performs effect selection in the framework of quantile regression. A variety of effect selection methods are available, including greedy methods and penalty methods. PROC QUANTSELECT offers extensive capabilities for customizing the effect selection process, with a variety of criteria for selecting candidate effects, stopping the selection, and choosing the final model. It also provides graphical summaries of the effect selection process. The following are highlights of the QUANTSELECT procedure's features:

• supports the following model specifications:
  ◦ interaction (crossed) effects and nested effects
  ◦ constructed effects such as regression splines
  ◦ hierarchy among effects
  ◦ partitioning of data into training, validation, and testing roles
• provides the following selection controls:
  ◦ multiple methods for effect selection
  ◦ selection for quantile process and single quantile levels
  ◦ selection of individual or grouped effects
  ◦ selection based on a variety of selection criteria
  ◦ stopping rules based on a variety of model evaluation criteria
• provides graphical representations of the selection process
• provides output data sets that contain predicted values and residuals
• provides an output data set that contains the parameter estimates from a quantile process regression
• provides an output data set that contains the design matrix
• provides macro variables that contain selected effects
For further details, see QUANTSELECT Procedure

ROBUSTREG Procedure

The ROBUSTREG procedure provides resistant (stable) results for linear regression models in the presence of outliers. The following are highlights of the ROBUSTREG procedure's features:

• provides four estimation methods: M, LTS, S, and MM
• provides 10 weight functions for M estimation
• provides robust R² and deviance for all estimates
• provides asymptotic covariance and confidence intervals for regression parameters with the M, S, and MM methods
• provides robust Wald and F tests for regression parameters with the M and MM methods
• provides outlier and leverage-point diagnostics
• supports parallel computing for S and LTS estimates
• performs BY group processing, which enables you to obtain separate analyses on grouped observations
• performs weighted estimation
• creates a SAS data set that contains the parameter estimates and the estimated covariance matrix
• creates an output SAS data set that contains statistics that are calculated after fitting the model
• creates a SAS data set that corresponds to any output table
• automatically creates fit plots and diagnostic plots by using ODS Graphics
For further details, see ROBUSTREG Procedure