Quantile Regression: The QUANTREG and QUANTSELECT Procedures

PROC QUANTREG models the effects of covariates on the conditional quantiles of a response variable by using quantile regression. PROC QUANTSELECT performs quantile regression with model selection.

Ordinary least squares regression models the relationship between one or more covariates $X$ and the conditional mean of the response variable, $\mathrm{E}[Y \mid X = x]$. Quantile regression extends the regression model to conditional quantiles of the response variable, such as the 90th percentile. Quantile regression is particularly useful when the rate of change in the conditional quantile, expressed by the regression coefficients, depends on the quantile. An advantage of quantile regression over least squares regression is its flexibility in modeling data that have heterogeneous conditional distributions. Data of this type occur in many fields, including biomedicine, econometrics, and ecology.
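
The contrast with least squares can be made concrete with the standard linear quantile regression formulation, sketched here for orientation. The symbols $\tau$ (the quantile level, between 0 and 1), $\beta(\tau)$, $\rho_\tau$, and the observation index $i$ are notation introduced for this illustration and do not appear elsewhere in this overview.

$$
Q_\tau(Y \mid X = x) = x'\beta(\tau), \qquad
\hat{\beta}(\tau) = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau\left(y_i - x_i'\beta\right), \qquad
\rho_\tau(r) = r\,\bigl(\tau - I(r < 0)\bigr)
$$

Here $\rho_\tau$ is the check loss. For $\tau = 0.5$ it reduces to half the absolute residual, so the fit is median regression. Because $\beta(\tau)$ is estimated separately for each $\tau$, the coefficients are free to vary with the quantile level.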

Features of PROC QUANTREG include the following:

  • simplex, interior point, and smoothing algorithms for estimation

  • sparsity, rank, and resampling methods for confidence intervals

  • asymptotic and bootstrap methods to estimate covariance and correlation matrices of the parameter estimates

  • Wald and likelihood ratio tests for the regression parameter estimates

  • regression quantile spline fits
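
A minimal sketch of how several of these features might be requested in one step is shown here. The data set `work.mydata` and the variables `y`, `x1`, and `x2` are hypothetical placeholders, and the particular choice of algorithm and confidence interval method is arbitrary.

```sas
/* Hypothetical input: work.mydata with response y and covariates x1, x2 */
proc quantreg data=work.mydata
              algorithm=interior   /* interior point estimation */
              ci=resampling;       /* resampling-based confidence intervals */
   /* Fit the 10th, 50th, and 90th conditional percentiles */
   model y = x1 x2 / quantile=0.1 0.5 0.9;
   /* Wald and likelihood ratio tests for the x1 coefficient */
   test x1 / wald lr;
run;
```

Fitting several quantile levels in one MODEL statement makes it easy to see whether the coefficient of a covariate changes across the conditional distribution.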

The QUANTSELECT procedure shares most of its syntax and output format with PROC GLMSELECT and PROC QUANTREG. Features of PROC QUANTSELECT include the following:

  • a variety of selection methods, including forward, backward, stepwise, and LASSO

  • a variety of criteria, including AIC, AICC, ADJR1, SBC, significance level, testing ACL, and validation ACL (ACL is the average check loss)

  • effect selection both for individual quantile levels and for the entire quantile process

  • a SAS data set that contains the design matrix

  • macro variables that enable you to easily specify the selected model by using a subsequent PROC QUANTREG step
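
The select-then-refit workflow that the last item describes can be sketched as follows. The data set `work.mydata` and the variables `y` and `x1` through `x10` are hypothetical. The macro variable name `_QRSIND` is given as an assumption from memory of the procedure documentation and should be checked against the log or documentation of your release.

```sas
/* Hypothetical input: work.mydata with response y and candidates x1-x10 */
proc quantselect data=work.mydata;
   /* LASSO selection for the conditional median */
   model y = x1-x10 / quantile=0.5 selection=lasso;
run;

/* Refit the selected model with PROC QUANTREG.                  */
/* Assumption: _QRSIND holds the list of selected effects.       */
proc quantreg data=work.mydata ci=resampling;
   model y = &_QRSIND / quantile=0.5;
run;
```

Refitting in PROC QUANTREG gives access to its inference features, such as confidence intervals and tests, for the selected effects.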