The QUANTREG Procedure

Quantile Regression as an Optimization Problem

The model for linear quantile regression is

$\mb{y} = \bA ^{\prime }\bbeta + \bepsilon$

where $\mb{y}=(y_1,\ldots ,y_ n)^{\prime }$ is the $(n \times 1)$ vector of responses, $\bA ^{\prime }=(\mb{x}_1,\ldots ,\mb{x}_ n)^{\prime }$ is the $(n \times p)$ regressor matrix, $\bbeta = (\beta _1,\ldots ,\beta _ p)^{\prime }$ is the $(p \times 1)$ vector of unknown parameters, and $\bepsilon = (\epsilon _1,\ldots ,\epsilon _ n)^{\prime }$ is the $(n \times 1)$ vector of unknown errors.

$L_1$ regression, also known as median regression, is a natural extension of the sample median when the response is conditioned on the covariates. In $L_1$ regression, the least absolute residuals estimate ${\hat\bbeta }_{\mathit{LAR}}$, referred to as the $L_1$-norm estimate, is obtained as the solution of the following minimization problem:

$\min _{\bbeta \in \mb{R}^ p} \sum _{i=1}^ n | y_ i - \mb{x}_ i^{\prime }\bbeta |$
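The link to the sample median can be checked numerically. The following is a minimal sketch, not the procedure's algorithm: the function name `l1_objective` and the toy data are assumptions made for illustration. With an intercept-only design (every $\mb{x}_ i$ equal to 1), the minimizer of the sum of absolute residuals should coincide with the sample median.

```python
from statistics import median

def l1_objective(beta, X, y):
    """Sum of absolute residuals |y_i - x_i' beta| (hypothetical helper)."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        fitted = sum(b * x for b, x in zip(beta, x_i))
        total += abs(y_i - fitted)
    return total

# Toy data (assumed for illustration); intercept-only design, so p = 1.
y = [2.0, 7.0, 1.0, 9.0, 4.0]
X = [[1.0] for _ in y]

# The objective is convex and piecewise linear in the single parameter,
# so a minimizer can be found among the observed responses.
best = min(y, key=lambda b: l1_objective([b], X, y))

print(best, median(y))
```

The search over observed responses is enough only for this one-parameter case; with $p > 1$ the problem is normally solved by linear programming methods.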

More generally, for quantile regression, Koenker and Bassett (1978) defined the $\tau$ regression quantile, $0<\tau <1$, as any solution to the following minimization problem:

$\min _{\bbeta \in \mb{R}^ p} \left[\sum _{i\in \{ i: y_ i\geq \mb{x}_ i^{\prime }\bbeta \} } \tau |y_ i-\mb{x}_ i^{\prime }\bbeta | + \sum _{i\in \{ i: y_ i< \mb{x}_ i^{\prime }\bbeta \} } (1-\tau ) |y_ i-\mb{x}_ i^{\prime }\bbeta |\right]$
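To make the asymmetric weighting concrete, here is a small sketch that evaluates this objective for a fixed candidate $\bbeta$. The function name `check_loss_objective`, the design matrix, and the candidate coefficients are all assumptions for illustration; the code only evaluates the objective and does not minimize it.

```python
def check_loss_objective(beta, X, y, tau):
    """Evaluate the regression quantile objective at beta (hypothetical helper).

    Residuals with y_i >= x_i' beta are weighted by tau;
    residuals with y_i <  x_i' beta are weighted by (1 - tau).
    """
    total = 0.0
    for x_i, y_i in zip(X, y):
        r = y_i - sum(b * x for b, x in zip(beta, x_i))
        weight = tau if r >= 0 else 1.0 - tau
        total += weight * abs(r)
    return total

# Toy data (assumed): intercept plus one regressor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 2.0, 6.0, 5.0]
beta = [1.0, 1.5]  # candidate coefficients; residuals are 0, -0.5, 2, -0.5

low = check_loss_objective(beta, X, y, 0.25)   # penalizes negative residuals more
high = check_loss_objective(beta, X, y, 0.75)  # penalizes positive residuals more
half = check_loss_objective(beta, X, y, 0.5)   # symmetric case

print(low, high, half)
```

At $\tau = 1/2$ both weights equal $1/2$, so the objective is exactly half the sum of absolute residuals and has the same minimizers as the $L_1$ problem.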

The solution is denoted as $\hat{\bbeta }(\tau )$, and the $L_1$-norm estimate corresponds to $\hat{\bbeta }(1/2)$. The $\tau$ regression quantile is an extension of the $\tau$ sample quantile $\hat\xi (\tau )$, which can be formulated as the solution of

$\min _{\xi \in \mb{R}} \left[ \sum _{i\in \{ i: y_ i\geq \xi \} } \tau |y_ i-\xi | + \sum _{i\in \{ i: y_ i< \xi \} } (1-\tau ) |y_ i-\xi | \right]$
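As a quick check of this formulation, the sketch below finds a sample quantile by minimizing the objective directly. The names `quantile_objective` and `sample_quantile` and the data are assumptions for illustration. Because the objective is convex and piecewise linear in $\xi$ with kinks at the observations, it is sufficient here to search over the observed values.

```python
def quantile_objective(xi, y, tau):
    """Objective for the tau sample quantile at xi (hypothetical helper)."""
    return sum((tau if y_i >= xi else 1.0 - tau) * abs(y_i - xi) for y_i in y)

def sample_quantile(y, tau):
    """Return an observed value that minimizes the objective.

    When the minimizer is not unique, this returns the smallest
    minimizing observation; any other minimizer is equally valid.
    """
    return min(sorted(y), key=lambda xi: quantile_objective(xi, y, tau))

# Toy data (assumed for illustration).
y = [5.0, 1.0, 4.0, 2.0, 3.0]

q25 = sample_quantile(y, 0.25)
q50 = sample_quantile(y, 0.50)
q75 = sample_quantile(y, 0.75)

print(q25, q50, q75)
```

The tie-breaking rule is a choice made by this sketch only; the definition accepts any solution of the minimization problem.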

If you specify weights $w_ i$, $i=1,\ldots ,n$, with the WEIGHT statement, weighted quantile regression is carried out by solving

$\min _{\bbeta _ w \in \mb{R}^ p} \left[\sum _{i\in \{ i: y_ i\geq \mb{x}_ i^{\prime }\bbeta _ w\} } w_ i \tau |y_ i-\mb{x}_ i^{\prime }\bbeta _ w| + \sum _{i\in \{ i: y_ i< \mb{x}_ i^{\prime }\bbeta _ w\} } w_ i (1-\tau ) |y_ i-\mb{x}_ i^{\prime }\bbeta _ w|\right]$
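One way to read the weighted objective is that each term of the unweighted sum is scaled by $w_ i$. The sketch below uses an assumed helper, `weighted_objective`, and assumed toy data. It evaluates the weighted objective and compares it with the unweighted objective on a data set in which observation $i$ is repeated $w_ i$ times. This equivalence holds for the objective with positive integer weights; the sketch makes no claim about how the procedure treats weights elsewhere, for example in inference.

```python
def weighted_objective(beta, X, y, w, tau):
    """Evaluate the weighted regression quantile objective (hypothetical helper)."""
    total = 0.0
    for x_i, y_i, w_i in zip(X, y, w):
        r = y_i - sum(b * x for b, x in zip(beta, x_i))
        total += w_i * (tau if r >= 0 else 1.0 - tau) * abs(r)
    return total

# Toy data (assumed): intercept plus one regressor, integer weights.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 3.0, 2.0]
w = [2, 1, 3]
beta = [0.5, 1.0]  # candidate coefficients; residuals are 0.5, 1.5, -0.5
tau = 0.25

weighted = weighted_objective(beta, X, y, w, tau)

# Repeat each observation w_i times and use unit weights instead.
X_rep = [x_i for x_i, w_i in zip(X, w) for _ in range(w_i)]
y_rep = [y_i for y_i, w_i in zip(y, w) for _ in range(w_i)]
replicated = weighted_objective(beta, X_rep, y_rep, [1] * len(y_rep), tau)

print(weighted, replicated)
```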

Weighted regression quantiles $\bbeta _ w$ can be used for L-estimation (Koenker and Zhao, 1994).