The QUANTREG Procedure

Confidence Interval

The QUANTREG procedure provides three methods to compute confidence intervals for the regression quantile parameter ${\bbeta }(\tau )$: sparsity, rank, and resampling. The sparsity method is the most direct and the fastest, but it involves estimation of the sparsity function, which is not robust for data that are not independently and identically distributed. To deal with this problem, the QUANTREG procedure computes a Huber sandwich estimate by using a local estimate of the sparsity function. The rank method, which computes confidence intervals by inverting the rank score test, does not suffer from this problem, but it uses the simplex algorithm and is computationally expensive for large data sets. The resampling method, which uses the bootstrap approach, addresses these problems, but at a computational cost.

Based on these properties, the QUANTREG procedure uses a combination of the rank and resampling methods by default. For data sets that have more than 5,000 observations or more than 20 variables, the QUANTREG procedure uses the MCMB resampling method; otherwise, it uses the rank method. You can request a particular method by using the CI= option in the PROC QUANTREG statement.
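This default rule can be stated as a one-line predicate. The sketch below is purely illustrative (the function name and return labels are made up; only the thresholds come from the text above):

```python
def default_ci_method(n_obs, n_vars):
    """Illustrative sketch of the default CI-method selection rule."""
    # MCMB resampling for large or wide data sets; rank inversion otherwise
    return "MCMB" if n_obs > 5000 or n_vars > 20 else "rank"
```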

Sparsity

Consider the linear model

\[  y_ i = \mb {x}_ i^{\prime }\bbeta + \epsilon _ i  \]

and assume that $\{ \epsilon _ i\} $, $i=1,\ldots ,n$, are iid with a distribution F and a density $f=F^{\prime }$, where $f(F^{-1}(\tau )) > 0$ in a neighborhood of $\tau $. Under some mild conditions

\[  \sqrt {n}({\hat\bbeta }(\tau ) - \bbeta (\tau )) \rightarrow N(0, \omega ^2(\tau , F) \bOmega ^{-1})  \]

where $\omega ^2(\tau , F) = \tau (1-\tau )\slash f^2(F^{-1}(\tau ))$ and $\bOmega =\lim _{n\rightarrow \infty } n^{-1}\sum \mb {x}_ i\mb {x}_ i^{\prime }.$ See Koenker and Bassett (1982b).

This asymptotic distribution for the regression quantile ${\hat\bbeta }(\tau )$ can be used to construct confidence intervals. However, the reciprocal of the density function

\[  s(\tau ) = [f(F^{-1}(\tau ))]^{-1}  \]

which is called the sparsity function, must first be estimated.
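Given an estimate $\hat s(\tau )$ of the sparsity function, the asymptotic normality above yields a Wald-type interval whose variance for the jth coefficient is $\tau (1-\tau )\, \hat s^2(\tau )\, [(\bX ^{\prime }\bX )^{-1}]_{jj}$. A minimal sketch, assuming the sparsity estimate and the diagonal entry of $(\bX ^{\prime }\bX )^{-1}$ are precomputed (the function name is illustrative, not PROC QUANTREG's internals):

```python
from math import sqrt
from statistics import NormalDist

def sparsity_ci(beta_hat, tau, s_hat, xtx_inv_jj, alpha=0.05):
    """Wald-type confidence interval for one regression quantile coefficient.

    s_hat      -- an estimate of the sparsity s(tau), computed elsewhere
    xtx_inv_jj -- the j-th diagonal entry of (X'X)^{-1}
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # asymptotic variance: tau*(1-tau) * s(tau)^2 * [(X'X)^{-1}]_jj
    se = s_hat * sqrt(tau * (1 - tau) * xtx_inv_jj)
    return beta_hat - z * se, beta_hat + z * se
```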

Since

\[  s(t) = {\frac d{dt}} F^{-1}(t)  \]

$s(t)$ can be estimated by the difference quotient of the empirical quantile function—that is,

\[  {\hat s}_ n(t) = [{\hat F}_ n^{-1}(t+h_ n) - {\hat F}_ n^{-1}(t-h_ n)] \slash 2h_ n \]

where ${\hat F}_ n^{-1}$ is an estimate of $F^{-1}$ and $h_ n$ is a bandwidth that tends to zero as $n\rightarrow \infty $.
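The difference quotient is straightforward to compute once an empirical quantile function is chosen. The sketch below uses simple linear interpolation of the sorted residuals as a stand-in for ${\hat F}_ n^{-1}$ (the helper and its interpolation rule are illustrative choices, not the procedure's exact definition):

```python
def emp_quantile(resid, p):
    """Empirical quantile by linear interpolation (an illustrative stand-in)."""
    r = sorted(resid)
    k = p * (len(r) - 1)
    i = int(k)
    j = min(i + 1, len(r) - 1)
    return r[i] + (k - i) * (r[j] - r[i])

def sparsity_estimate(resid, t, h):
    """Difference quotient [F^-1(t+h) - F^-1(t-h)] / (2h).

    Arguments outside [0, 1] are clamped, which matches extending the
    empirical quantile function by a constant beyond its endpoints."""
    lo, hi = max(t - h, 0.0), min(t + h, 1.0)
    return (emp_quantile(resid, hi) - emp_quantile(resid, lo)) / (2 * h)
```

For residuals that are close to uniform on $[0,1]$, the estimate should be close to 1, because there $F^{-1}(t)=t$ and hence $s(t)=1$.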

The QUANTREG procedure provides two bandwidth methods. The Bofinger bandwidth

\[  h_ n = n^{-1\slash 5} ( {\frac{4.5s^2(t)}{(s^{(2)}(t))^2}} )^{1\slash 5}  \]

is an optimizer of mean squared error for standard density estimation, and the Hall-Sheather bandwidth

\[  h_ n = n^{-1\slash 3} z_\alpha ^{2\slash 3} ({\frac{1.5 s(t)}{s^{(2)}(t)}} )^{1\slash 3}  \]

is based on Edgeworth expansions for studentized quantiles, where $s^{(2)}(t)$ is the second derivative of $s(t)$ and $z_\alpha $ satisfies $\Phi (z_\alpha ) = 1- \alpha \slash 2$ for the construction of $1-\alpha $ confidence intervals. The quantity

\[  {\frac{s(t)}{s^{(2)}(t)}} = {\frac{f^2}{2(f^{(1)} \slash f)^2 + [(f^{(1)} \slash f)^2 - f^{(2)}\slash f ] }}  \]

is not sensitive to f and can be estimated by assuming f is Gaussian.
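Under the Gaussian assumption, with $x = \Phi ^{-1}(t)$, the ratio above reduces to $\phi ^2(x)\slash (2x^2+1)$, so both bandwidths can be evaluated in closed form. A sketch of that plug-in evaluation (function names are illustrative):

```python
from statistics import NormalDist

def bofinger(n, t):
    """Bofinger bandwidth with the Gaussian plug-in for s(t)/s''(t)."""
    x = NormalDist().inv_cdf(t)
    phi = NormalDist().pdf(x)
    # 4.5 * s^2 / (s'')^2 becomes 4.5 * phi(x)^4 / (2x^2 + 1)^2
    return n ** (-1 / 5) * (4.5 * phi ** 4 / (2 * x ** 2 + 1) ** 2) ** (1 / 5)

def hall_sheather(n, t, alpha=0.05):
    """Hall-Sheather bandwidth with the Gaussian plug-in for s(t)/s''(t)."""
    x = NormalDist().inv_cdf(t)
    phi = NormalDist().pdf(x)
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    # 1.5 * s / s'' becomes 1.5 * phi(x)^2 / (2x^2 + 1)
    return n ** (-1 / 3) * z_a ** (2 / 3) * (1.5 * phi ** 2 / (2 * x ** 2 + 1)) ** (1 / 3)
```

For moderate to large $n$, the Hall-Sheather bandwidth is smaller than the Bofinger bandwidth, reflecting its faster $n^{-1\slash 3}$ decay.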

${\hat F}^{-1}$ can be estimated by the empirical quantile function of the residuals from the quantile regression fit,

\[  {\hat F}^{-1}(t) = r_{(i)},\  {\mbox{ for }} t\in [(i-1)\slash n, i\slash n), \]

or the empirical quantile function of regression proposed by Bassett and Koenker (1982),

\[  {\hat F}^{-1}(t) = {\bar{\mb {x}}}^{\prime } {\hat\bbeta }(t) \]

The QUANTREG procedure interpolates the first empirical quantile function and gets the piecewise linear version

\[ {\hat F}^{-1}(t) = \left\{  \begin{array}{ll} r_{(1)} &  {\mbox{if }} t\in [0, 1\slash 2n) \\ \lambda r_{(i+1)} + (1-\lambda ) r_{(i)} &  {\mbox{if }} t\in [(2i-1)\slash 2n, (2i+1)\slash 2n) \\ r_{(n)} &  {\mbox{if }} t\in [(2n-1)\slash 2n, 1] \\ \end{array} \right.  \]

where $\lambda = tn - i + 1\slash 2$.

${\hat F}^{-1}$ is set to a constant if $t\pm h_ n$ falls outside $[0,1]$.
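The piecewise-linear interpolation, together with the constant extension at the endpoints, can be sketched directly (the function name is illustrative; the weight grows linearly across each cell $[(2i-1)\slash 2n, (2i+1)\slash 2n)$):

```python
def piecewise_quantile(resid, t):
    """Piecewise-linear empirical quantile function of the residuals.

    Constant at r_(1) below 1/(2n) and at r_(n) at or above (2n-1)/(2n)."""
    r = sorted(resid)          # r[i-1] is the order statistic r_(i)
    n = len(r)
    if t < 1 / (2 * n):
        return r[0]
    if t >= (2 * n - 1) / (2 * n):
        return r[-1]
    i = int(t * n + 0.5)       # cell index i with t in [(2i-1)/2n, (2i+1)/2n)
    lam = t * n - i + 0.5      # linear weight within the cell
    return lam * r[i] + (1 - lam) * r[i - 1]
```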

This estimator of the sparsity function is sensitive to the iid assumption. Alternatively, Koenker and Machado (1999) considered the non-iid case. By assuming local linearity of the conditional quantile function $Q(\tau |x)$ in x, they proposed a local estimator of the density function by using the difference quotient. A Huber sandwich estimate of the covariance and standard error is computed and used to construct the confidence intervals. One difficulty with this method is the selection of the bandwidth when using the difference quotient. With a small sample size, either the Bofinger or the Hall-Sheather bandwidth tends to be too large to ensure local linearity of the conditional quantile function. The QUANTREG procedure uses a heuristic bandwidth selection in these cases.

By default, the QUANTREG procedure computes non-iid confidence intervals. You can request iid confidence intervals with the IID option in the PROC statement.

Inversion of Rank Tests

The classical theory of rank tests can be extended to test the hypothesis $H_0$: $\beta _2 = \eta $ in the linear regression model $\mb {y} = \bX _1\beta _1 + \bX _2\beta _2 + \bepsilon $. Here $(\bX _1, \bX _2) = \bA ^{\prime }$. See Gutenbrunner and Jureckova (1992) for more details. By inverting this test, confidence intervals can be computed for the regression quantiles that correspond to $\beta _2$.

The rank score function ${\hat a}_ n(t) =({\hat a}_{n1}(t),\ldots ,{\hat a}_{nn}(t))$ can be obtained by solving the dual problem

\[  \max _ a \{ (\mb {y}-\bX _2\bm {\eta })^{\prime }a | \bX _1^{\prime }a = (1-t)\bX _1^{\prime }\mb {e}, \  a\in [0,1]^ n \}   \]

For a fixed quantile $\tau $, integrating ${\hat a}_{ni}(t)$ with respect to the $\tau $-quantile score function

\[  \varphi _\tau (t) = \tau - I(t<\tau )  \]

yields the $\tau $-quantile scores

\[  {\hat b}_{ni} = -\int _0^1 \varphi _\tau (t) d {\hat a}_{ni}(t) = {\hat a}_{ni}(\tau ) -(1-\tau )  \]

Under the null hypothesis $H_0$: $\beta _2 = \eta $

\[  S_ n(\eta ) = n^{-1\slash 2} \bX _2^{\prime }{\hat b}_ n(\eta ) \rightarrow N(0,\tau (1-\tau ) \bOmega _ n) \]

for large n, where $\bOmega _ n = n^{-1} \bX _2^{\prime }(\bI -\bX _1(\bX _1^{\prime }\bX _1)^{-1} \bX _1^{\prime }) \bX _2$.

Let

\[ T_ n(\eta ) = {\frac1{\sqrt {\tau (1-\tau )}}} S_ n(\eta ) \bOmega _ n^{-1\slash 2} \]

Then $T_ n({\hat\beta }_2(\tau )) = 0$ from the constraint $\bA {\hat a} = (1-\tau ) \bA \mb {e}$ in the full model. In order to obtain confidence intervals for $\beta _2$, a critical value can be specified for $T_ n$. The dual vector ${\hat a}_ n(\eta )$ is piecewise constant in $\eta $, and $\eta $ can be altered without compromising the optimality of ${\hat a}_ n(\eta )$ as long as the signs of the residuals in the primal quantile regression problem do not change. When $\eta $ reaches such a boundary, the solution does change, but it can be restored by taking one simplex pivot. The process can continue in this way until $T_ n(\eta )$ exceeds the specified critical value. Because $T_ n(\eta )$ is piecewise constant, interpolation can be used to obtain a confidence interval of the desired level; see Koenker and d’Orey (1994).
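To make $S_ n(\eta )$ and $T_ n(\eta )$ concrete, the sketch below evaluates them in the simplest setting, where $\bX _1$ is only an intercept column and $\beta _2$ is scalar, so that $\bOmega _ n$ reduces to the centered second moment of $x_2$. The dual vector ${\hat a}_ n(\eta )$ is assumed to have been obtained from the linear program above, which is not solved here:

```python
from math import sqrt

def rank_test_stat(x2, a_hat, tau):
    """T_n(eta) for scalar x2 with X1 an intercept column (a simplifying assumption).

    a_hat -- dual rank-score values a_hat_i(eta), from the dual LP (not solved here)
    """
    n = len(x2)
    b = [a - (1 - tau) for a in a_hat]                # tau-quantile scores b_hat_ni
    s_n = sum(x * bi for x, bi in zip(x2, b)) / sqrt(n)
    x_bar = sum(x2) / n
    # Omega_n = X2'(I - 11'/n)X2 / n, i.e., the centered second moment of x2
    omega_n = sum((x - x_bar) ** 2 for x in x2) / n
    return s_n / sqrt(tau * (1 - tau) * omega_n)
```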

Resampling

The bootstrap can be implemented to compute confidence intervals for regression quantile estimates. As in other regression applications, both the residual bootstrap and the xy-pair bootstrap can be used. The former assumes iid random errors and resamples from the residuals, whereas the latter resamples xy pairs and accommodates some forms of heteroscedasticity. Koenker (1994) considered a more interesting resampling mechanism, resampling directly from the full regression quantile process, which he called the Heqf bootstrap.

In contrast with these bootstrap methods, Parzen, Wei, and Ying (1994) observed that

\[  S(\bbeta ) = n^{-1\slash 2} \sum _{i=1}^ n \mb {x}_ i(\tau -I(y_ i\leq \mb {x}_ i^{\prime }\bbeta ))  \]

which is the estimating equation for the $\tau $th regression quantile, is a pivotal quantity for the $\tau $th quantile regression parameter $\bbeta _\tau $. In other words, the distribution of $S(\bbeta )$ can be generated exactly by a random vector $\bU $, which is a weighted sum of independent, re-centered Bernoulli variables. They further showed that, for large n, the distribution of ${\hat\bbeta }(\tau ) -\bbeta _\tau $ can be approximated by the conditional distribution of ${\hat\bbeta }_ U- {\hat\bbeta }_ n(\tau )$, where ${\hat\bbeta }_ U$ solves an augmented quantile regression problem that has $n+1$ observations, with $\mb {x}_{n+1}= -n^{-1\slash 2} \mb {u} \slash \tau $ and $y_{n+1}$ sufficiently large for a given realization of $\mb {u}$. By exploiting the asymptotically pivotal role of the quantile regression gradient condition, this approach also achieves some robustness to certain heteroscedasticity.
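Because $I(y_ i\leq \mb {x}_ i^{\prime }\bbeta )$ is Bernoulli($\tau $) at the true parameter, draws from the law of $S(\bbeta )$ can be generated without touching the responses. A sketch of that single resampling step follows (the function name is illustrative; solving the augmented quantile regression with the extra observation is not shown):

```python
import random
from math import sqrt

def draw_u(x_rows, tau, rng=random):
    """One draw of U = n^{-1/2} * sum_i x_i * (tau - B_i), B_i ~ Bernoulli(tau).

    This reproduces the law of S(beta) described above; the augmented
    regression solve that converts U into a bootstrap draw is omitted."""
    n, p = len(x_rows), len(x_rows[0])
    u = [0.0] * p
    for x_i in x_rows:
        b_i = 1 if rng.random() < tau else 0   # re-centered Bernoulli weight
        for j in range(p):
            u[j] += x_i[j] * (tau - b_i)
    return [v / sqrt(n) for v in u]
```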

Although the bootstrap method of Parzen, Wei, and Ying (1994) is much simpler, it is too time-consuming for relatively large data sets, especially for high-dimensional data sets. The QUANTREG procedure implements a new, general resampling method developed by He and Hu (2002), which is referred to as the Markov chain marginal bootstrap (MCMB). For quantile regression, the MCMB method has the advantage that it solves p one-dimensional equations instead of the p-dimensional equations that the previous bootstrap methods require. This greatly improves the feasibility of the resampling method in computing confidence intervals for regression quantiles.