The QUANTSELECT Procedure

Observation Quantile Level

The observation quantile level of a valid observation, $(y,\mb{x})$, is defined as $\tau _{(y,\mb{x})}=F_{Y|\mb{x}}(y)$, where $F_{Y|\mb{x}}(\cdot )$ denotes the cumulative distribution function (CDF) of the underlying distribution of y conditional on $\mb{x}$. For a CDF that is continuous at y, the equation $y=Q_{Y|\mb{x}}\left(\tau _{(y,\mb{x})}\right)$ holds because the conditional quantile function $Q_{Y|\mb{x}}(\cdot )$ is the inverse of the CDF. Ideally, if $y=\mb{x}\hat{\bbeta }(\tau ^*)$ for a unique $\tau ^*\in [0,1]$ and some quantile-regression optimal solution $\hat{\bbeta }(\tau ^*)$, then $\tau ^*$ is a reasonable estimate of $\tau _{(y,\mb{x})}$, written as $\hat{\tau }_{(y,\mb{x})}=\tau ^*$. In practice, however, such a $\tau ^*$ might not exist or might not be unique. The following steps show how the QUANTSELECT procedure estimates the observation quantile level $\tau _{(y,\mb{x})}$ by using quantile process regression:

  1. Fit the quantile process regression model and label its quantile-level grid as follows:

    \[ \left\{ 0=\tau _{(0)}\le \tau _{(1)}\le \cdots \le \tau _{(s)}\le \tau _{(s+1)}=1\right\} \]

  2. Compute the quantile predictions conditional on $\mb{x}$ at each quantile level in the grid: $\left\{ q_ i=\mb{x}\hat{\bbeta }_ i: i=0,\ldots ,s+1\right\} $, where $\hat{\bbeta }_ i$ denotes the fitted coefficients at $\tau _{(i)}$.

  3. Sort the $q_ i$ values to avoid quantile crossing, so that $q_{(0)}\le q_{(1)}\le \cdots \le q_{(s+1)}$.

  4. Set $\hat{\tau }_{(y,\mb{x})}=0$ if $y<q_{(0)}$, or set $\hat{\tau }_{(y,\mb{x})}=1$ if $y> q_{(s+1)}$.

  5. Otherwise, search for an index j such that $q_{(j)}<y<q_{(j+1)}$. If such a j exists, linearly interpolate between the neighboring quantile levels:

    \[ \hat{\tau }_{(y,\mb{x})} = \left({y-q_{(j)} \over q_{(j+1)}-q_{(j)}}\right)\tau _{(j+1)} +\left({q_{(j+1)}-y \over q_{(j+1)}-q_{(j)}}\right)\tau _{(j)} \]

  6. Otherwise, search for j and k such that $q_{(j-1)}<y=q_{(j)}=\cdots =q_{(j+k)}<q_{(j+k+1)}$, and set $\displaystyle \hat{\tau }_{(y,\mb{x})} = {\tau _{(j)}+\tau _{(j+k)}\over 2}$. Here, define $q_{(-1)}=-\infty $ and $q_{(s+2)}=\infty $.
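
Steps 2 through 6 can be illustrated with a short Python sketch. This is not the code that the QUANTSELECT procedure runs; it is a minimal illustration that assumes step 1 is already done, so that the grid of quantile levels and the corresponding predictions $q_ i$ for one observation are supplied as plain lists. The function name `observation_quantile_level` and its argument names are hypothetical, and ties are detected by exact floating-point equality, which a production implementation might replace with a tolerance.

```python
from bisect import bisect_left, bisect_right


def observation_quantile_level(y, taus, preds):
    """Estimate the observation quantile level of y (illustrative sketch).

    taus  : grid levels with taus[0] == 0 and taus[-1] == 1, nondecreasing
    preds : predictions q_i = x * beta_hat_i, one per grid level (step 2)
    """
    if len(taus) != len(preds):
        raise ValueError("taus and preds must have the same length")

    # Step 3: sort the predictions so that they no longer cross.
    q = sorted(preds)

    # Step 4: y lies outside the range of the sorted predictions.
    if y < q[0]:
        return 0.0
    if y > q[-1]:
        return 1.0

    # q[lo:hi] is the (possibly empty) run of predictions equal to y.
    lo = bisect_left(q, y)
    hi = bisect_right(q, y)

    # Step 6: y ties with q_(j), ..., q_(j+k); average the two end levels.
    if hi > lo:
        return 0.5 * (taus[lo] + taus[hi - 1])

    # Step 5: q_(j) < y < q_(j+1); interpolate linearly between the levels.
    j = lo - 1
    weight = (y - q[j]) / (q[j + 1] - q[j])
    return weight * taus[j + 1] + (1.0 - weight) * taus[j]


# Hypothetical grid and predictions; the second and third predictions cross.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
print(observation_quantile_level(2.5, grid, [1.0, 3.0, 2.0, 4.0, 5.0]))  # 0.375
print(observation_quantile_level(2.0, grid, [1.0, 2.0, 2.0, 2.0, 5.0]))  # 0.5
```

In the first call, the sorted predictions are 1, 2, 3, 4, 5, so y = 2.5 falls halfway between $q_{(1)}=2$ and $q_{(2)}=3$ and the estimate is halfway between 0.25 and 0.5. In the second call, y ties with three predictions, so the estimate is the average of the first and last tied levels, 0.25 and 0.75.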