Quasi-Likelihood Information Criteria

Given quantile level $\tau $, assume that the distribution of $Y_ i$ conditional on $\mb{x}_ i$ follows the linear model

\[ Y_ i = \mb{x}_ i^{\prime }\bbeta + \epsilon _ i \]

where the errors $\epsilon _ i$, $i=1,\ldots ,n$, are independent and identically distributed (iid) with distribution $F$. Further assume that $F$ is an asymmetric Laplace distribution whose density function is

\[ f_\tau (r)={\tau (1-\tau )\over \sigma }\exp \left(-{\rho _\tau (r)\over \sigma }\right) \]

where $\sigma $ is the scale parameter and $\rho _\tau (r)=r\left(\tau -I(r<0)\right)$ is the check loss function. Then the associated negative log-likelihood function is

\[ l_\tau (\bbeta ,\sigma )=n\log (\sigma )+ \sigma ^{-1}\sum _{i=1}^ n\rho _\tau \left(y_ i-\mb{x}_ i^{\prime }\bbeta \right)-n\log (\tau (1-\tau )) \]

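Under these settings, the maximum likelihood estimate (MLE) of $\bbeta $ coincides with the level-$\tau $ quantile regression solution, because for any fixed $\sigma >0$, minimizing $l_\tau $ over $\bbeta $ is equivalent to minimizing the total check loss: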

\[ \hat{\bbeta }(\tau )=\underset {\bbeta }{\mr{arg min}}\sum _{i=1}^ n \rho _\tau \left(y_ i-\mb{x}_ i^{\prime }\bbeta \right) \]

The MLE for $\sigma $ is

\[ \hat{\sigma }(\tau )=n^{-1}\sum _{i=1}^ n \rho _\tau \left(y_ i-\mb{x}_ i^{\prime }\hat{\bbeta }(\tau )\right) \]

where $\hat{\sigma }(\tau )$ equals the level $\tau $ average check loss, $\mbox{ACL}(\tau )$, for the quantile regression solution.
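The relationship between the check loss, the quantile regression solution, and $\hat{\sigma }(\tau )$ can be illustrated with a minimal Python sketch. It assumes an intercept-only model so that no solver is needed; the function names `check_loss`, `average_check_loss`, and `intercept_only_fit` and the toy data are hypothetical and are not part of any procedure described here.

```python
def check_loss(r, tau):
    """Check loss rho_tau(r) = r * (tau - 1{r < 0})."""
    return r * (tau - (1.0 if r < 0 else 0.0))


def average_check_loss(y, fitted, tau):
    """ACL(tau): mean check loss of the residuals y_i - fitted_i."""
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return sum(check_loss(r, tau) for r in residuals) / len(residuals)


def intercept_only_fit(y, tau):
    """Hypothetical helper for an intercept-only model.

    The total check loss is piecewise linear and convex in the intercept,
    with kinks at the observed values, so a minimizer can be found by
    searching over the observations themselves.
    """
    return min(y, key=lambda b: sum(check_loss(yi - b, tau) for yi in y))


# Toy data (made up for illustration only).
y = [1.0, 2.0, 3.0, 4.0, 10.0]
tau = 0.5

beta_hat = intercept_only_fit(y, tau)  # for tau = 0.5 this is the sample median
sigma_hat = average_check_loss(y, [beta_hat] * len(y), tau)  # MLE of sigma = ACL(tau)
print(beta_hat, sigma_hat)
```

For a model with covariates, the brute-force search would be replaced by a linear programming or similar quantile regression solver; the computation of $\hat{\sigma }(\tau )$ from the residuals is unchanged.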

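According to the general form of Akaike’s information criterion, $\mbox{AIC}=(-2l+2p)$, where $l$ denotes the maximized log likelihood (here, the negative of $l_\tau $ evaluated at the MLEs), the quasi-likelihood AIC for quantile regression is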

\[ \mbox{AIC}(\tau )=2n\ln \left( \mbox{ACL}(\tau ) \right) + 2p \]

where $p$ is the degrees of freedom for the fitted model.
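The step from the likelihood to this formula can be sketched by substituting the MLEs into the negative log-likelihood function $l_\tau $. Because $\sum _{i=1}^ n \rho _\tau \left(y_ i-\mb{x}_ i^{\prime }\hat{\bbeta }(\tau )\right)=n\hat{\sigma }(\tau )$ and $\hat{\sigma }(\tau )=\mbox{ACL}(\tau )$, it follows that

\[ l_\tau \left(\hat{\bbeta }(\tau ),\hat{\sigma }(\tau )\right)=n\log \left(\mbox{ACL}(\tau )\right)+n-n\log (\tau (1-\tau )) \]

Twice this quantity plus $2p$ equals $2n\ln \left(\mbox{ACL}(\tau )\right)+2p$ plus the term $2n-2n\log (\tau (1-\tau ))$. That term depends only on $n$ and $\tau $, so it is identical for all candidate models at a given quantile level and, on this reading, can be dropped without changing which model the criterion selects.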

Similarly, the quasi-likelihood corrected AIC and Schwarz Bayesian information criterion can be formulated respectively as follows:

\[ \mbox{AICC}(\tau )=2n\ln \left(\mbox{ACL}(\tau )\right)+{2pn\over n-p-1} \]
\[ \mbox{SBC}(\tau )=2n\ln \left(\mbox{ACL}(\tau )\right)+p\ln (n) \]
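The three criteria differ only in the penalty term, which a short Python sketch makes concrete. The function name `quasi_likelihood_criteria`, the candidate labels, and the numeric inputs are hypothetical and chosen only for illustration; they are not results from any data set.

```python
import math


def quasi_likelihood_criteria(acl, n, p):
    """Return quasi-likelihood AIC, AICC, and SBC for one fitted model.

    acl : average check loss ACL(tau) of the fitted model (must be > 0)
    n   : number of observations
    p   : degrees of freedom of the fitted model
    """
    if acl <= 0:
        raise ValueError("ACL must be positive to take its logarithm")
    if n - p - 1 <= 0:
        raise ValueError("AICC requires n > p + 1")
    fit_term = 2.0 * n * math.log(acl)  # common goodness-of-fit term
    return {
        "AIC": fit_term + 2.0 * p,
        "AICC": fit_term + 2.0 * p * n / (n - p - 1),
        "SBC": fit_term + p * math.log(n),
    }


# Made-up inputs: two candidate models fitted to the same n = 100 observations.
n = 100
candidates = {"A": (0.42, 3), "B": (0.41, 5)}  # label -> (ACL, p)
scores = {name: quasi_likelihood_criteria(acl, n, p)
          for name, (acl, p) in candidates.items()}

# Smaller values are preferred under each criterion.
best_by_aic = min(scores, key=lambda name: scores[name]["AIC"])
best_by_sbc = min(scores, key=lambda name: scores[name]["SBC"])
print(best_by_aic, best_by_sbc)
```

With these made-up inputs, AIC prefers the larger model and SBC prefers the smaller one, because the SBC penalty of $\ln (n)$ per degree of freedom exceeds the AIC penalty of 2 once $n$ is larger than about 7.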

In practice, the quasi-likelihood AIC, AICC, and SBC are fairly robust: they can be used to select effects even when the errors are not iid from an asymmetric Laplace distribution. See Simulation Study for an example that applies SBC for effect selection on a data set that is generated from a naive instrumental model (Chernozhukov and Hansen 2008).