The QLIM Procedure

Automated MCMC

The main purpose is to provide the user with the opportunity of obtaining a good approximation of the posterior distribution without initializing the MCMC algorithm: initial values, proposal distributions, burn-in and number of samples.

The automated algorithm is composed of two phases: tuning and sampling. In the tuning phase, there are two main concerns: the choice of a good proposal distribution and the search for the stationary region of the posterior distribution. In the sampling phase, the algorithm will decide how many samples are necessary to obtain good approximations of the posterior mean and some quantiles of interest.

Tuning Phase

During the tuning phase, the algorithm tries to search for a good proposal distribution and, at the same time, to reach the stationary region of the posterior. The choice of the proposal distribution is based on the analysis of the acceptance rates. This is similar to what is done in PROC MCMC: for more details see Tuning the Proposal Distribution: Tuning the Proposal Distribution in SAS/STAT 13.2 User's Guide. For the stationarity analysis, the main idea is to run two tests, Geweke (Ge) and Heidleberger-Welch (HW), on the posterior chains at the end of each attempt. For more details, see Geweke Diagnostics: Geweke Diagnostics in SAS/STAT 13.2 User's Guide, and Heidelberger and Welch Diagnostics: Heidelberger and Welch Diagnostics in SAS/STAT 13.2 User's Guide. If the stationarity hypothesis is rejected, then the tuning samples are increased and the tests repeated in the next attempt. After 10 attempts, the tuning phase will be ended regardless of the results. The tuning parameters for the first attempt are fixed:

\begin{equation*} \begin{tabular}{ll} 0   &  burn-in (nbi),  \\ 1000   & tuning samples (ntu),  \\ 10000   &  MCMC samples (nmc).   \end{tabular}\end{equation*}

For the remaining attempts, the tuning parameters will be adjusted dynamically. More specifcally, each parameter will be assigned an acceptance ratio (AR) of the stationarity hypothesis:

\begin{equation*} \begin{tabular}{lll} $AR_ i=0$  &  if  &  both tests reject the stationarity hypothesis,  \\ $ AR_ i=0.5$  &  if  &  one tests rejects and the other does not,   \\ $AR_ i=1$  &  if  &  both tests do not reject the stationarity hypothesis,   \end{tabular}\end{equation*}

for $i=1,\ldots ,k$. Then, an overall stationarity average ($SA$) for all parameters ratios is evaluated,

\begin{equation}  \label{eq:SA} SA=\sum \limits _{i=1}^ k\frac{AR_ i}{k}, \end{equation}

and the number of tuning samples is updated accordingly:

\begin{equation*} \begin{tabular}{llc} $\text {ntu}=\text {ntu}+2000$  & if  & \qquad \: \:  $SA<70\% $,  \\ $\text {ntu}=\text {ntu}+1000$  & if  & $70\% \leq SA<100\% $,  \\ $\text {ntu}=\text {ntu}$  & if  & \qquad \; \; \;  $SA=100\% $.   \end{tabular}\end{equation*}

The Heidelberger-Welch test also provides an indications of how much burn-in should be used. The algorithm requires this burn-in to be: $\text {nbi}(\text {HW})=0$. If that is not the case, the burn-in will updated accordingly, $\text {nbi}= \text {nbi}+\text {nbi}(\text {HW})$, and a new tuning attempt will be implemented. This choice is motivated by the fact that the burn-in must be discarded in order to reach the stationary region of the posterior distribution.

The number of samples predicted by the Raftery-Lewis nmc(RL) diagnostic tool will be also taken into consideration: $\text {nmc}= \text {nmc}+ \text {nmc}(\text {RL})$. For more details see Raftery and Lewis Diagnostics: Raftery and Lewis Diagnostics in SAS/STAT 13.2 User's Guide. However, in order to exit the tuning phase, it will not be required $\text {nmc}(\text {RL})=0$.

Sampling Phase

The main idea of the sampling phase is to make sure that the mean and a quantile of interest are evaluated accurately. This can be tested by implementing the half-width test by Heidelberger-Welch and by analyzing the Raftery-Lewis diagnostic tool. In addition, the requirements defined in the tuning phase will also be checked: the Geweke and the Heidelberger-Welch tests must not reject the stationary hypothesis and the burn-in predicted by the Heidelberger-Welch test must be zero.

The sampling phase is characterized by a maximum of 10 attempts. If the algorithm exceeds this limit, the sampling phase will end and indications on how to improve sampling will be given. The sampling phase will first update the burn-in with the information provided by the HW test: $\text {nbi}= \text {nbi}+\text {nbi}(\text {HW})$. Then, it determines the difference between the actual number of samples and the number of samples predicted by the RL test: $\Delta [\text {nmc}(\text {RL})]=\text {nmc}(\text {RL})-\text {nmc}$. The new number of samples will be updated accordingly:

\begin{equation*} \begin{tabular}{lll} $\text {nmc}=\text {nmc}+1000$  & if  & \qquad \, \, \, $0<\Delta [\text {nmc}(\text {RL})]\leq 10000,$  \\ $\text {nmc}=\text {nmc}+\Delta [\text {nmc}(\text {RL})]$  & if  & \, \, \, $10000<\Delta [\text {nmc}(\text {RL})]\leq 300000,$  \\ $\text {nmc}=\text {nmc}+300000$  & if  & $300000<\Delta [\text {nmc}(\text {RL})].$   \end{tabular}\end{equation*}

Finally, the HW half-width test checks whether the sample mean is accurate. If the mean of any parameters is not considered to be accurate, the number of sampling is increased:

\begin{equation*} \begin{tabular}{lll} $\text {nmc}=\text {nmc}+\{ 10000-\Delta [\text {nmc}(\text {RL})]\} $  & if  & $10000-\Delta [\text {nmc}(\text {RL})]\geq 0$,  \\ $\text {nmc}=\text {nmc}$  & if  & $10000-\Delta [\text {nmc}(\text {RL})]<0$.   \end{tabular}\end{equation*}

Standard Distributions

Table 22.3 through Table 22.8 show all the distribution density functions that PROC QLIM recognizes. You specify these distribution densities in the PRIOR statement.

Table 22.3: Beta Distribution

PRIOR statement

BETA(SHAPE1=a, SHAPE2=b, MIN=m, MAX=$M$)

 

Note: Commonly $m=0$ and $M=1$.

Density

$\frac{(\theta -m)^{a-1} (M-\theta )^{b-1}}{B(a,b)(M-m)^{a+b-1}}$

Parameter restriction

$a>0$, $b>0$, $-\infty <m<M<\infty $

Range

$ \left\{  \begin{array}{ll} \left[ m, M \right] &  \mbox{when } a = 1, b = 1 \\ \left[ m, M \right) &  \mbox{when } a = 1, b \neq 1 \\ \left( m, M \right] &  \mbox{when } a \neq 1, b = 1 \\ \left( m, M \right) &  \mbox{otherwise} \end{array} \right. $

Mean

$ \frac{a}{a+b}\times (M-m)+m$

Variance

$ \frac{ab}{(a+b)^2(a+b+1)}\times (M-m)^2$

Mode

$ \left\{  \begin{array}{ll} \frac{a-1}{a+b-2}\times M+\frac{b-1}{a+b-2}\times m &  a > 1, b > 1 \\ m \mbox{ and } M &  a < 1, b < 1 \\ m &  \left\{  \begin{array}{l} a < 1, b \geq 1 \\ a = 1, b > 1 \\ \end{array} \right. \\ M &  \left\{  \begin{array}{l} a \geq 1, b < 1 \\ a > 1, b = 1 \\ \end{array} \right. \\ \mbox{not unique} &  a = b = 1 \end{array} \right. $

Defaults

SHAPE1=SHAPE2=1, $\Variable{MIN}\rightarrow -\infty $, $\Variable{MAX}\rightarrow \infty $


Table 22.4: Gamma Distribution

PRIOR statement

GAMMA(SHAPE=a, SCALE=b )

Density

$\frac{1}{b^ a\Gamma (a)} \theta ^{a-1} e^{-\theta /b} $

Parameter restriction

$ a > 0, b > 0 $

Range

$[0,\infty )$

Mean

$ab$

Variance

$ab^2$

Mode

$(a-1)b$

Defaults

SHAPE=SCALE=1


Table 22.5: Inverse-Gamma Distribution

PRIOR statement

IGAMMA(SHAPE=a, SCALE=b)

Density

$ \frac{b^ a}{\Gamma (a)} \theta ^{-(a+1)}e^{-b/\theta } $

Parameter restriction

$ a > 0, b > 0$

Range

$ 0<\theta <\infty $

Mean

$\frac{b}{a-1},\qquad a > 1$

Variance

$\frac{b^2}{(a-1)^2(a-2)},\qquad a>2$

Mode

$ \frac{b}{a+1}$

Defaults

SHAPE=2.000001, SCALE=1


Table 22.6: Normal Distribution

PRIOR statement

NORMAL(MEAN=$\mu $, VAR=$\sigma ^2$)

Density

$ \frac{1}{\sigma \sqrt {2\pi }} \exp \left( - \frac{(\theta - \mu )^2}{2\sigma ^2}\right) $

Parameter restriction

$ \sigma ^2 > 0 $

Range

$ -\infty <\theta <\infty $

Mean

$\mu $

Variance

$\sigma ^2$

Mode

$\mu $

Defaults

MEAN=0, VAR=1000000


Table 22.7: t Distribution

PRIOR statement

T(LOCATION=$\mu $, DF=$\nu $)

Density

$\frac{\Gamma \left(\frac{\nu +1}{2}\right)}{\Gamma \left(\frac{\nu }{2}\right)\sqrt {\pi \nu }}\left[1+\frac{(\theta -\mu )^2}{\nu }\right]^{-\frac{\nu +1}{2}} $

Parameter restriction

$ \nu > 0 $

Range

$ -\infty <\theta <\infty $

Mean

$\mu , \text { for }\nu >1$

Variance

$\frac{\nu }{\nu -2}, \text { for }\nu >2$

Mode

$\mu $

Defaults

LOCATION=0, DF=3


Table 22.8: Uniform Distribution

PRIOR statement

UNIFORM(MIN=m, MAX=$M$)

Density

$ \frac{1}{M-m}$

Parameter restriction

$-\infty <m<M<\infty $

Range

$ \theta \in [m, M]$

Mean

$ \frac{m+M}{2} $

Variance

$\frac{(M-m)^2}{12}$

Mode

Not unique

Defaults

MIN$\rightarrow -\infty $, MAX$\rightarrow \infty $