The QLIM Procedure

Automated MCMC

Subsections:

Stationarity Phase
Accuracy Phase

The main purpose is to provide the user with the opportunity of obtaining a good approximation of the posterior distribution without initializing the MCMC algorithm: initial values, proposal distributions, burn-in and number of samples.

The automated algorithm is composed of two phases: tuning and sampling. In the tuning phase, there are two main concerns: the choice of a good proposal distribution and the search for the stationary region of the posterior distribution. In the sampling phase, the algorithm will decide how many samples are necessary to obtain good approximations of the posterior mean and some quantiles of interest.

Stationarity Phase

During the stationarity phase, the algorithm tries to search for a good proposal distribution and, at the same time, to reach the stationary region of the posterior. The choice of the proposal distribution is based on the analysis of the acceptance rates. This is similar to what is done in PROC MCMC: for more details see Tuning the Proposal Distribution: Tuning the Proposal Distribution in SAS/STAT 14.1 User's Guide. For the stationarity analysis, the main idea is to run two tests, Geweke (Ge) and Heidleberger-Welch (HW), on the posterior chains at the end of each attempt. For more details, see Geweke Diagnostics: Geweke Diagnostics in SAS/STAT 14.1 User's Guide, and Heidelberger and Welch Diagnostics: Heidelberger and Welch Diagnostics in SAS/STAT 14.1 User's Guide. If the stationarity hypothesis is rejected, then the tuning samples are increased and the tests repeated in the next attempt. After 10 attempts, the stationarity phase will be ended regardless of the results. The tuning parameters for the first attempt are fixed:

$\begin{equation*} \begin{tabular}{ll} 1000 & burn-in (nbi), \\ 5000 & tuning samples (ntu), \\ 1000 & MCMC samples (nmc). \end{tabular}\end{equation*}$

For the remaining attempts, the tuning parameters will be adjusted dynamically. More specifically, each parameter will be assigned an acceptance ratio (AR) of the stationarity hypothesis:

$\begin{equation*} \begin{tabular}{lll} $AR_ i=0$ & if & both tests reject the stationarity hypothesis, \\ $ AR_ i=0.5$ & if & one tests rejects and the other does not, \\ $AR_ i=1$ & if & both tests do not reject the stationarity hypothesis, \end{tabular}\end{equation*}$

for $i=1,\ldots ,k$ . For the Geweke test, the implemented significance level is 0.05. Then, an overall stationarity average ( $SA$ ) for all parameters ratios is evaluated,

$\begin{equation} \label{eq:SA} SA=\sum \limits _{i=1}^ k\frac{AR_ i}{k}, \end{equation}$

and the number of tuning samples is updated accordingly:

$\begin{equation*} \begin{tabular}{llc} $\text {ntu}=\text {ntu}+2000$ & if & \qquad \: \: $SA<70\% $, \\ $\text {ntu}=\text {ntu}+1000$ & if & $70\% \leq SA<100\% $, \\ $\text {ntu}=\text {ntu}$ & if & \qquad \; \; \; $SA=100\% $. \end{tabular}\end{equation*}$

The Heidelberger-Welch test also provides an indications of how much burn-in should be used. The algorithm requires this burn-in to be: $\text {nbi}(\text {HW})=0$ . If that is not the case, the burn-in will updated accordingly, $\text {nbi}= \text {nbi}+\text {nbi}(\text {HW})$ , and a new attempt searching for stationarity will be implemented. This choice is motivated by the fact that the burn-in must be discarded in order to reach the stationary region of the posterior distribution.

The number of samples is updated at each attempt. However, in order to exit the stationarity phase, it will not be required $\text {nmc}(\text {RL})=0$ . The default update is $\text {nmc}= \text {nmc}+1000$ . Depending on the outcome of the Raftery-Lewis diagnostics, if $\text {nmc}<\min \left\{ \text {LB}\left[\text {nmc}(\text {RL})\right],\text {nmc}(\text {RL})\right\}$ , the number of sampling is further updated to $\text {nmc}= \text {LB}\left[\text {nmc}(\text {RL})\right]$ . By default, $\text {LB}\left[\text {nmc}(\text {RL})\right]=10000$ . Finally, if the number of projected samples is not sufficient to perform a stable evaluation of the Raftery-Lewis test, the number of samples is updated to $\text {nmc}= \min \left[\text {nmc}(\text {RL})\right]$ . For more details see AUTOMCMC <phrase remap="Optional">=(<phrase remap="Argument">automcmc-options</phrase>)</phrase> and Raftery and Lewis Diagnostics: Raftery and Lewis Diagnostics in SAS/STAT 14.1 User's Guide.

Accuracy Phase

The main idea of the accuracy phase is to make sure that the mean and a quantile of interest are evaluated accurately. This can be tested by implementing the half-width test by Heidelberger-Welch and by analyzing the Raftery-Lewis diagnostic tool. In addition, the requirements defined in the stationarity phase will also be checked: the Geweke and the Heidelberger-Welch tests must not reject the stationary hypothesis and the burn-in predicted by the Heidelberger-Welch test must be zero.

The accuracy phase is characterized by a maximum of 10 attempts. If the algorithm exceeds this limit, the accuracy phase will end and indications on how to improve sampling will be given. The search of accuracy can be performed using two different method. The first method (the default) is triggered by the option TARGETSTATS and it is based on the accuracy analysis of the mean and a percentile of interest. The secong method is triggered by the option TARGETESS and it targets a minimum number of effective samples. The accuracy phase will first update the burn-in with the information provided by the HW test: $\text {nbi}= \text {nbi}+\text {nbi}(\text {HW})$ . Then, it determines the difference between the actual number of samples and the number of samples predicted by either the RL test or the ESS: $\Delta [\text {nmc}]=\text {nmc}(\text {RL})-\text {nmc},$ or $\Delta [\text {nmc}]=\text {nmc}(\text {ESS})-\text {nmc}.$ The new number of samples will be updated accordingly:

$\begin{equation*} \begin{tabular}{lll} $\text {nmc}=\text {nmc}+\text {LB}\left[\text {nmc}(\text {RL})\right]$ & if & \qquad \qquad \quad \, \, $0<\Delta [\text {nmc}]\leq \text {LB}\left[\text {nmc}(\text {RL})\right],$ \\ $\text {nmc}=\text {nmc}+\Delta [\text {nmc}]$ & if & $\text {LB}\left[\text {nmc}(\text {RL})\right]<\Delta [\text {nmc}]\leq \text {UB}\left[\text {nmc}(\text {RL})\right],$ \\ $\text {nmc}=\text {nmc}+\text {UB}\left[\text {nmc}(\text {RL})\right]$ & if & $\text {UB}\left[\text {nmc}(\text {RL})\right]<\Delta [\text {nmc}].$ \end{tabular}\end{equation*}$

By default, $\text {LB}\left[\text {nmc}(\text {RL})\right]=10000$ and $\text {UB}\left[\text {nmc}(\text {RL})\right]=300000$ .

In addition, the accuracy search triggered by the option TARGETSTATS also implements the HW half-width test to checks whether the sample mean is accurate. If the mean of any parameters is not considered to be accurate and the number of samples has not been updated based on $\Delta [\text {nmc}]$ , then the number of samples is increased:

$\begin{equation*} \begin{tabular}{lll} $\text {nmc}=\text {nmc}+5000$ & if & $\Delta [\text {nmc}]\leq 0$, \\ \end{tabular}\end{equation*}$