The COUNTREG Procedure

Automated MCMC Algorithm

The main purpose of the automated MCMC algorithm is to enable the user to obtain a good approximation of the posterior distribution without having to initialize the MCMC algorithm by choosing initial values, proposal distributions, the burn-in, and the number of samples.

The automated MCMC algorithm has two phases: tuning and sampling. The tuning phase has two main concerns: the choice of a good proposal distribution and the search for the stationary region of the posterior distribution. In the sampling phase, the algorithm decides how many samples are necessary to obtain good approximations of the posterior mean and some quantiles of interest.

Tuning Phase

During the tuning phase, the algorithm searches for a good proposal distribution and, at the same time, tries to reach the stationary region of the posterior. The choice of the proposal distribution is based on the analysis of the acceptance rates; this is similar to what is done in PROC MCMC. For more information, see Chapter 61: The MCMC Procedure in SAS/STAT 13.2 User's Guide. For the stationarity analysis, the main idea is to run two tests, Geweke (Ge) and Heidelberger-Welch (HW), on the posterior chains at the end of each attempt. For more information, see Chapter 7: Introduction to Bayesian Analysis Procedures in SAS/STAT 13.2 User's Guide. If the stationarity hypothesis is rejected, then the number of tuning samples is increased and the tests are repeated in the next attempt. After 10 attempts, the tuning phase ends regardless of the results. The tuning parameters for the first attempt are fixed:

\begin{equation*} \begin{tabular}{ll} 0   &  burn-in (nbi)  \\ 1000   & tuning samples (ntu)  \\ 10000   &  MCMC samples (nmc)   \end{tabular}\end{equation*}

For the remaining attempts, the tuning parameters are adjusted dynamically. More specifically, each parameter is assigned an acceptance ratio (AR) that summarizes the outcomes of the two stationarity tests,

\begin{equation*} \begin{tabular}{ll} AR$_ i=0$  &  if both tests reject the stationarity hypothesis  \\ AR$_ i=0.5$  &  if only one test rejects the stationarity hypothesis  \\ AR$_ i=1$  &  if neither test rejects the stationarity hypothesis   \end{tabular}\end{equation*}

for $i=1,\ldots ,k$. Then, an overall stationarity average (SA) of the ratios for all parameters is computed,

\begin{equation*}  \textnormal{SA}=\sum \limits _{i=1}^ k\frac{\textnormal{AR}_ i}{k} \end{equation*}

and the number of tuning samples is updated accordingly:

\begin{equation*} \begin{tabular}{llc} $\text {ntu}=\text {ntu}+2000$  & if  & \qquad \: \:  $\textnormal{SA}<70\% $  \\ $\text {ntu}=\text {ntu}+1000$  & if  & $70\% \leq \textnormal{SA}<100\% $  \\ $\text {ntu}=\text {ntu}$  & if  & \qquad \; \; \;  $\textnormal{SA}=100\% $   \end{tabular}\end{equation*}
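The update rule above can be sketched in a few lines. This is a minimal illustration, not PROC COUNTREG's actual implementation; `geweke_ok` and `hw_ok` are hypothetical per-parameter test outcomes (True means the test does not reject stationarity).

```python
def acceptance_ratio(geweke_ok, hw_ok):
    """AR_i: 0 if both tests reject, 0.5 if exactly one rejects, 1 if neither."""
    return (geweke_ok + hw_ok) / 2.0

def update_ntu(ntu, geweke_ok, hw_ok):
    """Update the number of tuning samples from the stationarity average SA."""
    k = len(geweke_ok)
    sa = sum(acceptance_ratio(g, h) for g, h in zip(geweke_ok, hw_ok)) / k
    if sa < 0.70:
        return ntu + 2000
    elif sa < 1.0:
        return ntu + 1000
    return ntu

# Three parameters; one parameter fails one test, so SA = (1 + 0.5 + 1)/3 ~ 0.83
print(update_ntu(1000, [True, True, True], [True, False, True]))  # -> 2000
```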

The Heidelberger-Welch test also provides an indication of how much burn-in should be used. The algorithm requires this burn-in to be $\text {nbi}(\text {HW})=0$. If that is not the case, the burn-in is updated accordingly, $\text {nbi}= \text {nbi}+\text {nbi}(\text {HW})$, and a new tuning attempt is made. This choice is motivated by the fact that the burn-in must be discarded in order to reach the stationary region of the posterior distribution.

The number of samples that the Raftery-Lewis diagnostic tool predicts, $\text {nmc}(\text {RL})$, is also taken into account: $\text {nmc}= \text {nmc}+ \text {nmc}(\text {RL})$. For more information, see Chapter 7: Introduction to Bayesian Analysis Procedures in SAS/STAT 13.2 User's Guide. However, $\text {nmc}(\text {RL})=0$ is not required in order to exit the tuning phase.
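The per-attempt burn-in and sample-count updates just described can be summarized as follows. This is a hedged sketch; `nbi_hw` and `nmc_rl` stand for the burn-in suggested by the Heidelberger-Welch test and the sample size predicted by the Raftery-Lewis diagnostic, supplied here as hypothetical inputs.

```python
def update_tuning(nbi, nmc, nbi_hw, nmc_rl):
    """One tuning-attempt update of the burn-in and sample count."""
    nbi += nbi_hw               # discard the burn-in the HW test suggests
    nmc += nmc_rl               # add the Raftery-Lewis prediction
    burnin_ok = (nbi_hw == 0)   # exiting requires nbi(HW)=0; nmc(RL)=0 is NOT required
    return nbi, nmc, burnin_ok

print(update_tuning(0, 10000, 500, 2000))  # -> (500, 12000, False)
```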

Sampling Phase

The main purpose of the sampling phase is to make sure that the mean and a quantile of interest are evaluated accurately. This can be tested by implementing the Heidelberger-Welch halfwidth test and by using the Raftery-Lewis diagnostic tool. In addition, the requirements that are defined in the tuning phase are also checked: the Geweke and Heidelberger-Welch tests must not reject the stationarity hypothesis, and the burn-in that the Heidelberger-Welch test predicts must be zero.

The sampling phase has a maximum of 10 attempts. If the algorithm exceeds this limit, the sampling phase ends and indications of how to improve sampling are given. The sampling phase first updates the burn-in with the information provided by the Heidelberger-Welch test: $\text {nbi}= \text {nbi}+\text {nbi}(\text {HW})$. Then, it determines the difference between the actual number of samples and the number of samples that are predicted by the Raftery-Lewis test: $\Delta [\text {nmc}(\text {RL})]=\text {nmc}(\text {RL})-\text {nmc}$. The new number of samples is updated accordingly:

\begin{equation*} \begin{tabular}{lll} $\text {nmc}=\text {nmc}+1000$  & if  & \qquad \, \, \, $0<\Delta [\text {nmc}(\text {RL})]\leq 10000$  \\ $\text {nmc}=\text {nmc}+\Delta [\text {nmc}(\text {RL})]$  & if  & \, \, \, $10000<\Delta [\text {nmc}(\text {RL})]\leq 300000$  \\ $\text {nmc}=\text {nmc}+300000$  & if  & $300000<\Delta [\text {nmc}(\text {RL})]$   \end{tabular}\end{equation*}
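The sample-size rule above is a piecewise update in the shortfall $\Delta [\text {nmc}(\text {RL})]=\text {nmc}(\text {RL})-\text {nmc}$. A minimal sketch (again, an illustration rather than the procedure's internal code):

```python
def update_nmc(nmc, nmc_rl):
    """Piecewise update of nmc from the Raftery-Lewis shortfall delta."""
    delta = nmc_rl - nmc
    if 0 < delta <= 10000:
        return nmc + 1000
    elif 10000 < delta <= 300000:
        return nmc + delta
    elif delta > 300000:
        return nmc + 300000
    return nmc                  # no shortfall: leave nmc unchanged

print(update_nmc(10000, 15000))   # delta = 5000  -> 11000
print(update_nmc(10000, 40000))   # delta = 30000 -> 40000
```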

Finally, the Heidelberger-Welch halfwidth test checks whether the sample mean is accurate. If the mean of any parameter is not considered accurate, the number of samples is increased:

\begin{equation*} \begin{tabular}{lll} $\text {nmc}=\text {nmc}+\{ 10000-\Delta [\text {nmc}(\text {RL})]\} $  & if  & $10000-\Delta [\text {nmc}(\text {RL})]\geq 0$  \\ $\text {nmc}=\text {nmc}$  & if  & $10000-\Delta [\text {nmc}(\text {RL})]<0$   \end{tabular}\end{equation*}
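The halfwidth-driven increase can be sketched the same way: if any parameter's mean fails the halfwidth test, the sample count is topped up by $10000-\Delta [\text {nmc}(\text {RL})]$ whenever that quantity is nonnegative. A hypothetical illustration:

```python
def halfwidth_update(nmc, delta, halfwidth_ok):
    """Increase nmc when the Heidelberger-Welch halfwidth test fails.

    delta is the Raftery-Lewis shortfall Delta[nmc(RL)] computed earlier.
    """
    if halfwidth_ok:
        return nmc
    extra = 10000 - delta
    return nmc + extra if extra >= 0 else nmc

print(halfwidth_update(11000, 5000, halfwidth_ok=False))  # -> 16000
```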

Standard Distributions

Table 11.4 through Table 11.9 show all the distribution density functions that PROC COUNTREG recognizes. You specify these distribution densities in the PRIOR statement.

Table 11.4: Beta Distribution

PRIOR statement

BETA(SHAPE1=a, SHAPE2=b, MIN=m, MAX=M)


Note: Commonly $m=0$ and $M=1$.

Density

$\frac{(\theta -m)^{a-1} (M-\theta )^{b-1}}{B(a,b)(M-m)^{a+b-1}}$

Parameter restriction

$a>0$, $b>0$, $-\infty <m<M<\infty $

Range

$ \left\{  \begin{array}{ll} \left[ m, M \right] &  \mbox{when } a = 1, b = 1 \\ \left[ m, M \right) &  \mbox{when } a = 1, b \neq 1 \\ \left( m, M \right] &  \mbox{when } a \neq 1, b = 1 \\ \left( m, M \right) &  \mbox{otherwise} \end{array} \right. $

Mean

$ \frac{a}{a+b}\times (M-m)+m$

Variance

$ \frac{ab}{(a+b)^2(a+b+1)}\times (M-m)^2$

Mode

$ \left\{  \begin{array}{ll} \frac{a-1}{a+b-2}\times M+\frac{b-1}{a+b-2}\times m &  a > 1, b > 1 \\ m \mbox{ and } M &  a < 1, b < 1 \\ m &  \left\{  \begin{array}{l} a < 1, b \geq 1 \\ a = 1, b > 1 \\ \end{array} \right. \\ M &  \left\{  \begin{array}{l} a \geq 1, b < 1 \\ a > 1, b = 1 \\ \end{array} \right. \\ \mbox{not unique} &  a = b = 1 \end{array} \right. $

Defaults

SHAPE1=SHAPE2=1, MIN$\rightarrow -\infty $, MAX$\rightarrow \infty $
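As a quick numerical sanity check of Table 11.4, the density, mean, and variance formulas can be verified against a crude midpoint-rule integration. The parameter values below are arbitrary illustrations, and the code is not part of PROC COUNTREG.

```python
import math

def scaled_beta_pdf(x, a, b, m, M):
    """Beta density rescaled to [m, M], as given in Table 11.4."""
    B = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return (x - m)**(a - 1) * (M - x)**(b - 1) / (B * (M - m)**(a + b - 1))

def moments(a, b, m, M, n=100000):
    """Midpoint-rule estimates of the mean and variance on [m, M]."""
    h = (M - m) / n
    xs = [m + (i + 0.5) * h for i in range(n)]
    mean = sum(x * scaled_beta_pdf(x, a, b, m, M) for x in xs) * h
    var = sum((x - mean)**2 * scaled_beta_pdf(x, a, b, m, M) for x in xs) * h
    return mean, var

a, b, m, M = 2.0, 3.0, -1.0, 4.0
mean, var = moments(a, b, m, M)
# Closed forms from Table 11.4: mean = a/(a+b)*(M-m)+m, var = ab/((a+b)^2(a+b+1))*(M-m)^2
print(round(mean, 4), round(a / (a + b) * (M - m) + m, 4))                     # -> 1.0 1.0
print(round(var, 4), round(a * b / ((a + b)**2 * (a + b + 1)) * (M - m)**2, 4))  # -> 1.0 1.0
```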


Table 11.5: Gamma Distribution

PRIOR statement

GAMMA(SHAPE=a, SCALE=b)

Density

$\frac{1}{b^ a\Gamma (a)} \theta ^{a-1} e^{-\theta /b} $

Parameter restriction

$ a > 0, b > 0 $

Range

$[0,\infty )$

Mean

$ab$

Variance

$ab^2$

Mode

$(a-1)b$

Defaults

SHAPE=SCALE=1


Table 11.6: Inverse Gamma Distribution

PRIOR statement

IGAMMA(SHAPE=a, SCALE=b)

Density

$ \frac{b^ a}{\Gamma (a)} \theta ^{-(a+1)}e^{-b/\theta } $

Parameter restriction

$ a > 0, b > 0$

Range

$ 0<\theta <\infty $

Mean

$\frac{b}{a-1},\qquad a > 1$

Variance

$\frac{b^2}{(a-1)^2(a-2)},\qquad a>2$

Mode

$ \frac{b}{a+1}$

Defaults

SHAPE=2.000001, SCALE=1
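The inverse gamma moments in Table 11.6 can be checked the same way. The shape and scale values below are hypothetical choices for illustration (with $a=3$, $b=2$, the mean $b/(a-1)$ and variance $b^2/((a-1)^2(a-2))$ both equal 1, and the mode $b/(a+1)$ equals 0.5).

```python
import math

def igamma_pdf(x, a, b):
    """Inverse gamma density from Table 11.6."""
    return b**a / math.gamma(a) * x**(-(a + 1)) * math.exp(-b / x)

a, b = 3.0, 2.0
mean_cf = b / (a - 1)                      # closed-form mean, valid for a > 1
# Numeric check of the mean by midpoint integration on (0, 200]
# (the tail beyond 200 is negligible for these parameters):
n = 200000
h = 200.0 / n
est = sum((i + 0.5) * h * igamma_pdf((i + 0.5) * h, a, b) for i in range(n)) * h
print(round(est, 2), round(mean_cf, 2))    # -> 1.0 1.0
```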


Table 11.7: Normal Distribution

PRIOR statement

NORMAL(MEAN=$\mu $, VAR=$\sigma ^2$)

Density

$ \frac{1}{\sigma \sqrt {2\pi }} \exp \left( - \frac{(\theta - \mu )^2}{2\sigma ^2}\right) $

Parameter restriction

$ \sigma ^2 > 0 $

Range

$ -\infty <\theta <\infty $

Mean

$\mu $

Variance

$\sigma ^2$

Mode

$\mu $

Defaults

MEAN=0, VAR=1000000


Table 11.8: t Distribution

PRIOR statement

T(LOCATION=$\mu $, DF=$\nu $)

Density

$\frac{\Gamma \left(\frac{\nu +1}{2}\right)}{\Gamma \left(\frac{\nu }{2}\right)\sqrt {\pi \nu }}\left[1+\frac{(\theta -\mu )^2}{\nu }\right]^{-\frac{\nu +1}{2}} $

Parameter restriction

$ \nu > 0 $

Range

$ -\infty <\theta <\infty $

Mean

$\mu , \text { for }\nu >1$

Variance

$\frac{\nu }{\nu -2}, \text { for }\nu >2$

Mode

$\mu $

Defaults

LOCATION=0, DF=3
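The $t$ density in Table 11.8 has no scale parameter, so its variance is $\nu /(\nu -2)$ regardless of the location $\mu$ (for $\nu >2$). A midpoint-rule check with hypothetical values $\mu =0.5$, $\nu =5$:

```python
import math

def t_pdf(x, mu, nu):
    """Location-shifted t density from Table 11.8."""
    c = math.gamma((nu + 1) / 2) / (math.gamma(nu / 2) * math.sqrt(math.pi * nu))
    return c * (1 + (x - mu)**2 / nu) ** (-(nu + 1) / 2)

mu, nu = 0.5, 5.0
var_cf = nu / (nu - 2)                     # closed-form variance, 5/3
# Integrate (x - mu)^2 * pdf over [mu - 100, mu + 100]; the tails are negligible:
n = 200000
h = 200.0 / n
est = sum(((i + 0.5) * h - 100.0)**2 * t_pdf(mu - 100.0 + (i + 0.5) * h, mu, nu)
          for i in range(n)) * h
print(round(est, 2), round(var_cf, 2))     # -> 1.67 1.67
```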


Table 11.9: Uniform Distribution

PRIOR statement

UNIFORM(MIN=m, MAX=M)

Density

$ \frac{1}{M-m}$

Parameter restriction

$-\infty <m<M<\infty $

Range

$ \theta \in [m, M]$

Mean

$ \frac{m+M}{2} $

Variance

$\frac{(M-m)^2}{12}$

Mode

Not unique

Defaults

MIN$\rightarrow -\infty $, MAX$\rightarrow \infty $