PROC GENMOD: Bayesian Analysis :: SAS/STAT(R) 9.2 User's Guide, Second Edition

The GENMOD Procedure

Bayesian Analysis

Gibbs Sampling

This section provides details for Bayesian analysis by Gibbs sampling in generalized linear models. See the section Gibbs Sampler for a general discussion of Gibbs sampling. See Gilks, Richardson, and Spiegelhalter (1996) for a discussion of applications of Gibbs sampling to a number of different models, including generalized linear models.

In generalized linear models, the response has a probability distribution from a family of distributions of the exponential form. That is, the probability density of the response $\text{[math]}$ for continuous response variables, or the probability function for discrete responses, can be expressed as

$\text{[math]}$

for some functions $\text{[math]}$, $\text{[math]}$, and $\text{[math]}$ that determine the specific distribution. The canonical parameters $\text{[math]}$ depend only on the means of the response $\text{[math]}$, which are related to the regression parameters $\text{[math]}$ through the link function $\text{[math]}$. The additional parameter $\text{[math]}$ is the dispersion parameter.

The GENMOD procedure estimates the regression parameters and the scale parameter $\text{[math]}$ by maximum likelihood. However, the GENMOD procedure can also provide Bayesian estimates of the regression parameters and either the scale $\text{[math]}$, the dispersion $\text{[math]}$, or the precision $\text{[math]}$ by Gibbs sampling. Except where noted, the following discussion applies to either $\text{[math]}$, $\text{[math]}$, or $\text{[math]}$, although $\text{[math]}$ is used to illustrate the formulas. Note that the Poisson and binomial distributions do not have a dispersion parameter, and the dispersion is considered to be fixed at $\text{[math]}$.

The ASSESS, CONTRAST, ESTIMATE, OUTPUT, and REPEATED statements, if specified, are ignored. Also ignored are the PLOTS= option in the PROC GENMOD statement and the following options in the MODEL statement: ALPHA=, CORRB, COVB, TYPE1, TYPE3, SCALE=DEVIANCE (DSCALE), SCALE=PEARSON (PSCALE), OBSTATS, RESIDUALS, XVARS, PREDICTED, DIAGNOSTICS, and SCALE= for Poisson and binomial distributions. The multinomial and zero-inflated Poisson distributions are not available for Bayesian analysis.

Let $\text{[math]}$ be the parameter vector. For generalized linear models, the $\text{[math]}$s are the regression coefficients $\text{[math]}$s and the dispersion parameter $\text{[math]}$. Let $\text{[math]}$ be the likelihood function, where $\text{[math]}$ is the observed data. Let $\text{[math]}$ be the prior distribution. The full conditional distribution of $\text{[math]}$ is proportional to the joint distribution; that is,

$\text{[math]}$

For instance, the one-dimensional conditional distribution of $\text{[math]}$ given $\text{[math]}$ is computed as

$\text{[math]}$

Suppose you have a set of arbitrary starting values $\text{[math]}$. Using the ARMS (adaptive rejection Metropolis sampling) algorithm of Gilks and Wild (1992) and Gilks, Best, and Tan (1995), you can do the following:

draw $\text{[math]}$ from $\text{[math]}$
draw $\text{[math]}$ from $\text{[math]}$
$\text{[math]}$
draw $\text{[math]}$ from $\text{[math]}$

This completes one iteration of the Gibbs sampler. After one iteration, you have $\text{[math]}$. After $\text{[math]}$ iterations, you have $\text{[math]}$. PROC GENMOD implements the ARMS algorithm provided by Gilks (2003) to draw a sample from a full conditional distribution. See the section Assessing Markov Chain Convergence for information about assessing the convergence of the chain of posterior samples.

You can save these posterior samples in a SAS data set. The following option in the BAYES statement writes the posterior samples to the SAS data set Post:

OUTPOST=Post

The data set also includes the variable LogPost, which contains the log posterior likelihood values.
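As a minimal sketch of the one-parameter-at-a-time scheme described above, the following Python code runs a Gibbs sampler for a toy normal model with unknown mean and precision. It is not PROC GENMOD's implementation: the ARMS step is replaced by direct draws from conjugate full conditionals, and the function name `gibbs_normal`, the prior settings, and the starting values are all assumptions made for illustration.

```python
import random

def gibbs_normal(y, n_iter=2000, seed=1, a=0.001, b=0.001, m0=0.0, p0=1e-6):
    """Toy Gibbs sampler for y_i ~ Normal(mu, 1/tau).

    Assumed priors (illustrative only):
      mu  ~ Normal(m0, 1/p0)            (p0 is a prior precision)
      tau ~ Gamma(shape=a, inverse-scale=b)
    """
    rng = random.Random(seed)
    n = len(y)
    sum_y = sum(y)
    mu, tau = 0.0, 1.0          # arbitrary starting values
    draws = []
    for _ in range(n_iter):
        # Step 1: draw mu from its full conditional given the current tau.
        prec = n * tau + p0
        mean = (tau * sum_y + p0 * m0) / prec
        mu = rng.gauss(mean, prec ** -0.5)
        # Step 2: draw tau from its full conditional given the new mu.
        sse = sum((yi - mu) ** 2 for yi in y)
        # random.gammavariate takes (shape, scale), so invert the inverse-scale.
        tau = rng.gammavariate(a + n / 2.0, 1.0 / (b + sse / 2.0))
        draws.append((mu, tau))  # one completed iteration
    return draws
```

Each pass through the loop body corresponds to one iteration: every parameter is redrawn once, conditional on the latest values of the others.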

Priors for Model Parameters

The model parameters are the regression coefficients and the dispersion parameter (or the precision or scale), if the model has one. The prior for the dispersion parameter and the prior for the regression coefficients are assumed to be independent. The regression coefficients themselves can have a joint multivariate normal prior.

Dispersion, Precision, or Scale Parameter

Gamma Prior

The gamma distribution $\text{[math]}$ has a PDF

$\text{[math]}$

where $\text{[math]}$ is the shape parameter and $\text{[math]}$ is the inverse-scale parameter. The mean is $\text{[math]}$ and the variance is $\text{[math]}$.
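For reference, the shape and inverse-scale parameterization can be sketched in Python as follows. The density, mean, and variance used here are the standard ones for a gamma distribution in this parameterization; the function and argument names (`gamma_pdf`, `gamma_mean_var`, `shape`, `inv_scale`) are illustrative and do not come from PROC GENMOD.

```python
import math

def gamma_pdf(t, shape, inv_scale):
    """Gamma density with shape and inverse-scale (rate) parameters."""
    if t <= 0:
        return 0.0
    return (inv_scale ** shape) * t ** (shape - 1) \
        * math.exp(-inv_scale * t) / math.gamma(shape)

def gamma_mean_var(shape, inv_scale):
    """Mean is shape / inv_scale; variance is shape / inv_scale**2."""
    return shape / inv_scale, shape / inv_scale ** 2
```

A small shape and inverse-scale (for example, both 0.001) give a mean of 1 with a very large variance, which is why such settings are often used as vague priors.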

Improper Prior

The joint prior density is given by

$\text{[math]}$

Inverse Gamma Prior

The inverse gamma distribution $\text{[math]}$ has a PDF

$\text{[math]}$

where $\text{[math]}$ is the shape parameter and $\text{[math]}$ is the scale parameter. The mean is $\text{[math]}$ if $\text{[math]}$, and the variance is $\text{[math]}$ if $\text{[math]}$.
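The conditions on the mean and variance can be made concrete with a short Python sketch. It uses the standard inverse gamma density with shape and scale parameters, for which the mean exists only when the shape exceeds 1 and the variance only when the shape exceeds 2. The function names are illustrative assumptions.

```python
import math

def igamma_pdf(t, shape, scale):
    """Inverse gamma density with shape and scale parameters."""
    if t <= 0:
        return 0.0
    return (scale ** shape) * t ** (-(shape + 1)) \
        * math.exp(-scale / t) / math.gamma(shape)

def igamma_mean(shape, scale):
    """Mean scale / (shape - 1); undefined (None) unless shape > 1."""
    return scale / (shape - 1) if shape > 1 else None

def igamma_var(shape, scale):
    """Variance scale**2 / ((shape - 1)**2 * (shape - 2)); None unless shape > 2."""
    if shape <= 2:
        return None
    return scale ** 2 / ((shape - 1) ** 2 * (shape - 2))
```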

Regression Coefficients

Let $\text{[math]}$ be the regression coefficients.

Jeffreys’ Prior

The joint prior density is given by

$\text{[math]}$

where $\text{[math]}$ is the Fisher information matrix for the model. If the underlying model has a scale parameter (for example, a normal linear regression model), then the Fisher information matrix is computed with the scale parameter set to a fixed value of one.

If you specify the CONDITIONAL option, then Jeffreys’ prior, conditional on the current Markov chain value of the generalized linear model precision parameter $\text{[math]}$, is given by

$\text{[math]}$

where $\text{[math]}$ is the model precision parameter.

See Ibrahim and Laud (1991) for a full discussion, with examples, of Jeffreys’ prior for generalized linear models.
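To illustrate the unconditional form, recall that Jeffreys’ prior is proportional to the square root of the determinant of the Fisher information. The Python sketch below evaluates the log of that unnormalized prior for a deliberately small case: a Poisson regression with log link and a single coefficient, where the information reduces to a scalar. The model choice and the function name `poisson_log_jeffreys` are assumptions for illustration, not part of PROC GENMOD.

```python
import math

def poisson_log_jeffreys(beta, x):
    """Log of the unnormalized Jeffreys prior for a one-coefficient
    Poisson regression with log link, mu_i = exp(x_i * beta)."""
    # Fisher information for the single coefficient: sum of x_i^2 * mu_i.
    info = sum(xi * xi * math.exp(xi * beta) for xi in x)
    # Jeffreys' prior is proportional to sqrt(det(info)); det is the scalar itself.
    return 0.5 * math.log(info)
```

Because the information depends on the coefficient through the fitted means, this prior is not flat in the coefficient, unlike the uniform prior.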

Normal Prior

Assume $\text{[math]}$ has a multivariate normal prior with mean vector $\text{[math]}$ and covariance matrix $\text{[math]}$. The joint prior density is given by

$\text{[math]}$

If you specify the CONDITIONAL option, then, conditional on the current Markov chain value of the generalized linear model precision parameter $\text{[math]}$, the joint prior density is given by

$\text{[math]}$

Uniform Prior

The joint prior density is given by

$\text{[math]}$

Deviance Information Criterion

Let $\text{[math]}$ be the model parameters at iteration $\text{[math]}$ of the Gibbs sampler and let LL($\text{[math]}$) be the corresponding model log likelihood. PROC GENMOD computes the following fit statistics defined by Spiegelhalter et al. (2002):

Effective number of parameters:

$\text{[math]}$

Deviance information criterion (DIC):

$\text{[math]}$

where

$\text{[math]}$

PROC GENMOD uses the full log likelihoods defined in the section Log-Likelihood Functions, with all terms included, for computing the DIC.
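The following Python sketch shows how these two statistics can be computed from a set of sampled log-likelihood values, following the definitions of Spiegelhalter et al. (2002). It assumes the common convention that the deviance is minus twice the log likelihood, with no additional standardizing term; the function name `dic_summary` and its arguments are illustrative.

```python
def dic_summary(loglik_draws, loglik_at_posterior_mean):
    """Effective number of parameters (pD) and DIC.

    loglik_draws: log likelihood evaluated at each posterior draw.
    loglik_at_posterior_mean: log likelihood evaluated at the
        posterior mean of the parameters.
    Assumes deviance D(theta) = -2 * LL(theta).
    """
    deviances = [-2.0 * ll for ll in loglik_draws]
    d_bar = sum(deviances) / len(deviances)       # posterior mean deviance
    d_at_mean = -2.0 * loglik_at_posterior_mean   # deviance at posterior mean
    p_d = d_bar - d_at_mean
    return {"pD": p_d, "DIC": d_bar + p_d}
```

For example, `dic_summary([-10.0, -12.0, -11.0], -10.5)` gives a mean deviance of 22 and a deviance at the posterior mean of 21, so pD is 1 and the DIC is 23. Smaller DIC values indicate a better trade-off between fit and complexity.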

Posterior Distribution

Denote the observed data by $\text{[math]}$ .

The posterior distribution is

$\text{[math]}$

where $\text{[math]}$ is the likelihood function with regression coefficients $\text{[math]}$ as parameters.

Starting Values of the Markov Chains

When the BAYES statement is specified, PROC GENMOD generates one Markov chain containing the approximate posterior samples of the model parameters. Additional chains are produced when the Gelman-Rubin diagnostics are requested. Starting values (or initial values) can be specified in the INITIAL= data set in the BAYES statement. If the INITIAL= option is not specified, PROC GENMOD picks its own initial values for the chains.

Let $\text{[math]}$ denote the integer part of $x$. Let $\text{[math]}$ denote the estimated standard error of the estimator $\text{[math]}$.

Regression Coefficients

For the first chain, on which the summary statistics and regression diagnostics are based, the default initial values are estimates of the mode of the posterior distribution. If the INITIALMLE option is specified, the initial values are the maximum likelihood estimates; that is,

$\text{[math]}$

Initial values for the $\text{[math]}$th chain ($\text{[math]}$) are given by

$\text{[math]}$

with the plus sign for odd $\text{[math]}$ and the minus sign for even $\text{[math]}$.

Dispersion, Scale, or Precision Parameter $\text{[math]}$

Let $\text{[math]}$ be the generalized linear model parameter you choose to sample, either the dispersion, scale, or precision parameter. Note that the Poisson and binomial distributions do not have this additional parameter.

$\text{[math]}$

The initial values of the $\text{[math]}$th chain ($\text{[math]}$) are given by

$\text{[math]}$

with the plus sign for odd $\text{[math]}$ and the minus sign for even $\text{[math]}$.

OUTPOST= Output Data Set

The OUTPOST= data set contains the generated posterior samples. There are 2 + $\text{[math]}$ variables, where $\text{[math]}$ is the number of model parameters. The variable Iteration represents the iteration number and the variable LogPost contains the log posterior likelihood values. The other $\text{[math]}$ variables represent the draws of the Markov chain for the model parameters.
