This example illustrates how you can use PROC MCMC to fit random-effects models. In the example Random-Effects Model in Getting Started: MCMC Procedure, you already saw PROC MCMC fit a linear random-effects model. This example shows how to fit a logistic random-effects model in PROC MCMC. Although you can use PROC MCMC to analyze random-effects models, you might want to first consider some other SAS procedures. For example, you can use PROC MIXED (see Chapter 58, The MIXED Procedure ) to analyze linear mixed effects models, PROC NLMIXED (see Chapter 63, The NLMIXED Procedure ) for nonlinear mixed effects models, and PROC GLIMMIX (see Chapter 40, The GLIMMIX Procedure ) for generalized linear mixed effects models. In addition, a sampling-based Bayesian analysis is available in the MIXED procedure through the PRIOR statement (see PRIOR Statement).
The data are taken from Crowder (1978). The Seeds data set is a factorial layout, with two types of seeds, O. aegyptiaca 75 and O. aegyptiaca 73, and two root extracts, bean and cucumber. You observe r, which is the number of germinated seeds, and n, which is the total number of seeds. The independent variables are seed and extract.
The following statements create the data set:
title 'Logistic Regression Random-Effects Model'; data seeds; input r n seed extract @@; ind = _N_; datalines; 10 39 0 0 23 62 0 0 23 81 0 0 26 51 0 0 17 39 0 0 5 6 0 1 53 74 0 1 55 72 0 1 32 51 0 1 46 79 0 1 10 13 0 1 8 16 1 0 10 30 1 0 8 28 1 0 23 45 1 0 0 4 1 0 3 12 1 1 22 41 1 1 15 30 1 1 32 51 1 1 3 7 1 1 ;
You can model each observation as having its own probability of success , and the likelihood is as follows:
You can use the logit link function to link the covariates of each observation, seed and extract, to the probability of success,
where is assumed to be an i.i.d. random effect with a normal prior:
The four regression coefficients and the standard deviation in the random effects are model parameters; they are given noninformative priors as follows:
Another way of expressing the same model is as
where
The two models are equivalent. In the first model, the random effects centers at 0 in the normal distribution, and in the second model, centers at the regression mean. This hierarchical centering can sometimes improve mixing.
The following statements fit the second model and generate Output 54.7.1:
proc mcmc data=seeds outpost=postout seed=332786 nmc=20000; ods select PostSummaries PostIntervals; parms beta0 0 beta1 0 beta2 0 beta3 0 s2 1; prior s2 ~ igamma(0.01, s=0.01); prior beta: ~ general(0); w = beta0 + beta1*seed + beta2*extract + beta3*seed*extract; random delta ~ normal(w, var=s2) subject=ind; pi = logistic(delta); model r ~ binomial(n = n, p = pi); run;
The PROC MCMC statement specifies the input and output data sets, sets a seed for the random number generator, and requests a large simulation size. The ODS SELECT statement displays the summary statistics and interval statistics tables. The PARMS statement declares the model parameters, and the PRIOR statements specify the prior distributions for and .
The symbol w calculates the regression mean, and the RANDOM statement specifies the random effect, with a normal prior distribution, centered at w with variance . Note that the variable w is a function of the input data set variables. You can use data set variable in constructing the hyperparameters of the random-effects parameters, as long as the hyperparameters remain constant within each subject group. The SUBJECT= option indicates the group index for the random-effects parameters.
The symbol pi is the logit transformation. The MODEL specifies the response variable r as a binomial distribution with parameters n and pi.
Output 54.7.1 lists the posterior mean and interval estimates of the regression parameters.
Logistic Regression Random-Effects Model |
Posterior Summaries | ||||||
---|---|---|---|---|---|---|
Parameter | N | Mean | Standard Deviation |
Percentiles | ||
25% | 50% | 75% | ||||
beta0 | 20000 | -0.5488 | 0.2045 | -0.6731 | -0.5552 | -0.4216 |
beta1 | 20000 | 0.0563 | 0.3218 | -0.1487 | 0.0719 | 0.2690 |
beta2 | 20000 | 1.3590 | 0.2977 | 1.1691 | 1.3533 | 1.5325 |
beta3 | 20000 | -0.8214 | 0.4504 | -1.1124 | -0.8106 | -0.5277 |
s2 | 20000 | 0.1171 | 0.0950 | 0.0531 | 0.0933 | 0.1530 |
Posterior Intervals | |||||
---|---|---|---|---|---|
Parameter | Alpha | Equal-Tail Interval | HPD Interval | ||
beta0 | 0.050 | -0.9556 | -0.1350 | -0.9585 | -0.1427 |
beta1 | 0.050 | -0.6177 | 0.6785 | -0.5749 | 0.7146 |
beta2 | 0.050 | 0.7441 | 1.9817 | 0.7563 | 1.9865 |
beta3 | 0.050 | -1.7739 | 0.0413 | -1.7778 | 0.0251 |
s2 | 0.050 | 0.0132 | 0.3645 | 0.00253 | 0.2927 |