Suppose that, in addition to data from a current study, you have similar data from a historical data set or pilot study. A power prior enables you to capture the information collected in the pilot study and incorporate it into analysis and inference in the current study. The historical data are incorporated through an informative prior, which provides a convenient way of updating past knowledge.
A power prior can be expressed as the product of a weighted likelihood of $\theta$ conditional on the historical data and a prior distribution on the parameters before any data are observed. More formally, the power prior distribution of $\theta$ is written as

$$\pi(\theta \mid D_0, a_0) \propto L(\theta \mid D_0)^{a_0}\, \pi_0(\theta)$$

where $D_0$ are the historical data (from the pilot study), $L(\theta \mid D_0)$ is the likelihood of $\theta$ conditional on the historical data, $a_0$ is a discounting or scale parameter with $0 \le a_0 \le 1$ that controls the amount of weight you want to put on the historical data, and $\pi_0(\theta)$ is a prior for $\theta$ before the historical data $D_0$ are observed.
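The posterior distribution of $\theta$ is

$$
\begin{aligned}
\pi(\theta \mid D, D_0, a_0) &\propto L(\theta \mid D)\, \pi(\theta \mid D_0, a_0) \\
&\propto L(\theta \mid D)\, L(\theta \mid D_0)^{a_0}\, \pi_0(\theta) \\
&= \prod_{i=1}^{m} f(y_i \mid \theta) \prod_{j=1}^{m_0} f(y_{0j} \mid \theta)^{a_0}\, \pi_0(\theta)
\end{aligned}
\tag{1}
$$

where $D = \{y_i\}$ for $i = 1, \ldots, m$ represents the data from the current study, $D_0 = \{y_{0j}\}$ for $j = 1, \ldots, m_0$ represents the historical data, and $f(\cdot \mid \theta)$ represents the density function for a single observation in either the historical or the current data set.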
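You can combine the two data sets to form a new data set $D^* = (D, D_0)$ and rewrite the posterior distribution in Equation 1 as the following:

$$\pi(\theta \mid D^*, a_0) \propto \prod_{k} f_k(y^*_k \mid \theta)\, \pi_0(\theta) \tag{2}$$

$$
f_k(y^*_k \mid \theta) =
\begin{cases}
f(y^*_k \mid \theta) & \text{if observation } k \text{ is from the current study} \\
f(y^*_k \mid \theta)^{a_0} & \text{if observation } k \text{ is from the historical study}
\end{cases}
$$

Here $k$ indexes every observation $y^*_k$ in the combined data set $D^*$, and $a_0$ is the discounting parameter of the power prior.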
The posterior distribution in Equation 2 is used to model the hypothetical clinical trial setting described in this example. The following SAS statements create the PTRIALS data set, which represents binomial data collected in a pilot study of a clinical trial.

data ptrials;
input y n;
datalines;
5 163
;
In the INPUT statement, $y$ represents the number of successes out of $n$ independent Bernoulli trials. Similarly, for the current study, suppose you have the following hypothetical clinical trials data set BTRIALS.

data btrials;
input y n @@;
datalines;
2 86
2 69
1 71
1 113
1 103
;
To fit a model with a power prior, first combine the current and pilot study data sets. The indicator variable GROUP takes the values 'current' and 'pilot' for observations from the current and pilot study data sets, respectively. The following SAS statements create the combined data set ALLDATA.

data alldata;
set btrials(in=i) ptrials;
if i then group='current';
else group = 'pilot';
run;
Suppose you want to fit a Bayesian binomial model with a power prior for a hypothetical clinical trial setting with density

$$y_i \sim \text{binomial}(n_i, p) \tag{3}$$

where $p$ is the success probability shared by all observations $i$ in the combined data set.
The likelihood function for each of the observations is

$$f(y_i \mid p) = \binom{n_i}{y_i}\, p^{y_i} (1 - p)^{n_i - y_i} \tag{4}$$

where $f(\cdot \mid \cdot)$ denotes a conditional probability mass function. The binomial density is evaluated at the specified value of $y_i$ and the corresponding success probability $p$. Then, a power prior is used to assign each observation in the combined data set a binomial or weighted binomial likelihood function, depending on whether the observation is from the current or the pilot study, respectively.
Suppose the following power prior is placed on the success probability parameter, where $f(y_0 \mid p)$ represents the binomial likelihood:

$$
\begin{aligned}
\pi(p \mid y_0, n_0, a_0) &\propto f(y_0 \mid p)^{a_0}\, \pi_0(p) \\
f(y_0 \mid p) &= \binom{n_0}{y_0}\, p^{y_0} (1 - p)^{n_0 - y_0} \\
\pi_0(p) &= \text{uniform}(0, 1)
\end{aligned}
\tag{5}
$$

The variables $y_0$ and $n_0$ represent the historical successes and sample size, respectively, and $\pi_0(p)$ indicates a prior distribution. Suppose for this example that the discounting factor is set to $a_0 = 0.5$. The diffuse $\text{uniform}(0, 1)$ prior expresses your lack of knowledge about the success probability.
According to Bayes' theorem, the likelihood function and prior distribution determine the posterior distribution of $p$ as given in Equation 2. PROC MCMC obtains samples from the desired posterior distribution, which is determined by the specified prior and likelihood. It does not require the exact form of the posterior distribution.
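Although PROC MCMC does not need it, this particular posterior happens to have a closed form, because a uniform(0,1) prior is a beta(1,1) distribution and is conjugate to the (weighted) binomial likelihood. The derivation below is a side check rather than part of the original sample; it uses the five BTRIALS rows (7 successes in 442 trials), the PTRIALS row (5 successes in 163 trials), and a discount factor of 0.5, and it drops the binomial coefficients because they do not depend on $p$.

$$
\begin{aligned}
\pi(p \mid D, D_0, a_0) &\propto \prod_{i=1}^{5} p^{y_i} (1 - p)^{n_i - y_i} \times \left[ p^{y_0} (1 - p)^{n_0 - y_0} \right]^{a_0} \times 1 \\
&= p^{\sum_i y_i + a_0 y_0}\, (1 - p)^{\sum_i (n_i - y_i) + a_0 (n_0 - y_0)} \\
&= p^{7 + 0.5 \cdot 5}\, (1 - p)^{435 + 0.5 \cdot 158} = p^{9.5} (1 - p)^{514}
\end{aligned}
$$

This is the kernel of a $\text{beta}(10.5, 515)$ distribution, whose mean is $10.5 / 525.5 \approx 0.0200$ and whose standard deviation is approximately $0.0061$. These values give a reference point for the sampled posterior summaries.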
The following SAS statements use the likelihood and power prior distribution to fit the Bayesian binomial model with the historical data. The PROC MCMC statement invokes the procedure and specifies the input data set. The SEED= option specifies a seed for the random number generator, which guarantees the reproducibility of the random stream. The NMC= option specifies the number of posterior simulation iterations.
ods graphics on;
proc mcmc data=alldata seed=1181 nmc=10000;
parms p 0.2;
begincnst;
a0 = 0.5;
endcnst;
prior p ~ uniform(0,1);
llike = logpdf('binomial',y,p,n);
if (group='pilot') then llike = a0*llike;
model general(llike);
run;
ods graphics off;
The PARMS statement designates $p$ as a parameter in the model and assigns an initial value of 0.2 to it. The programming statements enclosed by the BEGINCNST and ENDCNST statements are executed once per procedure call and designate $a_0 = 0.5$ as the discount factor for the power prior.
The PRIOR statement specifies a $\text{uniform}(0, 1)$ prior on $p$, as given in Equation 5. The LLIKE= assignment statement assigns the log-likelihood function for $y_i$ as given in Equation 3; the LOGPDF function calculates the logarithm of the binomial probability mass function. The IF/THEN statement uses the GROUP indicator variable to weight the binomial likelihood by $a_0$ when the observation is from the pilot study.
The MODEL statement uses the GENERAL function, which indicates that you are using a SAS statement to construct a nonstandard density or distribution. The argument is an expression that takes the value of the logarithm function of the prior or likelihood distribution. In this case, the nonstandard density is the binomial or weighted binomial likelihood, and the argument is the logarithm of the density.
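On the log scale, raising the pilot likelihood to the power $a_0$ becomes a multiplication, which is why the program multiplies LLIKE by the discount factor. The following is a sketch of the quantity that is accumulated over the rows of ALLDATA, written in the notation of this example; the symbol $\ell$ is introduced here only for illustration.

$$
\begin{aligned}
\ell(p) &= \sum_{i \in \text{current}} \log f(y_i \mid p) + a_0 \log f(y_0 \mid p) \\
\log \pi(p \mid D, D_0, a_0) &= \ell(p) + \log \pi_0(p) + \text{constant}
\end{aligned}
$$

Multiplying the full log-density by $a_0$ also scales the log of the binomial coefficient, but that term does not involve $p$, so it is absorbed into the constant and has no effect on the posterior.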
Figure 1 displays convergence diagnostic plots for $p$, which you use to assess whether the Markov chain has converged. Inferences should not be made if the Markov chain has not converged.
The trace plot shows that the mean of the Markov chain is constant over the graph and is stabilized. The chain is able to traverse the support of the target distribution, and the mixing is good. The trace plot shows that the Markov chain appears to have reached a stationary distribution.
The autocorrelation plot indicates low autocorrelation and efficient sampling. The kernel density plot shows a smooth, unimodal posterior marginal distribution for the parameter.
PROC MCMC produces formal diagnostic tests by default, but they are omitted here because informal checks on the chains, autocorrelation, and posterior density plots show desired stabilization and convergence.
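If you want the formal diagnostics in the output, one option is to request them explicitly with the DIAGNOSTICS= option in the PROC MCMC statement. The sketch below is not part of the original sample: the keyword list (Geweke test, effective sample size, Monte Carlo standard errors) is just one possible selection, and the step assumes that the ALLDATA data set created earlier is available.

```sas
/* Same model as in the example, with formal diagnostics requested.   */
/* The DIAGNOSTICS= keyword list is an illustrative choice.           */
proc mcmc data=alldata seed=1181 nmc=10000
          diagnostics=(geweke ess mcse);
   parms p 0.2;
   begincnst;
      a0 = 0.5;                          /* discount factor */
   endcnst;
   prior p ~ uniform(0,1);
   llike = logpdf('binomial', y, p, n);
   if (group = 'pilot') then llike = a0*llike;
   model general(llike);
run;
```

Specifying DIAGNOSTICS=ALL instead requests every available test, which can be useful when the informal plots are less clear-cut than they are here.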
Figure 2 reports summary and interval statistics for the parameter's posterior distribution.

Posterior Summaries

Parameter | N | Mean | Standard Deviation | 25% | 50% | 75% |
---|---|---|---|---|---|---|
p | 10000 | 0.0200 | 0.00602 | 0.0156 | 0.0194 | 0.0237 |

Posterior Intervals

Parameter | Alpha | Equal-Tail Lower | Equal-Tail Upper | HPD Lower | HPD Upper |
---|---|---|---|---|---|
p | 0.050 | 0.0100 | 0.0333 | 0.00900 | 0.0316 |
Using the power prior, the posterior mean of the success probability is 2.00% with a 95% equal-tail credible interval of (1.00%, 3.33%), providing evidence of a treatment effect.
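Because the uniform(0,1) prior is a beta(1,1) distribution, the power-prior posterior in this example is itself a beta distribution: its parameters are 1 plus the weighted counts of successes and failures, with pilot rows weighted by the discount factor. The DATA step below is a sketch of that closed-form check and is not part of the original sample; it assumes the ALLDATA data set from this example, and the names A, B, W, MEAN, LOWER, and UPPER are introduced only for illustration.

```sas
/* Closed-form check of the sampled posterior (illustrative sketch). */
data _null_;
   set alldata end=last;
   retain a 1 b 1;                     /* beta(1,1) = uniform(0,1) prior   */
   a0 = 0.5;                           /* same discount factor as the model */
   w  = ifn(group = 'pilot', a0, 1);   /* pilot rows get weight a0          */
   a + w*y;                            /* add weighted successes            */
   b + w*(n - y);                      /* add weighted failures             */
   if last then do;
      mean  = a / (a + b);
      lower = quantile('beta', 0.025, a, b);   /* equal-tail 95% limits */
      upper = quantile('beta', 0.975, a, b);
      put a= b= mean= lower= upper=;
   end;
run;
```

For these data the step should write a=10.5 and b=515 to the log, with a mean of about 0.0200; the two quantiles can be compared with the equal-tail interval in the Posterior Intervals table.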
Modeling the clinical trial data set in a Bayesian binomial model with a power prior is a good way to make use of the pilot study. It enables you to use a different class of informative priors and incorporates relevant historical data and information.
Ibrahim, J. G. and Chen, M.-H. (2000), “Power Prior Distributions for Regression Models,” Statistical Science , 15, 46–60.
These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.
data ptrials;
input y n;
datalines;
5 163
;
data btrials;
input y n @@;
datalines;
2 86
2 69
1 71
1 113
1 103
;
data alldata;
set btrials(in=i) ptrials;
if i then group='current';
else group = 'pilot';
run;
ods graphics on;
proc mcmc data=alldata seed=1181 nmc=10000;
parms p 0.2;
begincnst;
a0 = 0.5;
endcnst;
prior p ~ uniform(0,1);
llike = logpdf('binomial',y,p,n);
if (group='pilot') then llike = a0*llike;
model general(llike);
run;
ods graphics off;
Type: | Sample |
Topic: | Analytics ==> Bayesian Analysis SAS Reference ==> Procedures ==> MCMC |
Date Modified: | 2016-08-02 13:31:11 |
Date Created: | 2012-02-06 12:26:32 |