Functions


BAYESACT Call

Computes posterior probabilities that observations are contaminated with a larger variance.

Syntax

CALL BAYESACT$(k,s,df,\alpha _1,\ldots ,\alpha _ n,y_1,\ldots ,y_ n, \beta _1,\ldots ,\beta _ n,p_0);$

where

k

is the contamination coefficient, where $k \geq 1$.

s

is an independent estimate of $\sigma $, where $s\geq 0$.

df

is the number of degrees of freedom for s, where $df\geq 0$.

$\alpha _ i$

is the prior probability of contamination for the ith observation in the sample, where $i=1,\ldots ,n$ and n is the number of observations in the sample. Note that $0\leq \alpha _ i\leq 1$.

$y_ i$

is the ith observation in the sample, where $i=1,\ldots ,n$ and n is the number of observations in the sample. When the BAYESACT call is used to perform a Bayes analysis of designs (see "Description" below), the $y_ i$s are estimates for effects.

$\beta _ i$

is the variable that contains the returned posterior probability of contamination for the ith observation in the sample, where $i=1,\ldots ,n$ and n is the number of observations in the sample.

$p_0$

is the variable that contains the posterior probability that the sample is uncontaminated.

Description

The BAYESACT call computes posterior probabilities ($\beta _ i$) that observations in a sample are contaminated with a larger variance than other observations and computes the posterior probability ($p_0$) that the entire sample is uncontaminated.

Specifically, the BAYESACT call assumes a normal random sample of n independent observations with a mean of 0 (a centered sample), in which some of the observations may have a larger variance than others:

\[ \mbox{Var}(y_ i) = \left\{ \begin{array}{ll} \sigma ^2 & \mbox{with probability } 1-\alpha _ i \\ k^2\sigma ^2 & \mbox{with probability } \alpha _ i \end{array} \right. \]

where $i=1,\ldots ,n$. The parameter k is called the contamination coefficient. The value of $\alpha _ i$ is the prior probability of contamination for the ith observation. Based on the prior probability of contamination for each observation, the call gives the posterior probability of contamination for each observation and the posterior probability that the entire sample is uncontaminated.

Box and Meyer (1986) suggest computing posterior probabilities of contamination for the analysis of saturated orthogonal factorial designs. Although these designs give uncorrelated estimates for effects, the significance of effects cannot be tested in an analysis of variance because there are no degrees of freedom for error. Box and Meyer suggest computing posterior probabilities of contamination for the effect estimates. The prior probabilities ($\alpha _ i$) give the likelihood that an effect will be significant, and the contamination coefficient (k) gives a measure of how large the significant effect will be. Box and Meyer recommend using $\alpha _ i=0.2$ and k = 10, implying that about 1 in 5 effects will be about 10 times larger than the remaining effects. To adequately explore posterior probabilities, examine them over a range of values for prior probabilities and a range of contamination coefficients.
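
The following sketch shows one way such an exploration might be set up. It loops over hypothetical grids of contamination coefficients and prior probabilities (the particular values of k and a below are arbitrary, as is the data set name EXPLORE) and calls BAYESACT once per grid point, using the seven effect estimates from the Examples section:

data explore;
   /* Loop over arbitrary grids of the contamination coefficient (k) */
   /* and a common prior probability of contamination (a).           */
   do k = 5, 10, 15;
      do a = 0.1, 0.2, 0.3;
         call bayesact(k, 0, 0,
                a,     a,     a,     a,     a,       a,     a,
           -5.4375,1.3875,8.2875,0.2625,1.7125,-11.4125,1.5875,
             post1, post2, post3, post4, post5,   post6, post7,
           postnone);
         output;
      end;
   end;
run;

proc print data=explore;
   var k a post1-post7 postnone;
run;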

If an independent estimate of $\sigma $ is unavailable (as is the case when the $y_ i$s are effects from a saturated orthogonal design), use 0 for s and df in the BAYESACT call. Otherwise, the call assumes s is proportional to the square root of a $\chi ^2$ random variable with df degrees of freedom. For example, if the $y_ i$s are estimated effects from an orthogonal design that is not saturated, then use the BAYESACT call with s equal to the estimated standard error of the estimates and df equal to the degrees of freedom for error.
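
As a sketch of the second case, the statements below show only where s and df enter the argument list; the standard error of 1.5 and the 8 error degrees of freedom are hypothetical, and the effect estimates from the Examples section are reused purely for concreteness:

data;
   retain post1-post7 postnone;
   /* s = 1.5 is a hypothetical standard error of the effect estimates, */
   /* and df = 8 is a hypothetical number of error degrees of freedom.  */
   call bayesact(10, 1.5, 8,
          0.2,   0.2,   0.2,   0.2,   0.2,     0.2,   0.2,
      -5.4375,1.3875,8.2875,0.2625,1.7125,-11.4125,1.5875,
        post1, post2, post3, post4, post5,   post6, post7,
      postnone);
run;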

From Bayes’ theorem, the posterior probability that $y_ i$ is contaminated is

\[ \beta _ i(\sigma )=\frac{\alpha _ if(y_ i;0,k^2\sigma ^2) }{\alpha _ if(y_ i;0,k^2\sigma ^2) + (1-\alpha _ i)f(y_ i;0,\sigma ^2) } \]

for a given value of $\sigma $, where $f(x;\mu ,\sigma ^2)$ is the density of a normal distribution with mean $\mu $ and variance $\sigma ^2$.

Similarly, for a given value of $\sigma $, the probability that the sample is uncontaminated is

\[ p=\prod _{i=1}^{n}(1-\beta _ i(\sigma )) \]

Posterior probabilities that do not depend on $\sigma $ are derived by integrating $\beta _ i(\sigma )$ and p over a noninformative prior for $\sigma $. If an independent estimate of $\sigma $ is available (that is, when df > 0), it is incorporated into this integration. Refer to Box and Meyer (1986) for details.
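
As a check on the conditional formulas above, the following sketch evaluates $\beta _ i(\sigma )$ and the product $\prod _{i}(1-\beta _ i(\sigma ))$ for a single fixed $\sigma $ by using the PDF function; the values $k=10$, $\sigma =2$, and $\alpha _ i=0.2$ are arbitrary, and because no integration over $\sigma $ is performed, the results are conditional on $\sigma $ and do not reproduce the posterior probabilities that BAYESACT returns:

data betas;
   k = 10; sigma = 2; alpha = 0.2;            /* arbitrary illustrative values */
   array y{7} _temporary_
      (-5.4375 1.3875 8.2875 0.2625 1.7125 -11.4125 1.5875);
   p = 1;                                     /* running product of (1 - beta) */
   do i = 1 to 7;
      f0   = pdf('normal', y{i}, 0, sigma);   /* f(y;0,sigma^2): mean 0, std dev sigma       */
      f1   = pdf('normal', y{i}, 0, k*sigma); /* f(y;0,k^2 sigma^2): mean 0, std dev k*sigma */
      beta = alpha*f1 / (alpha*f1 + (1-alpha)*f0);
      p    = p*(1 - beta);
      output;
   end;
run;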

Examples

The statements

data;
   retain post1-post7 postnone;
   /* Arguments: k=10, s=0, df=0 (no independent estimate of sigma), */
   /* then the seven prior probabilities, the seven effect estimates, */
   /* and the variables that receive the results.                     */
   call bayesact(10,0,0,
          0.2,   0.2,   0.2,   0.2,   0.2,     0.2,   0.2,
      -5.4375,1.3875,8.2875,0.2625,1.7125,-11.4125,1.5875,
        post1, post2, post3, post4, post5,   post6, post7,
      postnone);
run;

return the following posterior probabilities:

POST1      0.42108
POST2      0.037412
POST3      0.53438
POST4      0.024679
POST5      0.050294
POST6      0.64329
POST7      0.044408
POSTNONE   0.28621

The probability that the sample is uncontaminated is 0.28621. A situation where this BAYESACT call would be appropriate is a saturated $2^{7-4}$ fractional factorial design in 8 runs, where the estimates for the seven main effects are those given in the call above (-5.4375, 1.3875, ..., 1.5875).