Analyses in the TWOSAMPLEFREQ Statement :: SAS/STAT(R) 12.1 User's Guide

Overview of the $2 \times 2$ Table

Notation:

		Outcome
		Failure	Success
Group	1	$n_1 - x_1$	$x_1$	$n_1$
	2	$n_2 - x_2$	$x_2$	$n_2$
		$N - m$	$m$	$N$

$\displaystyle x_1$	$\displaystyle = \mbox{\# successes in group 1}$
$\displaystyle x_2$	$\displaystyle = \mbox{\# successes in group 2}$
$\displaystyle m$	$\displaystyle = x_1 + x_2 = \mbox{ total \# successes}$
$\displaystyle \hat{p_1}$	$\displaystyle = \frac{x_1}{n_1}$
$\displaystyle \hat{p_2}$	$\displaystyle = \frac{x_2}{n_2}$
$\displaystyle \hat{p}$	$\displaystyle = \frac{m}{N} = w_1 \hat{p_1} + w_2 \hat{p_2}$

The hypotheses are

$\displaystyle H_0\colon$	$\displaystyle p_2 - p_1 = p_0$
$\displaystyle H_1\colon$	$\displaystyle \left\{ \begin{array}{ll} p_2 - p_1 \ne p_0, & \mbox{two-sided} \\ p_2 - p_1 > p_0, & \mbox{upper one-sided} \\ p_2 - p_1 < p_0, & \mbox{lower one-sided} \\ \end{array} \right.$

where $p_0$ is constrained to be 0 for all but the unconditional Pearson chi-square test.

Internal calculations are performed in terms of $p_1$, $p_2$, and $p_0$. An input set consisting of OR, $p_1$, and $\mr {OR}_0$ is transformed as follows:

$\displaystyle p_2$	$\displaystyle = \frac{(\mr {OR})p_1}{1-p_1+(\mr {OR})p_1}$
$\displaystyle p_{10}$	$\displaystyle = p_1$
$\displaystyle p_{20}$	$\displaystyle = \frac{\mr {OR}_0 p_{10}}{1 - p_{10} + (\mr {OR}_0)p_{10}}$
$\displaystyle p_0$	$\displaystyle = p_{20} - p_{10}$

An input set consisting of RR, $p_1$, and $\mr {RR}_0$ is transformed as follows:

$\displaystyle p_2$	$\displaystyle = (\mr {RR})p_1$
$\displaystyle p_{10}$	$\displaystyle = p_1$
$\displaystyle p_{20}$	$\displaystyle = (\mr {RR}_0)p_{10}$
$\displaystyle p_0$	$\displaystyle = p_{20} - p_{10}$

Note that the transformation of either $\mr {OR}_0$ or $\mr {RR}_0$ to $p_0$ is not unique. The chosen parameterization fixes the null value $p_{10}$ at the input value of $p_1$.
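The two transformations can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure's implementation: the function names and the default null values of 1 are assumptions made for this example.

```python
def from_odds_ratio(odds_ratio, p1, null_odds_ratio=1.0):
    """Map (OR, p1, OR0) to (p1, p2, p0), fixing the null value p10 at p1."""
    p2 = odds_ratio * p1 / (1.0 - p1 + odds_ratio * p1)
    p10 = p1
    p20 = null_odds_ratio * p10 / (1.0 - p10 + null_odds_ratio * p10)
    return p1, p2, p20 - p10


def from_relative_risk(relative_risk, p1, null_relative_risk=1.0):
    """Map (RR, p1, RR0) to (p1, p2, p0), fixing the null value p10 at p1."""
    p2 = relative_risk * p1
    if not 0.0 <= p2 <= 1.0:
        # Guard added for this sketch: RR * p1 must still be a probability.
        raise ValueError("relative_risk * p1 must lie in [0, 1]")
    p10 = p1
    p20 = null_relative_risk * p10
    return p1, p2, p20 - p10
```

For example, an odds ratio of 2 with $p_1 = 0.25$ corresponds to $p_2 = 0.4$, and a null odds ratio of 1 corresponds to $p_0 = 0$.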

Pearson Chi-Square Test for Two Proportions (TEST=PCHI)

The usual Pearson chi-square test is unconditional. The test statistic

$z_ P = \frac{\hat{p_2} - \hat{p_1} - p_0}{\left[ \hat{p}(1-\hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right) \right]^\frac {1}{2}} \, = \, \left[ N w_1 w_2 \right]^\frac {1}{2} \frac{\hat{p_2} - \hat{p_1} - p_0}{\left[ \hat{p}(1-\hat{p}) \right]^\frac {1}{2}}$

is assumed to have a null distribution of $N(0,1)$.
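As a rough illustration, the statistic can be computed from the four counts in two ways: directly from the pooled standard error, and through the equivalent form that uses the group weights. The sketch below assumes $w_i = n_i / N$, which is what the identity $\hat{p} = m/N = w_1 \hat{p_1} + w_2 \hat{p_2}$ implies. The function name is hypothetical.

```python
import math


def pearson_z(x1, n1, x2, n2, p0=0.0):
    """Return z_P computed two ways; the two values should agree."""
    n_total = n1 + n2
    w1, w2 = n1 / n_total, n2 / n_total   # assumed group weights n_i / N
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_hat = (x1 + x2) / n_total           # pooled proportion m / N
    diff = p2_hat - p1_hat - p0

    # Form 1: difference divided by the pooled standard error.
    z_direct = diff / math.sqrt(p_hat * (1.0 - p_hat) * (1.0 / n1 + 1.0 / n2))
    # Form 2: uses 1/n1 + 1/n2 = 1 / (N * w1 * w2).
    z_weighted = math.sqrt(n_total * w1 * w2) * diff / math.sqrt(p_hat * (1.0 - p_hat))
    return z_direct, z_weighted
```

With 10 successes out of 50 in group 1 and 20 out of 50 in group 2, both forms give a value of about 2.18.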

Sample size for the one-sided cases is given by equation (4) in Fleiss, Tytun, and Ury (1980). One-sided power is computed as suggested by Diegert and Diegert (1981) by inverting the sample size formula. Power for the two-sided case is computed by adding the lower-sided and upper-sided powers, each evaluated at $\alpha /2$, and sample size for the two-sided case is obtained by numerically inverting the power formula. A custom null value $p_0$ for the proportion difference is also supported.

$\mr {power} = \left\{ \begin{array}{ll} \Phi \left( \frac{(p_2 - p_1 - p_0) (N w_1 w_2)^\frac {1}{2} - z_{1-\alpha } \left[ (w_1 p_1 + w_2 p_2) (1 - w_1 p_1 - w_2 p_2) \right]^\frac {1}{2}}{\left[ w_2 p_1 (1 - p_1) + w_1 p_2 (1 - p_2) \right]^\frac {1}{2}} \right), & \mbox{upper one-sided} \\ \Phi \left( \frac{-(p_2 - p_1 - p_0) (N w_1 w_2)^\frac {1}{2} - z_{1-\alpha } \left[ (w_1 p_1 + w_2 p_2) (1 - w_1 p_1 - w_2 p_2) \right]^\frac {1}{2}}{\left[ w_2 p_1 (1 - p_1) + w_1 p_2 (1 - p_2) \right]^\frac {1}{2}} \right), & \mbox{lower one-sided} \\ \Phi \left( \frac{(p_2 - p_1 - p_0) (N w_1 w_2)^\frac {1}{2} - z_{1-\frac{\alpha }{2}} \left[ (w_1 p_1 + w_2 p_2) (1 - w_1 p_1 - w_2 p_2) \right]^\frac {1}{2}}{\left[ w_2 p_1 (1 - p_1) + w_1 p_2 (1 - p_2) \right]^\frac {1}{2}} \right) + \\ \quad \Phi \left( \frac{-(p_2 - p_1 - p_0) (N w_1 w_2)^\frac {1}{2} - z_{1-\frac{\alpha }{2}} \left[ (w_1 p_1 + w_2 p_2) (1 - w_1 p_1 - w_2 p_2) \right]^\frac {1}{2}}{\left[ w_2 p_1 (1 - p_1) + w_1 p_2 (1 - p_2) \right]^\frac {1}{2}} \right), & \mbox{two-sided} \\ \end{array} \right.$
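The power formula above translates fairly directly into code. The following Python sketch uses only the standard library. The function name, the `sides` labels, and the default arguments are assumptions made for illustration, not PROC POWER syntax.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: cdf is Phi, inv_cdf gives quantiles


def pearson_power(p1, p2, n_total, w1=0.5, alpha=0.05, sides="two", p0=0.0):
    """Approximate power of the Pearson chi-square test for two proportions."""
    w2 = 1.0 - w1
    p_bar = w1 * p1 + w2 * p2
    null_sd = math.sqrt(p_bar * (1.0 - p_bar))
    alt_sd = math.sqrt(w2 * p1 * (1.0 - p1) + w1 * p2 * (1.0 - p2))
    shift = (p2 - p1 - p0) * math.sqrt(n_total * w1 * w2)

    def tail(sign, level):
        # One-sided power at the given level; sign=+1 is upper, -1 is lower.
        z_crit = _STD.inv_cdf(1.0 - level)
        return _STD.cdf((sign * shift - z_crit * null_sd) / alt_sd)

    if sides == "upper":
        return tail(+1, alpha)
    if sides == "lower":
        return tail(-1, alpha)
    if sides == "two":
        return tail(+1, alpha / 2.0) + tail(-1, alpha / 2.0)
    raise ValueError("sides must be 'upper', 'lower', or 'two'")
```

A quick sanity check on the formula: when $p_2 - p_1 = p_0 = 0$, the two bracketed square-root terms coincide and the computed power reduces to $\alpha$.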

For the one-sided cases, a closed-form inversion of the power equation yields an approximate total sample size

$N = \frac{ \left[ z_{1-\alpha } \left\{ (w_1 p_1 + w_2 p_2) (1 - w_1 p_1 - w_2 p_2) \right\} ^\frac {1}{2} + z_{\mr {power}} \left\{ w_2 p_1 (1 - p_1) + w_1 p_2 (1 - p_2) \right\} ^\frac {1}{2} \right]^2 }{ w_1 w_2 (p_2 - p_1 - p_0)^2 }$

For the two-sided case, the solution for N is obtained by numerically inverting the power equation.
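Both sample size computations can be sketched as follows. The closed form is used for the one-sided cases, and a simple bracketing and bisection search stands in for the numerical inversion in the two-sided case. The search method is an assumption of this sketch (the text does not say which numerical method is used), and all function names are hypothetical. The result is returned as a real number, without rounding to whole subjects.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def _sds(p1, p2, w1):
    """Return (null_sd, alt_sd), the two bracketed square-root terms."""
    w2 = 1.0 - w1
    p_bar = w1 * p1 + w2 * p2
    null_sd = math.sqrt(p_bar * (1.0 - p_bar))
    alt_sd = math.sqrt(w2 * p1 * (1.0 - p1) + w1 * p2 * (1.0 - p2))
    return null_sd, alt_sd


def one_sided_power(p1, p2, n_total, w1, level, p0=0.0, sign=1):
    """One-sided Pearson power; sign=+1 is upper, sign=-1 is lower."""
    null_sd, alt_sd = _sds(p1, p2, w1)
    shift = sign * (p2 - p1 - p0) * math.sqrt(n_total * w1 * (1.0 - w1))
    return _STD.cdf((shift - _STD.inv_cdf(1.0 - level) * null_sd) / alt_sd)


def n_one_sided(p1, p2, power, w1=0.5, alpha=0.05, p0=0.0):
    """Closed-form total sample size for a one-sided test."""
    null_sd, alt_sd = _sds(p1, p2, w1)
    root = _STD.inv_cdf(1.0 - alpha) * null_sd + _STD.inv_cdf(power) * alt_sd
    return root ** 2 / (w1 * (1.0 - w1) * (p2 - p1 - p0) ** 2)


def n_two_sided(p1, p2, power, w1=0.5, alpha=0.05, p0=0.0, tol=1e-8):
    """Total sample size for the two-sided test, found by bisection."""
    if p2 - p1 - p0 == 0.0 or not alpha < power < 1.0:
        raise ValueError("need a nonzero effect and alpha < power < 1")

    def two_sided_power(n_total):
        return sum(one_sided_power(p1, p2, n_total, w1, alpha / 2.0, p0, s)
                   for s in (1, -1))

    lo, hi = 1.0, 2.0
    while two_sided_power(hi) < power:   # grow the bracket until it covers the target
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol * hi:            # shrink the bracket by bisection
        mid = 0.5 * (lo + hi)
        if two_sided_power(mid) < power:
            lo = mid
        else:
            hi = mid
    return hi
```

Substituting the closed-form N back into the matching one-sided power formula recovers the target power, which is a convenient check on an implementation.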

Likelihood Ratio Chi-Square Test for Two Proportions (TEST=LRCHI)

The usual likelihood ratio chi-square test is unconditional. The test statistic

$z_{\mr {LR}} = (-1_{\{ p_2 < p_1\} })\sqrt {2N \sum _{i=1}^2 \left[ w_ i \hat{p_ i} \log \left( \frac{\hat{p_ i}}{\hat{p}} \right) + w_ i (1-\hat{p_ i}) \log \left( \frac{1-\hat{p_ i}}{1-\hat{p}} \right) \right]}$

is assumed to have a null distribution of $N(0,1)$ and an alternative distribution of $N(\delta ,1)$, where

$\delta = N^\frac {1}{2} (-1_{\{ p_2 < p_1\} })\sqrt {2 \sum _{i=1}^2 \left[ w_ i p_ i \log \left( \frac{p_ i}{w_1 p_1 + w_2 p_2} \right) + w_ i (1-p_ i) \log \left( \frac{1-p_ i}{1-(w_1 p_1 + w_2 p_2)} \right) \right]}$

The approximate power is

$\mr {power} = \left\{ \begin{array}{ll} \Phi \left( \delta - z_{1-\alpha } \right), & \mbox{upper one-sided} \\ \Phi \left( - \delta - z_{1-\alpha } \right), & \mbox{lower one-sided} \\ \Phi \left( \delta - z_{1-\frac{\alpha }{2}} \right) + \Phi \left( - \delta - z_{1-\frac{\alpha }{2}} \right), & \mbox{two-sided} \\ \end{array} \right.$

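For the one-sided cases, a closed-form inversion of the power equation yields an approximate total sample size

$N = \left( \frac{z_{\mr {power}} + z_{1-\alpha }}{\delta N^{-\frac {1}{2}}} \right)^2$

where $\delta N^{-\frac {1}{2}}$ is the factor of $\delta$ that does not depend on N.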

For the two-sided case, the solution for N is obtained by numerically inverting the power equation.
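A minimal Python sketch of the likelihood ratio computations is given below. It reads the prefactor $(-1_{\{ p_2 < p_1\} })$ as a sign (negative when $p_2 < p_1$, positive otherwise), which is an interpretation rather than something stated explicitly. It also works with the N-free part of $\delta$, that is, $\delta$ divided by $N^{1/2}$, so that the one-sided sample size has a closed form. Function names and the `sides` labels are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def lr_unit_delta(p1, p2, w1=0.5):
    """Return delta / sqrt(N); requires 0 < p1 < 1 and 0 < p2 < 1."""
    w2 = 1.0 - w1
    p_bar = w1 * p1 + w2 * p2
    total = 0.0
    for w, p in ((w1, p1), (w2, p2)):
        total += w * p * math.log(p / p_bar)
        total += w * (1.0 - p) * math.log((1.0 - p) / (1.0 - p_bar))
    sign = -1.0 if p2 < p1 else 1.0       # assumed reading of the indicator prefactor
    return sign * math.sqrt(2.0 * max(total, 0.0))  # max() guards against rounding below 0


def lr_power(p1, p2, n_total, w1=0.5, alpha=0.05, sides="two"):
    """Approximate power of the likelihood ratio chi-square test."""
    delta = math.sqrt(n_total) * lr_unit_delta(p1, p2, w1)
    if sides == "upper":
        return _STD.cdf(delta - _STD.inv_cdf(1.0 - alpha))
    if sides == "lower":
        return _STD.cdf(-delta - _STD.inv_cdf(1.0 - alpha))
    if sides == "two":
        z_crit = _STD.inv_cdf(1.0 - alpha / 2.0)
        return _STD.cdf(delta - z_crit) + _STD.cdf(-delta - z_crit)
    raise ValueError("sides must be 'upper', 'lower', or 'two'")


def lr_n_one_sided(p1, p2, power, w1=0.5, alpha=0.05):
    """Closed-form total sample size for a one-sided test."""
    z_sum = _STD.inv_cdf(power) + _STD.inv_cdf(1.0 - alpha)
    return (z_sum / lr_unit_delta(p1, p2, w1)) ** 2
```

For the two-sided case, the same kind of one-dimensional root search on `lr_power` as a function of `n_total` would apply.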

Fisher’s Exact Conditional Test for Two Proportions (TEST=FISHER)

Fisher’s exact test is conditional on the observed total number of successes m. Power and sample size computations are based on a test with similar power properties, the continuity-adjusted arcsine test. The test statistic

$\displaystyle z_ A$	$\displaystyle = (4N w_1 w_2)^\frac {1}{2} \left[ \mr {arcsin}\left( \left[ \hat{p_2} + \frac{1}{2N w_2} (1_{\{ \hat{p_2} < \hat{p_1}\} } - 1_{\{ \hat{p_2} > \hat{p_1}\} }) \right]^\frac {1}{2} \right) \right.$
$\displaystyle$	$\displaystyle \quad \left. - \mr {arcsin}\left( \left[ \hat{p_1} + \frac{1}{2N w_1} (1_{\{ \hat{p_1} < \hat{p_2}\} } - 1_{\{ \hat{p_1} > \hat{p_2}\} }) \right]^\frac {1}{2} \right) \right]$

is assumed to have a null distribution of $N(0,1)$ and an alternative distribution of $N(\delta ,1)$, where

$\displaystyle \delta$	$\displaystyle = (4N w_1 w_2)^\frac {1}{2} \left[ \mr {arcsin}\left( \left[ p_2 + \frac{1}{2N w_2} (1_{\{ p_2 < p_1\} } - 1_{\{ p_2 > p_1\} }) \right]^\frac {1}{2} \right) \right.$
$\displaystyle$	$\displaystyle \quad \left. - \mr {arcsin}\left( \left[ p_1 + \frac{1}{2N w_1} (1_{\{ p_1 < p_2\} } - 1_{\{ p_1 > p_2\} }) \right]^\frac {1}{2} \right) \right]$

The approximate power for the one-sided balanced case is given by Walters (1979) and is easily extended to the unbalanced and two-sided cases:

$\mr {power} = \left\{ \begin{array}{ll} \Phi \left( \delta - z_{1-\alpha } \right), & \mbox{upper one-sided} \\ \Phi \left( - \delta - z_{1-\alpha } \right), & \mbox{lower one-sided} \\ \Phi \left( \delta - z_{1-\frac{\alpha }{2}} \right) + \Phi \left( - \delta - z_{1-\frac{\alpha }{2}} \right), & \mbox{two-sided} \\ \end{array} \right.$
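The continuity-adjusted arcsine approximation can be sketched in Python as follows. The indicator terms move each proportion by half an observation toward the other, which shrinks $\delta$ relative to the unadjusted arcsine difference. The function names, the `sides` labels, and the range check on the adjusted proportions are assumptions added for this example.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def fisher_arcsine_delta(p1, p2, n_total, w1=0.5):
    """Noncentrality delta of the continuity-adjusted arcsine test."""
    w2 = 1.0 - w1
    direction = (p2 > p1) - (p2 < p1)     # +1 if p2 > p1, -1 if p2 < p1, 0 if equal
    adj2 = p2 - direction / (2.0 * n_total * w2)
    adj1 = p1 + direction / (2.0 * n_total * w1)
    if not (0.0 <= adj1 <= 1.0 and 0.0 <= adj2 <= 1.0):
        # Guard added for this sketch: very small N can push a value out of range.
        raise ValueError("adjusted proportion outside [0, 1]")
    scale = math.sqrt(4.0 * n_total * w1 * w2)
    return scale * (math.asin(math.sqrt(adj2)) - math.asin(math.sqrt(adj1)))


def fisher_power(p1, p2, n_total, w1=0.5, alpha=0.05, sides="two"):
    """Approximate power of Fisher's exact test via the arcsine test."""
    delta = fisher_arcsine_delta(p1, p2, n_total, w1)
    if sides == "upper":
        return _STD.cdf(delta - _STD.inv_cdf(1.0 - alpha))
    if sides == "lower":
        return _STD.cdf(-delta - _STD.inv_cdf(1.0 - alpha))
    if sides == "two":
        z_crit = _STD.inv_cdf(1.0 - alpha / 2.0)
        return _STD.cdf(delta - z_crit) + _STD.cdf(-delta - z_crit)
    raise ValueError("sides must be 'upper', 'lower', or 'two'")
```

Because $\delta$ depends on N through the continuity adjustment as well as through the leading factor, a sample size for this test would be found by searching over N rather than by a closed form.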
