The FREQ Procedure

Risks and Risk Differences

The RISKDIFF option in the TABLES statement provides estimates of risks (binomial proportions) and risk differences for $2 \times 2$ tables. This analysis might be appropriate when comparing the proportion of some characteristic for two groups, where row 1 and row 2 correspond to the two groups, and the columns correspond to two possible characteristics or outcomes. For example, the row variable might be a treatment or dose, and the column variable might be the response. For more information, see Collett (1991); Fleiss, Levin, and Paik (2003); Stokes, Davis, and Koch (2012).

Let the frequencies of the $2 \times 2$ table be represented as follows.

          Column 1        Column 2        Total
Row 1     $n_{11}$        $n_{12}$        $n_{1 \cdot }$
Row 2     $n_{21}$        $n_{22}$        $n_{2 \cdot }$
Total     $n_{\cdot 1}$   $n_{\cdot 2}$   $n$

By default, when you specify the RISKDIFF option, PROC FREQ provides estimates of the row 1 risk (proportion), the row 2 risk, the overall risk, and the risk difference for column 1 and for column 2 of the $2 \times 2$ table. The risk difference is defined as the row 1 risk minus the row 2 risk. The risks are binomial proportions of their rows (row 1, row 2, or overall), and the computation of their standard errors and Wald confidence limits follows the binomial proportion computations, which are described in the section Binomial Proportion.

The column 1 risk for row 1 is the proportion of row 1 observations classified in column 1,

\[ \hat{p}_1 = n_{11} ~ / ~ n_{1 \cdot } \]

which estimates the conditional probability of the column 1 response, given the first level of the row variable. The column 1 risk for row 2 is the proportion of row 2 observations classified in column 1,

\[ \hat{p}_2 = n_{21} ~ / ~ n_{2 \cdot } \]

The overall column 1 risk is the proportion of all observations classified in column 1,

\[ \hat{p} = n_{\cdot 1} ~ / ~ n \]

The column 1 risk difference compares the risks for the two rows, and it is computed as the column 1 risk for row 1 minus the column 1 risk for row 2,

\[ \hat{d} = \hat{p}_1 - \hat{p}_2 \]

The standard error of the column 1 risk for row $i$ is computed as

\[ \mr{se}(\hat{p}_ i) = \sqrt { \hat{p}_ i ~ ( 1 - \hat{p}_ i ) ~ / ~ n_{i \cdot } } \]

The standard error of the overall column 1 risk is computed as

\[ \mr{se}(\hat{p}) = \sqrt { \hat{p} ~ ( 1 - \hat{p} ) ~ / ~ n } \]

When the two rows represent independent binomial samples, the standard error of the column 1 risk difference is computed as

\[ \mr{se}(\hat{d}) = \sqrt { \hat{p}_1 (1 - \hat{p}_1) / n_{1 \cdot } ~ + ~ \hat{p}_2 (1 - \hat{p}_2) / n_{2 \cdot }} \]

The computations are similar for the column 2 risks and risk difference.
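As a concrete illustration, the risks and risk difference above can be computed directly from the cell counts. The following Python sketch uses hypothetical counts (20 of 100 events in row 1, 10 of 100 in row 2); the variable names are illustrative, not PROC FREQ syntax.

```python
import math

# Hypothetical 2x2 cell counts (assumption): rows are groups,
# column 1 is the outcome of interest.
n11, n12 = 20, 80
n21, n22 = 10, 90
n1, n2 = n11 + n12, n21 + n22   # row totals
n = n1 + n2

p1_hat = n11 / n1               # column 1 risk for row 1
p2_hat = n21 / n2               # column 1 risk for row 2
p_hat = (n11 + n21) / n         # overall column 1 risk
d_hat = p1_hat - p2_hat         # risk difference (row 1 minus row 2)

# Standard errors, treating the two rows as independent binomial samples
se_p1 = math.sqrt(p1_hat * (1 - p1_hat) / n1)
se_p2 = math.sqrt(p2_hat * (1 - p2_hat) / n2)
se_d = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
```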

Confidence Limits

By default, the RISKDIFF option provides Wald asymptotic confidence limits for the risks (row 1, row 2, and overall) and the risk difference. By default, the RISKDIFF option also provides exact (Clopper-Pearson) confidence limits for the risks. You can suppress the display of this information by specifying the NORISKS riskdiff-option. You can specify riskdiff-options to request tests and other types of confidence limits for the risk difference. For more information, see the sections Confidence Limits for the Risk Difference and Risk Difference Tests.

The risks are equivalent to the binomial proportions of their corresponding rows. This section describes the Wald confidence limits that are provided by default when you specify the RISKDIFF option. The BINOMIAL option provides additional confidence limit types and tests for risks (binomial proportions). For more information, see the sections Binomial Confidence Limits and Binomial Tests.

The Wald confidence limits are based on the normal approximation to the binomial distribution. PROC FREQ computes the Wald confidence limits for the risks and risk differences as

\[ \mr{Est} ~ \pm ~ (~ z_{\alpha /2} \times \mr{se}(\mr{Est}) ~ ) \]

where Est is the estimate, $z_{\alpha /2}$ is the $100(1-\alpha /2)$th percentile of the standard normal distribution, and $\mr{se}(\mr{Est})$ is the standard error of the estimate. The value of $\alpha $ is determined by the ALPHA= option; by default, ALPHA=0.05, which produces 95% confidence limits.

If you specify the CORRECT riskdiff-option, PROC FREQ includes continuity corrections in the Wald confidence limits for the risks and risk differences. The purpose of a continuity correction is to adjust for the difference between the normal approximation and the binomial distribution, which is discrete. See Fleiss, Levin, and Paik (2003) for more information. The continuity-corrected Wald confidence limits are computed as

\[ \mr{Est} ~ \pm ~ (~ z_{\alpha /2} \times \mr{se}(\mr{Est}) + \mathit{cc} ~ ) \]

where cc is the continuity correction. For the row 1 risk, $\mathit{cc} = (1/2n_{1 \cdot })$; for the row 2 risk, $\mathit{cc} = (1/2n_{2 \cdot })$; for the overall risk, $\mathit{cc} = (1/2n)$; and for the risk difference, $\mathit{cc} = ((1/n_{1 \cdot } + 1/n_{2 \cdot })/2)$. The column 1 and column 2 risks use the same continuity corrections.
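To make these limits concrete, here is a minimal Python sketch that computes the Wald and continuity-corrected Wald limits for the risk difference, assuming hypothetical counts of 20/100 and 10/100 and the default ALPHA=0.05 (variable names are illustrative):

```python
from statistics import NormalDist
import math

n11, n1 = 20, 100               # hypothetical row 1: events, total
n21, n2 = 10, 100               # hypothetical row 2: events, total
p1, p2 = n11 / n1, n21 / n2
d = p1 - p2
se_d = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

z = NormalDist().inv_cdf(0.975)          # z_{alpha/2} for ALPHA=0.05

# Wald limits: Est +/- z * se(Est)
wald = (d - z * se_d, d + z * se_d)

# Continuity-corrected Wald limits: Est +/- (z * se(Est) + cc)
cc = (1 / n1 + 1 / n2) / 2               # cc for the risk difference
wald_cc = (d - (z * se_d + cc), d + (z * se_d + cc))
```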

By default when you specify the RISKDIFF option, PROC FREQ also provides exact (Clopper-Pearson) confidence limits for the column 1, column 2, and overall risks. These confidence limits are constructed by inverting the equal-tailed test that is based on the binomial distribution. For more information, see the section Exact (Clopper-Pearson) Confidence Limits.

Confidence Limits for the Risk Difference

PROC FREQ provides the following confidence limit types for the risk difference: Agresti-Caffo, exact unconditional, Hauck-Anderson, Miettinen-Nurminen (score), Newcombe (hybrid-score), and Wald confidence limits. Continuity-corrected forms of Newcombe and Wald confidence limits are also available.

The confidence coefficient for the confidence limits produced by the CL= riskdiff-option is $100(1-\alpha )$%, where the value of $\alpha $ is determined by the ALPHA= option. By default, ALPHA=0.05, which produces 95% confidence limits. This differs from the test-based confidence limits that are provided with the equivalence, noninferiority, and superiority tests, which have a confidence coefficient of $100(1-2\alpha )$% (Schuirmann 1999). For more information, see the section Risk Difference Tests.

Agresti-Caffo Confidence Limits
Agresti-Caffo confidence limits for the risk difference are computed as

\[ \tilde{d} ~ \pm ~ ( ~ z_{\alpha /2} \times \mr{se}(\tilde{d}) ~ ) \]

where $\tilde{d} = \tilde{p}_1 - \tilde{p}_2$, $\tilde{p}_ i = ( n_{i1} + 1 ) / ( n_{i \cdot } + 2 )$,

\[ \mr{se}(\tilde{d}) = \sqrt { \tilde{p}_1 ( 1 - \tilde{p}_1 ) / ( n_{1 \cdot } + 2 ) ~ + ~ \tilde{p}_2 ( 1 - \tilde{p}_2 ) / ( n_{2 \cdot } + 2 ) } \]

and $z_{\alpha /2}$ is the $100(1-\alpha /2)$th percentile of the standard normal distribution.

The Agresti-Caffo interval adjusts the Wald interval for the risk difference by adding a pseudo-observation of each type (success and failure) to each sample. See Agresti and Caffo (2000) and Agresti and Coull (1998) for more information.
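The pseudo-observation adjustment can be sketched in Python with the same hypothetical counts as above (20/100 and 10/100):

```python
from statistics import NormalDist
import math

n11, n1 = 20, 100   # hypothetical counts (assumption)
n21, n2 = 10, 100

# Add one pseudo-success and one pseudo-failure to each row
pt1 = (n11 + 1) / (n1 + 2)
pt2 = (n21 + 1) / (n2 + 2)
dt = pt1 - pt2
se = math.sqrt(pt1 * (1 - pt1) / (n1 + 2) + pt2 * (1 - pt2) / (n2 + 2))

z = NormalDist().inv_cdf(0.975)
ac_limits = (dt - z * se, dt + z * se)
```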

Hauck-Anderson Confidence Limits
Hauck-Anderson confidence limits for the risk difference are computed as

\[ \hat{d} ~ \pm ~ ( ~ \mathit{cc} ~ + ~ z_{\alpha /2} \times \mr{se}(\hat{d}) ~ ) \]

where $\hat{d} = \hat{p}_1 - \hat{p}_2$ and $z_{\alpha /2}$ is the $100(1-\alpha /2)$th percentile of the standard normal distribution. The standard error is computed from the sample proportions as

\[ \mr{se}(\hat{d}) = \sqrt { \hat{p}_1 (1-\hat{p}_1) / (n_{1 \cdot }-1) ~ +~ \hat{p}_2 (1-\hat{p}_2) / (n_{2 \cdot }-1) } \]

The Hauck-Anderson continuity correction cc is computed as

\[ \mathit{cc} = 1 ~ / ~ \bigl ( 2 ~ \min ( n_{1 \cdot }, ~ n_{2 \cdot } ) \bigr ) \]

See Hauck and Anderson (1986) for more information. The subsection "Hauck-Anderson Test" in the section Noninferiority Tests describes the corresponding noninferiority test.
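A short Python sketch of the Hauck-Anderson limits, again with hypothetical counts of 20/100 and 10/100:

```python
from statistics import NormalDist
import math

n11, n1 = 20, 100   # hypothetical counts (assumption)
n21, n2 = 10, 100
p1, p2 = n11 / n1, n21 / n2
d = p1 - p2

# Note the (n - 1) denominators in the Hauck-Anderson standard error
se = math.sqrt(p1 * (1 - p1) / (n1 - 1) + p2 * (1 - p2) / (n2 - 1))
cc = 1 / (2 * min(n1, n2))          # Hauck-Anderson continuity correction

z = NormalDist().inv_cdf(0.975)
ha_limits = (d - (cc + z * se), d + (cc + z * se))
```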

Miettinen-Nurminen (Score) Confidence Limits
Miettinen-Nurminen (score) confidence limits for the risk difference (Miettinen and Nurminen 1985) are computed by inverting score tests for the risk difference. A score-based test statistic for the null hypothesis that the risk difference equals $\delta $ can be expressed as

\[ T(\delta ) = ( \hat{d} - \delta ) / \sqrt { \widetilde{\mr{Var}}(\delta ) } \]

where $\hat{d}$ is the observed value of the risk difference ($\hat{p}_1 - \hat{p}_2$),

\[ \widetilde{\mr{Var}}(\delta ) = \left( n / (n-1) \right) ~ \left( ~ \tilde{p}_1(\delta ) ( 1 - \tilde{p}_1(\delta ) ) / n_{1 \cdot } + \tilde{p}_2(\delta ) ( 1 - \tilde{p}_2(\delta ) ) / n_{2 \cdot } ~ \right) \]

and $\tilde{p}_1(\delta )$ and $\tilde{p}_2(\delta )$ are the maximum likelihood estimates of the row 1 and row 2 risks (proportions) under the restriction that the risk difference is $\delta $. For more information, see Miettinen and Nurminen (1985, pp. 215–216) and Miettinen (1985, chapter 12).

The $100(1-\alpha )$% confidence interval for the risk difference consists of all values of $\delta $ for which the score test statistic $T(\delta )$ falls in the acceptance region,

\[ \{ \delta : | T(\delta ) | < z_{\alpha /2} \} \]

where $z_{\alpha /2}$ is the $100(1-\alpha /2)$th percentile of the standard normal distribution. PROC FREQ finds the confidence limits by iterative computation, which stops when the iteration increment falls below the convergence criterion or when the maximum number of iterations is reached, whichever occurs first. By default, the convergence criterion is 0.00000001 and the maximum number of iterations is 100.

By default, the Miettinen-Nurminen confidence limits include the bias correction factor $n/(n-1)$ in the computation of $\widetilde{\mr{Var}}(\delta )$ (Miettinen and Nurminen 1985, p. 216). For more information, see Newcombe and Nurminen (2011). If you specify the CL=MN(CORRECT=NO) riskdiff-option, PROC FREQ does not include the bias correction factor in this computation (Mee 1984). See also Agresti (2002, p. 77). The uncorrected confidence limits are labeled as "Miettinen-Nurminen-Mee" confidence limits in the displayed output.

The maximum likelihood estimates of $p_1$ and $p_2$, subject to the constraint that the risk difference is $\delta $, are computed as

\[ \tilde{p}_1 = 2 u \cos (w) - b/3a \hspace{.15in} \mr{and} \hspace{.15in} \tilde{p}_2 = \tilde{p}_1 - \delta \]

where

\begin{eqnarray*} w & = & ( \pi + \cos ^{-1}(v / u^3) ) / 3 \\ v & = & b^3 / (3a)^3 - bc/6a^2 + d/2a \\ u & = & \mr{sign}(v) \sqrt {b^2 / (3a)^2 - c/3a} \\ a & = & 1 + \theta \\ b & = & - \left( 1 + \theta + \hat{p}_1 + \theta \hat{p}_2 + \delta (\theta + 2) \right) \\ c & = & \delta ^2 + \delta (2 \hat{p}_1 + \theta + 1) + \hat{p}_1 + \theta \hat{p}_2 \\ d & = & -\hat{p}_1 \delta (1 + \delta ) \\ \theta & = & n_{2 \cdot } / n_{1 \cdot } \end{eqnarray*}

For more information, see Farrington and Manning (1990, p. 1453).
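The iterative inversion can be sketched in Python. For simplicity, this sketch finds the constrained MLEs by direct numerical maximization of the binomial log-likelihood (a numerical stand-in for the closed-form trigonometric solution above) and locates the limits by bisection on $T(\delta)$; the counts, iteration counts, and function names are illustrative assumptions, not PROC FREQ's algorithm.

```python
import math
from statistics import NormalDist

def constrained_mle(n11, n1, n21, n2, delta):
    """MLEs of (p1, p2) subject to p1 - p2 = delta, by golden-section
    search on the binomial log-likelihood."""
    lo, hi = max(0.0, delta), min(1.0, 1.0 + delta)
    eps = 1e-12
    def negll(p1):
        p2 = p1 - delta
        p1c = min(max(p1, eps), 1 - eps)
        p2c = min(max(p2, eps), 1 - eps)
        return -(n11 * math.log(p1c) + (n1 - n11) * math.log(1 - p1c)
                 + n21 * math.log(p2c) + (n2 - n21) * math.log(1 - p2c))
    gr = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(100):
        c, d = b - gr * (b - a), a + gr * (b - a)
        if negll(c) < negll(d):
            b = d
        else:
            a = c
    p1 = (a + b) / 2
    return p1, p1 - delta

def t_stat(n11, n1, n21, n2, delta):
    """Score statistic T(delta) with the n/(n-1) bias correction."""
    n = n1 + n2
    d_hat = n11 / n1 - n21 / n2
    p1t, p2t = constrained_mle(n11, n1, n21, n2, delta)
    var = (n / (n - 1)) * (p1t * (1 - p1t) / n1 + p2t * (1 - p2t) / n2)
    return (d_hat - delta) / math.sqrt(var)

def mn_limits(n11, n1, n21, n2, alpha=0.05):
    """Invert T(delta): the limits are the deltas where T crosses +/- z."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    d_hat = n11 / n1 - n21 / n2
    def solve(target, lo, hi):
        for _ in range(60):
            mid = (lo + hi) / 2
            if t_stat(n11, n1, n21, n2, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    return solve(+z, -1.0, d_hat), solve(-z, d_hat, 1.0)

lower, upper = mn_limits(20, 100, 10, 100)   # hypothetical counts
```

Note that $T(\delta)$ is zero at $\delta = \hat{d}$, because the constrained MLEs then coincide with the sample proportions, so the two limits bracket the observed risk difference.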

Newcombe Confidence Limits
Newcombe (hybrid-score) confidence limits for the risk difference are constructed from the Wilson score confidence limits for each of the two individual proportions. The confidence limits for the individual proportions are used in the standard error terms of the Wald confidence limits for the proportion difference. See Newcombe (1998a) and Barker et al. (2001) for more information.

Wilson score confidence limits for $p_1$ and $p_2$ are the roots of

\[ | p_ i - \hat{p}_ i | = z_{\alpha /2} \sqrt { p_ i (1-p_ i)/n_{i \cdot } } \]

for $i = 1, 2$. The confidence limits are computed as

\[ \left( \hat{p}_ i ~ + ~ z_{\alpha /2}^2/2n_{i \cdot } ~ \pm ~ z_{\alpha /2} \sqrt { \left( \hat{p}_ i (1-\hat{p}_ i) + z_{\alpha /2}^2 / 4n_{i \cdot } \right) / n_{i \cdot } } ~ \right) ~ / ~ \left( 1 + z_{\alpha /2}^2 / n_{i \cdot } \right) \]

For more information, see the section Wilson (Score) Confidence Limits.

Denote the lower and upper Wilson score confidence limits for $p_1$ as $L_1$ and $U_1$, and denote the lower and upper confidence limits for $p_2$ as $L_2$ and $U_2$. The Newcombe confidence limits for the proportion difference ($d = p_1 - p_2$) are computed as

\begin{eqnarray*} d_ L = (\hat{p}_1 - \hat{p}_2) ~ - ~ \sqrt { ( \hat{p}_1 - L_1 )^2 ~ +~ ( U_2 - \hat{p}_2 )^2 } \\[0.10in] d_ U = (\hat{p}_1 - \hat{p}_2) ~ + ~ \sqrt { ( U_1 - \hat{p}_1 )^2 ~ +~ ( \hat{p}_2 - L_2 )^2 } \end{eqnarray*}

If you specify the CORRECT riskdiff-option, PROC FREQ provides continuity-corrected Newcombe confidence limits. By including a continuity correction of $1/2n_{i \cdot }$, the Wilson score confidence limits for the individual proportions are computed as the roots of

\[ | p_ i - \hat{p}_ i | - 1/2n_{i \cdot } = z_{\alpha /2} \sqrt { p_ i (1-p_ i)/n_{i \cdot } } \]

The continuity-corrected confidence limits for the individual proportions are then used to compute the proportion difference confidence limits $d_ L$ and $d_ U$.
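The uncorrected Newcombe construction can be sketched in Python, using the closed-form Wilson limits for each row and the same hypothetical counts as in the earlier examples:

```python
from statistics import NormalDist
import math

z = NormalDist().inv_cdf(0.975)

def wilson(x, n):
    """Closed-form Wilson score limits for a single binomial proportion
    (no continuity correction)."""
    p = x / n
    center = p + z * z / (2 * n)
    half = z * math.sqrt((p * (1 - p) + z * z / (4 * n)) / n)
    denom = 1 + z * z / n
    return (center - half) / denom, (center + half) / denom

n11, n1 = 20, 100   # hypothetical counts (assumption)
n21, n2 = 10, 100
p1, p2 = n11 / n1, n21 / n2
L1, U1 = wilson(n11, n1)
L2, U2 = wilson(n21, n2)

# Combine the per-row Wilson limits into limits for the difference
d_L = (p1 - p2) - math.sqrt((p1 - L1) ** 2 + (U2 - p2) ** 2)
d_U = (p1 - p2) + math.sqrt((U1 - p1) ** 2 + (p2 - L2) ** 2)
```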

Wald Confidence Limits
Wald confidence limits for the risk difference are computed as

\[ \hat{d} ~ \pm ~ ( ~ z_{\alpha /2} \times \mr{se}(\hat{d}) ~ ) \]

where $\hat{d} = \hat{p}_1 - \hat{p}_2$, $z_{\alpha /2}$ is the $100(1-\alpha /2)$th percentile of the standard normal distribution, and the standard error is computed from the sample proportions as

\[ \mr{se}(\hat{d}) = \sqrt { \hat{p}_1 (1-\hat{p}_1) / n_{1 \cdot } ~ +~ \hat{p}_2 (1-\hat{p}_2) / n_{2 \cdot } } \]

If you specify the CORRECT riskdiff-option, the Wald confidence limits include a continuity correction cc,

\[ \hat{d} ~ \pm ~ ( ~ \mathit{cc} ~ + ~ z_{\alpha /2} \times \mr{se}(\hat{d}) ~ ) \]

where $\mathit{cc} = (1/n_{1 \cdot } + 1/n_{2 \cdot })/2$.

The subsection "Wald Test" in the section Noninferiority Tests describes the corresponding noninferiority test.

Exact Unconditional Confidence Limits
If you specify the RISKDIFF option in the EXACT statement, PROC FREQ provides exact unconditional confidence limits for the risk difference. PROC FREQ computes the confidence limits by inverting two separate one-sided tests (tail method), where the size of each test is at most $\alpha /2$ and the confidence coefficient is at least $(1-\alpha )$. The conditional exact method, which is described in the section Exact Statistics, does not apply to the risk difference because of a nuisance parameter (Agresti 1992). The unconditional method (which fixes only the row margins) eliminates the nuisance parameter by maximizing the p-value over all possible values of the parameter (Santner and Snell 1980).

By default, PROC FREQ uses the unstandardized risk difference as the test statistic to compute the confidence limits. If you specify the RISKDIFF(METHOD=SCORE) option, the procedure uses the score statistic to compute the confidence limits (Chan and Zhang 1999). The score statistic is a less discrete statistic than the unstandardized risk difference and produces less conservative confidence limits (Agresti and Min 2001). For more information, see Santner et al. (2007). The section Confidence Limits for the Risk Difference describes the computation of the risk difference score statistic. For more information, see Miettinen and Nurminen (1985) and Farrington and Manning (1990).

PROC FREQ computes the exact unconditional confidence limits as follows. The risk difference is defined as the difference between the row 1 and row 2 risks (proportions), $d = p_1 - p_2$, and $n_1$ and $n_2$ denote the row totals of the $2 \times 2$ table. The joint probability function for the table can be expressed in terms of the table cell frequencies, the risk difference, and the nuisance parameter $p_2$ as

\[ f( n_{11}, n_{21}; n_1, n_2, d, p_2 ) = \binom {n_1}{n_{11}} (d + p_2)^{n_{11}} (1-d-p_2)^{n_1-n_{11}} \times \binom {n_2}{n_{21}} p_2^{n_{21}} (1-p_2)^{n_2 - n_{21}} \]

The $100(1-\alpha )$% confidence limits for the risk difference are computed as

\begin{eqnarray*} d_ L & = & \sup ~ ( d_\ast : P_ U(d_\ast ) > \alpha /2 ) \\ d_ U & = & \inf ~ ( d_\ast : P_ L(d_\ast ) > \alpha /2 ) \end{eqnarray*}

where

\begin{eqnarray*} P_ U(d_\ast ) & = & \sup _{p_2} ~ \bigl ( \sum _{A, T(a) \geq t_0} f( n_{11}, n_{21}; n_1, n_2, d_\ast , p_2 ) ~ \bigr ) \\[0.10in] P_ L(d_\ast ) & = & \sup _{p_2} ~ \bigl ( \sum _{A, T(a) \leq t_0} f( n_{11}, n_{21}; n_1, n_2, d_\ast , p_2 ) ~ \bigr ) \end{eqnarray*}

The set A includes all $2 \times 2$ tables with row sums equal to $n_1$ and $n_2$, and $T(a)$ denotes the value of the test statistic for table a in A. To compute $P_ U(d_\ast )$, the sum includes probabilities of those tables for which ($T(a) \geq t_0$), where $t_0$ is the value of the test statistic for the observed table. For a fixed value of $d_\ast $, $P_ U(d_\ast )$ is taken to be the maximum sum over all possible values of $p_2$.

Risk Difference Tests

PROC FREQ provides tests of equality, noninferiority, superiority, and equivalence for the risk (proportion) difference. The following analysis methods are available: Wald (with and without continuity correction), Hauck-Anderson, Farrington-Manning (score), and Newcombe (with and without continuity correction). You can specify the method by using the METHOD= riskdiff-option; by default, PROC FREQ provides Wald tests.

Equality Tests

The equality test for the risk difference tests the null hypothesis that the risk difference equals the null value. You can specify a null value by using the EQUAL(NULL=) riskdiff-option; by default, the null value is 0. This test can be expressed as $H_0\colon d = d_0$ versus the alternative $H_ a\colon d \neq d_0$, where $d = p_1 - p_2$ denotes the risk difference (for column 1 or column 2) and $d_0$ denotes the null value.

The test statistic is computed as

\[ z = (\hat{d} - d_0) / \mr{se}(\hat{d}) \]

where the standard error $\mr{se}(\hat{d})$ is computed by using the method that you specify. Available methods for the equality test include Wald (with and without continuity correction), Hauck-Anderson, and Farrington-Manning (score). For a description of the standard error computation, see the subsections "Wald Test," "Hauck-Anderson Test," and "Farrington-Manning (Score) Test," respectively, in the section Noninferiority Tests.

PROC FREQ computes one-sided and two-sided p-values for equality tests. When the test statistic z is greater than 0, PROC FREQ displays the right-sided p-value, which is the probability of a larger value occurring under the null hypothesis. The one-sided p-value can be expressed as

\begin{equation*} P_1 = \begin{cases} \mr{Prob} (Z > z) \quad \mr{if} \hspace{.1in} z > 0 \\ \mr{Prob} (Z < z) \quad \mr{if} \hspace{.1in} z \leq 0 \\ \end{cases}\end{equation*}

where Z has a standard normal distribution. The two-sided p-value is computed as $P_2 = 2 \times P_1$.
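The p-value computation above can be sketched in Python. This example uses the sample-proportion (Wald) standard error and hypothetical counts of 20/100 and 10/100; the function name is illustrative:

```python
from statistics import NormalDist
import math

def equality_test(n11, n1, n21, n2, d0=0.0):
    """Wald equality test of H0: d = d0 for the risk difference;
    returns (z, one-sided p-value, two-sided p-value)."""
    p1, p2 = n11 / n1, n21 / n2
    d_hat = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (d_hat - d0) / se
    norm = NormalDist()
    # right-sided p-value if z > 0, left-sided otherwise
    p_one = 1 - norm.cdf(z) if z > 0 else norm.cdf(z)
    return z, p_one, 2 * p_one

z_stat, p_one, p_two = equality_test(20, 100, 10, 100)   # hypothetical counts
```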

Noninferiority Tests

If you specify the NONINF riskdiff-option, PROC FREQ provides a noninferiority test for the risk difference, or the difference between two proportions. The null hypothesis for the noninferiority test is

\[ H_0\colon p_1 - p_2 \leq -\delta \]

versus the alternative

\[ H_ a\colon p_1 - p_2 > -\delta \]

where $\delta $ is the noninferiority margin. Rejection of the null hypothesis indicates that the row 1 risk is not inferior to the row 2 risk. See Chow, Shao, and Wang (2003) for more information.

You can specify the value of $\delta $ with the MARGIN= riskdiff-option. By default, $\delta = 0.2$. You can specify the test method with the METHOD= riskdiff-option. The following methods are available for the risk difference noninferiority analysis: Wald (with and without continuity correction), Hauck-Anderson, Farrington-Manning (score), and Newcombe (with and without continuity correction). The Wald, Hauck-Anderson, and Farrington-Manning methods provide tests and corresponding test-based confidence limits; the Newcombe method provides only confidence limits. If you do not specify METHOD=, PROC FREQ uses the Wald test by default.

The confidence coefficient for the test-based confidence limits is $100(1-2\alpha )$% (Schuirmann 1999). By default, if you do not specify the ALPHA= option, these are 90% confidence limits. You can compare the confidence limits to the noninferiority limit, –$\delta $.

The following sections describe the noninferiority analysis methods for the risk difference.

Wald Test
If you specify the METHOD=WALD riskdiff-option, PROC FREQ provides an asymptotic Wald test of noninferiority for the risk difference. This is also the default method. The Wald test statistic is computed as

\[ z = ( \hat{d} + \delta ) ~ / ~ \mr{se}(\hat{d}) \]

where $\hat{d} = \hat{p}_1 - \hat{p}_2$ estimates the risk difference and $\delta $ is the noninferiority margin.

By default, the standard error for the Wald test is computed from the sample proportions as

\[ \mr{se}(\hat{d}) = \sqrt { \hat{p}_1 (1 - \hat{p}_1) / n_{1 \cdot } ~ +~ \hat{p}_2 (1 - \hat{p}_2) / n_{2 \cdot } } \]

If you specify the VAR=NULL riskdiff-option, the standard error is based on the null hypothesis that the risk difference equals –$\delta $ (Dunnett and Gent 1977). The standard error is computed as

\[ \mr{se}(\hat{d}) = \sqrt { \tilde{p} (1-\tilde{p})/n_{2 \cdot } ~ +~ (\tilde{p} - \delta ) (1-\tilde{p} + \delta ) / n_{1 \cdot } } \]

where

\[ \tilde{p} = ( n_{11} + n_{21} + \delta n_{1 \cdot } ) / n \]

If you specify the CORRECT riskdiff-option, the test statistic includes a continuity correction. The continuity correction is subtracted from the numerator of the test statistic if the numerator is greater than 0; otherwise, the continuity correction is added to the numerator. The value of the continuity correction is $(1/n_{1 \cdot } + 1/n_{2 \cdot })/2$.

The p-value for the Wald noninferiority test is $P_ z = \mr{Prob} (Z > z)$, where Z has a standard normal distribution.
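A minimal Python sketch of the Wald noninferiority test with the default sample-proportion standard error, assuming hypothetical counts of 20/100 and 10/100 and the default margin $\delta = 0.2$:

```python
from statistics import NormalDist
import math

n11, n1 = 20, 100    # hypothetical counts (assumption)
n21, n2 = 10, 100
delta = 0.2          # default noninferiority margin
p1, p2 = n11 / n1, n21 / n2
d_hat = p1 - p2

# Default (sample-proportion) standard error; VAR=NULL would instead use
# proportions restricted by the null hypothesis
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z = (d_hat + delta) / se
p_value = 1 - NormalDist().cdf(z)    # Prob(Z > z)
```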

Hauck-Anderson Test
If you specify the METHOD=HA riskdiff-option, PROC FREQ provides the Hauck-Anderson test for noninferiority. The Hauck-Anderson test statistic is computed as

\[ z = ( \hat{d} + \delta ~ \pm ~ \mathit{cc}) ~ / ~ \mr{se}(\hat{d}) \]

where $\hat{d} = \hat{p}_1 - \hat{p}_2$ and the standard error is computed from the sample proportions as

\[ \mr{se}(\hat{d}) = \sqrt { \hat{p}_1 (1-\hat{p}_1) / (n_{1 \cdot }-1) ~ +~ \hat{p}_2 (1-\hat{p}_2) / (n_{2 \cdot }-1) } \]

The Hauck-Anderson continuity correction cc is computed as

\[ \mathit{cc} = 1 ~ / ~ \bigl ( 2 ~ \min ( n_{1 \cdot }, ~ n_{2 \cdot } ) \bigr ) \]

The p-value for the Hauck-Anderson noninferiority test is $P_ z = \mr{Prob} (Z > z)$, where Z has a standard normal distribution. See Hauck and Anderson (1986) and Schuirmann (1999) for more information.

Farrington-Manning (Score) Test
If you specify the METHOD=FM riskdiff-option, PROC FREQ provides the Farrington-Manning (score) test of noninferiority for the risk difference. A score test statistic for the null hypothesis that the risk difference equals –$\delta $ can be expressed as

\[ z = ( \hat{d} + \delta ) ~ / ~ \mr{se}(\hat{d}) \]

where $\hat{d}$ is the observed value of the risk difference ($\hat{p}_1 - \hat{p}_2$),

\[ \mr{se}(\hat{d}) = \sqrt { \tilde{p}_1 (1-\tilde{p}_1) / n_{1 \cdot } ~ +~ \tilde{p}_2 (1-\tilde{p}_2) / n_{2 \cdot } } \]

and $\tilde{p}_1$ and $\tilde{p}_2$ are the maximum likelihood estimates of the row 1 and row 2 risks (proportions) under the restriction that the risk difference is –$\delta $. The p-value for the noninferiority test is $P_ z = \mr{Prob} (Z > z)$, where Z has a standard normal distribution. For more information, see Miettinen and Nurminen (1985); Miettinen (1985); Farrington and Manning (1990); Dann and Koch (2005).

The maximum likelihood estimates of $p_1$ and $p_2$, subject to the constraint that the risk difference is –$\delta $, are computed as

\[ \tilde{p}_1 = 2 u \cos (w) - b/3a \hspace{.15in} \mr{and} \hspace{.15in} \tilde{p}_2 = \tilde{p}_1 + \delta \]

where

\begin{eqnarray*} w & = & ( \pi + \cos ^{-1}(v / u^3) ) / 3 \\ v & = & b^3 / (3a)^3 - bc/6a^2 + d/2a \\ u & = & \mr{sign}(v) \sqrt {b^2 / (3a)^2 - c/3a} \\ a & = & 1 + \theta \\ b & = & - \left( 1 + \theta + \hat{p}_1 + \theta \hat{p}_2 - \delta (\theta + 2) \right) \\ c & = & \delta ^2 - \delta (2 \hat{p}_1 + \theta + 1) + \hat{p}_1 + \theta \hat{p}_2 \\ d & = & \hat{p}_1 \delta (1 - \delta ) \\ \theta & = & n_{2 \cdot } / n_{1 \cdot } \end{eqnarray*}

For more information, see Farrington and Manning (1990, p. 1453).
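The closed-form solution above can be checked numerically. The following Python sketch implements the trigonometric formulas for the constrained MLEs and the resulting score statistic; the counts, margin, and helper name are illustrative assumptions:

```python
import math
from statistics import NormalDist

def fm_constrained_mle(p1_hat, p2_hat, n1, n2, delta):
    """MLEs of (p1, p2) under the restriction p1 - p2 = -delta, via the
    Farrington-Manning trigonometric solution of the cubic."""
    theta = n2 / n1
    a = 1 + theta
    b = -(1 + theta + p1_hat + theta * p2_hat - delta * (theta + 2))
    c = delta ** 2 - delta * (2 * p1_hat + theta + 1) + p1_hat + theta * p2_hat
    d = p1_hat * delta * (1 - delta)
    v = b ** 3 / (3 * a) ** 3 - b * c / (6 * a ** 2) + d / (2 * a)
    u = math.copysign(math.sqrt(b ** 2 / (3 * a) ** 2 - c / (3 * a)), v)
    w = (math.pi + math.acos(v / u ** 3)) / 3
    p1t = 2 * u * math.cos(w) - b / (3 * a)
    return p1t, p1t + delta

# Hypothetical counts (assumption); margin delta = 0.2
n11, n1 = 20, 100
n21, n2 = 10, 100
delta = 0.2
p1, p2 = n11 / n1, n21 / n2
p1t, p2t = fm_constrained_mle(p1, p2, n1, n2, delta)
se = math.sqrt(p1t * (1 - p1t) / n1 + p2t * (1 - p2t) / n2)
z = ((p1 - p2) + delta) / se
p_value = 1 - NormalDist().cdf(z)
```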

Newcombe Noninferiority Analysis
If you specify the METHOD=NEWCOMBE riskdiff-option, PROC FREQ provides a noninferiority analysis that is based on Newcombe hybrid-score confidence limits for the risk difference. The confidence coefficient for the confidence limits is $100(1-2\alpha )$% (Schuirmann 1999). By default, if you do not specify the ALPHA= option, these are 90% confidence limits. You can compare the confidence limits with the noninferiority limit, –$\delta $. If you specify the CORRECT riskdiff-option, the confidence limits include a continuity correction. See the subsection "Newcombe Confidence Limits" in the section Confidence Limits for the Risk Difference for more information.

Superiority Test

If you specify the SUP riskdiff-option, PROC FREQ provides a superiority test for the risk difference. The null hypothesis is

\[ H_0\colon p_1 - p_2 \leq \delta \]

versus the alternative

\[ H_ a\colon p_1 - p_2 > \delta \]

where $\delta $ is the superiority margin. Rejection of the null hypothesis indicates that the row 1 proportion is superior to the row 2 proportion. You can specify the value of $\delta $ with the MARGIN= riskdiff-option. By default, $\delta = 0.2$.

The superiority analysis is identical to the noninferiority analysis but uses a positive value of the margin $\delta $ in the null hypothesis. The superiority computations follow those in the section Noninferiority Tests by replacing –$\delta $ by $\delta $. See Chow, Shao, and Wang (2003) for more information.

Equivalence Test

If you specify the EQUIV riskdiff-option, PROC FREQ provides an equivalence test for the risk difference, or the difference between two proportions. The null hypothesis for the equivalence test is

\[ H_0\colon p_1 - p_2 \leq -\delta _{\mi{L}} \hspace{.15in} \mr{or} \hspace{.15in} p_1 - p_2 \geq \delta _{\mi{U}} \]

versus the alternative

\[ H_ a\colon \delta _{\mi{L}} < p_1 - p_2 < \delta _{\mi{U}} \]

where $\delta _{\mi{L}}$ is the lower margin and $\delta _{\mi{U}}$ is the upper margin. Rejection of the null hypothesis indicates that the two binomial proportions are equivalent. See Chow, Shao, and Wang (2003) for more information.

You can specify the value of the margins $\delta _ L$ and $\delta _ U$ with the MARGIN= riskdiff-option. If you do not specify MARGIN=, PROC FREQ uses lower and upper margins of –0.2 and 0.2 by default. If you specify a single margin value $\delta $, PROC FREQ uses lower and upper margins of –$\delta $ and $\delta $. You can specify the test method with the METHOD= riskdiff-option. The following methods are available for the risk difference equivalence analysis: Wald (with and without continuity correction), Hauck-Anderson, Farrington-Manning (score), and Newcombe (with and without continuity correction). The Wald, Hauck-Anderson, and Farrington-Manning methods provide tests and corresponding test-based confidence limits; the Newcombe method provides only confidence limits. If you do not specify METHOD=, PROC FREQ uses the Wald test by default.

PROC FREQ computes two one-sided tests (TOST) for equivalence analysis (Schuirmann 1987). The TOST approach includes a right-sided test for the lower margin $\delta _{\mi{L}}$ and a left-sided test for the upper margin $\delta _{\mi{U}}$. The overall p-value is taken to be the larger of the two p-values from the lower and upper tests.

The section Noninferiority Tests gives details about the Wald, Hauck-Anderson, Farrington-Manning (score), and Newcombe methods for the risk difference. The lower margin equivalence test statistic takes the same form as the noninferiority test statistic but uses the lower margin value $\delta _{\mi{L}}$ in place of –$\delta $. The upper margin equivalence test statistic takes the same form as the noninferiority test statistic but uses the upper margin value $\delta _{\mi{U}}$ in place of –$\delta $.

The test-based confidence limits for the risk difference are computed according to the equivalence test method that you select. If you specify METHOD=WALD with VAR=NULL, or METHOD=FM, separate standard errors are computed for the lower and upper margin tests. In this case, the test-based confidence limits are computed by using the maximum of these two standard errors. These confidence limits have a confidence coefficient of $100(1-2\alpha )$% (Schuirmann 1999). By default, if you do not specify the ALPHA= option, these are 90% confidence limits. You can compare the test-based confidence limits to the equivalence limits, $(\delta _{\mi{L}}, \delta _{\mi{U}})$.
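The TOST logic with the Wald standard error can be sketched in Python, assuming hypothetical counts of 20/100 and 10/100 and the default margins of –0.2 and 0.2 (the function name is illustrative):

```python
from statistics import NormalDist
import math

def tost_wald(n11, n1, n21, n2, lower_margin, upper_margin):
    """Two one-sided Wald tests (TOST) for equivalence of two proportions.
    Returns the overall p-value, the larger of the two one-sided p-values."""
    p1, p2 = n11 / n1, n21 / n2
    d_hat = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    norm = NormalDist()
    p_lower = 1 - norm.cdf((d_hat - lower_margin) / se)   # right-sided test
    p_upper = norm.cdf((d_hat - upper_margin) / se)       # left-sided test
    return max(p_lower, p_upper)

p_equiv = tost_wald(20, 100, 10, 100, -0.2, 0.2)   # hypothetical counts
```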

Barnard’s Unconditional Exact Test

The BARNARD option in the EXACT statement provides an unconditional exact test for the risk (proportion) difference for $2 \times 2$ tables. The reference set for the unconditional exact test consists of all $2 \times 2$ tables that have the same row sums as the observed table (Barnard 1945, 1947, 1949). This differs from the reference set for exact conditional inference, which is restricted to the set of tables that have the same row sums and the same column sums as the observed table. See the sections Fisher’s Exact Test and Exact Statistics for more information.

The test statistic is the standardized risk difference, which is computed as

\[ T = d / \sqrt { p_{\cdot 1} ( 1 - p_{\cdot 1} ) ( 1/n_1 + 1/n_2 ) } \]

where the risk difference d is defined as the difference between the row 1 and row 2 risks (proportions), $d = ( n_{11} / n_1 - n_{21} / n_2 )$; $n_1$ and $n_2$ are the row 1 and row 2 totals, respectively; and $p_{\cdot 1}$ is the overall proportion in column 1, $(n_{11} + n_{21}) / n$.

Under the null hypothesis that the risk difference is 0, the joint probability function for a table can be expressed in terms of the table cell frequencies, the row totals, and the unknown parameter $\pi $ as

\[ f( n_{11}, n_{21}; n_1, n_2, \pi ) = \binom {n_1}{n_{11}} \binom {n_2}{n_{21}} \pi ^{n_{11} + n_{21}} (1-\pi )^{n - n_{11} - n_{21}} \]

where $\pi $ is the common value of the risk (proportion).

PROC FREQ sums the table probabilities over the reference set for those tables where the test statistic is greater than or equal to the observed value of the test statistic. This sum can be expressed as

\[ \mr{Prob}( \pi ) = \sum _{A, T(a) \geq t_0} f( n_{11}, n_{21}; n_1, n_2, \pi ) \]

where the set A contains all $2 \times 2$ tables with row sums equal to $n_1$ and $n_2$, and $T(a)$ denotes the value of the test statistic for table a in A. The sum includes probabilities of those tables for which ($T(a) \geq t_0$), where $t_0$ is the value of the test statistic for the observed table.

The sum Prob($\pi $) depends on the unknown value of $\pi $. To compute the exact p-value, PROC FREQ eliminates the nuisance parameter $\pi $ by taking the maximum value of Prob($\pi $) over all possible values of $\pi $,

\[ \mr{Prob} = \sup _{ ( 0 \leq \pi \leq 1 ) } { \left( \mr{Prob}( \pi ) \right) } \]

See Suissa and Shuster (1985) and Mehta and Senchaudhuri (2003).
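The enumeration-and-maximization scheme above can be sketched in Python for small tables. This brute-force version enumerates the full reference set and maximizes over the nuisance parameter on a grid; it is an illustrative sketch under assumed counts, not PROC FREQ's exact algorithm (which handles the supremum and table enumeration more carefully).

```python
import math

def barnard_pvalue(n11, n21, n1, n2, grid=1000):
    """One-sided Barnard unconditional exact p-value for H0: p1 = p2,
    maximizing over the nuisance parameter on a grid (brute-force sketch)."""
    n = n1 + n2
    def stat(x1, x2):
        pc = (x1 + x2) / n                      # pooled column 1 proportion
        denom = math.sqrt(pc * (1 - pc) * (1 / n1 + 1 / n2))
        return (x1 / n1 - x2 / n2) / denom if denom > 0 else 0.0
    t0 = stat(n11, n21)
    # reference set: all tables with the observed row totals, keeping those
    # at least as extreme as the observed table
    extreme = [(x1, x2) for x1 in range(n1 + 1) for x2 in range(n2 + 1)
               if stat(x1, x2) >= t0 - 1e-12]
    def prob(pi):
        return sum(math.comb(n1, x1) * math.comb(n2, x2)
                   * pi ** (x1 + x2) * (1 - pi) ** (n - x1 - x2)
                   for x1, x2 in extreme)
    return max(prob(k / grid) for k in range(1, grid))

p_barnard = barnard_pvalue(7, 2, 12, 12)   # hypothetical counts
```

For a trivial table with one observation per row (a success in row 1, a failure in row 2), the only table at least as extreme is the observed one, so Prob($\pi$) = $\pi(1-\pi)$, which is maximized at $\pi = 0.5$ with value 0.25.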