PROC FREQ: Cochran-Mantel-Haenszel Statistics

The CMH option in the TABLES statement gives a stratified statistical analysis of the relationship between the row and column variables after controlling for the strata variables in a multiway table. For example, for the table request A*B*C*D, the CMH option provides an analysis of the relationship between C and D, after controlling for A and B. The stratified analysis provides a way to adjust for the possible confounding effects of A and B without being forced to estimate parameters for them.

The CMH analysis produces Cochran-Mantel-Haenszel statistics, which include the correlation statistic, the ANOVA (row mean scores) statistic, and the general association statistic. For $2 \times 2$ tables, the CMH option also provides Mantel-Haenszel and logit estimates of the common odds ratio and the common relative risks, as well as the Breslow-Day test for homogeneity of the odds ratios.

Exact statistics are also available for stratified $2 \times 2$ tables. If you specify the EQOR option in the EXACT statement, PROC FREQ provides Zelen’s exact test for equal odds ratios. If you specify the COMOR option in the EXACT statement, PROC FREQ provides exact confidence limits for the common odds ratio and an exact test that the common odds ratio equals one.

Let the number of strata be denoted by $q$, indexing the strata by $h = 1, 2, \ldots, q$. Each stratum contains a contingency table with X representing the row variable and Y representing the column variable. For table $h$, denote the cell frequency in row $i$ and column $j$ by $n_{hij}$, with corresponding row and column marginal totals denoted by $n_{hi\cdot}$ and $n_{h\cdot j}$, and the overall stratum total by $n_h$.

Because the formulas for the Cochran-Mantel-Haenszel statistics are more easily defined in terms of matrices, the following notation is used. Vectors are presumed to be column vectors unless they are transposed $(')$.

$\begin{aligned}
\mathbf{n}_h' &= (n_{h11}, n_{h12}, \ldots, n_{h1C}, n_{h21}, \ldots, n_{hRC}) && (1 \times RC) \\
P_{hi\cdot} &= n_{hi\cdot} / n_h && (1 \times 1) \\
P_{h\cdot j} &= n_{h\cdot j} / n_h && (1 \times 1) \\
\mathbf{P}_{h*\cdot}' &= (P_{h1\cdot}, P_{h2\cdot}, \ldots, P_{hR\cdot}) && (1 \times R) \\
\mathbf{P}_{h\cdot *}' &= (P_{h\cdot 1}, P_{h\cdot 2}, \ldots, P_{h\cdot C}) && (1 \times C)
\end{aligned}$

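Assume that the strata are independent and that the marginal totals of each stratum are fixed. The null hypothesis, $H_0$, is that there is no association between X and Y in any of the strata. The corresponding model is the multiple hypergeometric; this implies that, under $H_0$, the expected value and covariance matrix of the frequencies are, respectively,

$\begin{aligned}
\mathbf{m}_h &= \mathrm{E}[\mathbf{n}_h \mid H_0] = n_h \left( \mathbf{P}_{h*\cdot} \otimes \mathbf{P}_{h\cdot *} \right) \\
\mathrm{Var}[\mathbf{n}_h \mid H_0] &= c \left( (\mathbf{D}_{\mathbf{P}_{h*\cdot}} - \mathbf{P}_{h*\cdot} \mathbf{P}_{h*\cdot}') \otimes (\mathbf{D}_{\mathbf{P}_{h\cdot *}} - \mathbf{P}_{h\cdot *} \mathbf{P}_{h\cdot *}') \right)
\end{aligned}$

where

$c = n_h^2 / (n_h - 1)$

and where $\otimes$ denotes Kronecker product multiplication and $\mathbf{D}_{\mathbf{a}}$ is a diagonal matrix with the elements of $\mathbf{a}$ on the main diagonal.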

The generalized CMH statistic (Landis, Heyman, and Koch 1978) is defined as

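$Q_{CMH} = \mathbf{G}' \mathbf{V}_{\mathbf{G}}^{-1} \mathbf{G}$

where

$\begin{aligned}
\mathbf{G} &= \sum_h \mathbf{B}_h (\mathbf{n}_h - \mathbf{m}_h) \\
\mathbf{V}_{\mathbf{G}} &= \sum_h \mathbf{B}_h \, \mathrm{Var}[\mathbf{n}_h \mid H_0] \, \mathbf{B}_h'
\end{aligned}$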

and where

$\text{[math]}$

is a matrix of fixed constants based on column scores $\mathbf{C}_h$ and row scores $\mathbf{R}_h$. When the null hypothesis is true, the CMH statistic has an asymptotic chi-square distribution with degrees of freedom equal to the rank of $\mathbf{B}_h$. If $\mathbf{V}_{\mathbf{G}}$ is found to be singular, PROC FREQ prints a message and sets the value of the CMH statistic to missing.

PROC FREQ computes three CMH statistics by using this formula for the generalized CMH statistic, with different row and column score definitions for each statistic. The CMH statistics that PROC FREQ computes are the correlation statistic, the ANOVA (row mean scores) statistic, and the general association statistic. These statistics test the null hypothesis of no association against different alternative hypotheses. The following sections describe the computation of these CMH statistics.

Caution: The CMH statistics have low power for detecting an association in which the patterns of association for some of the strata are in the opposite direction of the patterns displayed by other strata. Thus, a nonsignificant CMH statistic suggests either that there is no association or that no pattern of association has enough strength or consistency to dominate any other pattern.

Correlation Statistic

The correlation statistic, popularized by Mantel and Haenszel (1959) and Mantel (1963), has one degree of freedom and is known as the Mantel-Haenszel statistic.

The alternative hypothesis for the correlation statistic is that there is a linear association between X and Y in at least one stratum. If either X or Y does not lie on an ordinal (or interval) scale, then this statistic is not meaningful.

To compute the correlation statistic, PROC FREQ uses the formula for the generalized CMH statistic with the row and column scores determined by the SCORES= option in the TABLES statement. See the section Scores for more information about the available score types. The matrix of row scores $\mathbf{R}_h$ has dimension $1 \times R$, and the matrix of column scores $\mathbf{C}_h$ has dimension $1 \times C$.

When there is only one stratum, this CMH statistic reduces to $(n-1) r^2$, where $r$ is the Pearson correlation coefficient between X and Y. When nonparametric (RANK or RIDIT) scores are specified, the statistic reduces to $(n-1) r_s^2$, where $r_s$ is the Spearman rank correlation coefficient between X and Y. When there is more than one stratum, this CMH statistic becomes a stratum-adjusted correlation statistic.

ANOVA (Row Mean Scores) Statistic

The ANOVA statistic can be used only when the column variable Y lies on an ordinal (or interval) scale so that the mean score of Y is meaningful. For the ANOVA statistic, the mean score is computed for each row of the table, and the alternative hypothesis is that, for at least one stratum, the mean scores of the $R$ rows are unequal. In other words, the statistic is sensitive to location differences among the $R$ distributions of Y.

The matrix of column scores $\mathbf{C}_h$ has dimension $1 \times C$, and the column scores are determined by the SCORES= option.

The matrix of row scores $\mathbf{R}_h$ has dimension $(R-1) \times R$ and is created internally by PROC FREQ as

$\mathbf{R}_h = [\mathbf{I}_{R-1}, -\mathbf{J}_{R-1}]$

where $\mathbf{I}_{R-1}$ is an identity matrix of rank $R-1$ and $\mathbf{J}_{R-1}$ is an $(R-1) \times 1$ vector of ones. This matrix has the effect of forming $R-1$ independent contrasts of the $R$ mean scores.

When there is only one stratum, this CMH statistic is essentially an analysis of variance (ANOVA) statistic in the sense that it is a function of the variance ratio $F$ statistic that would be obtained from a one-way ANOVA on the dependent variable Y. If nonparametric scores are specified in this case, then the ANOVA statistic is a Kruskal-Wallis test.

If there is more than one stratum, then this CMH statistic corresponds to a stratum-adjusted ANOVA or Kruskal-Wallis test. In the special case where there is one subject per row and one subject per column in the contingency table of each stratum, this CMH statistic is identical to Friedman’s chi-square. See Example 3.9 for an illustration.

General Association Statistic

The alternative hypothesis for the general association statistic is that, for at least one stratum, there is some kind of association between X and Y. This statistic is always interpretable because it does not require an ordinal scale for either X or Y.

For the general association statistic, the matrix $\mathbf{R}_h$ is the same as the one used for the ANOVA statistic. The matrix $\mathbf{C}_h$ is defined similarly as

$\mathbf{C}_h = [\mathbf{I}_{C-1}, -\mathbf{J}_{C-1}]$

PROC FREQ generates both score matrices internally. When there is only one stratum, the general association CMH statistic reduces to $Q_P (n-1)/n$, where $Q_P$ is the Pearson chi-square statistic. When there is more than one stratum, the CMH statistic becomes a stratum-adjusted Pearson chi-square statistic. Note that a similar adjustment can be made by summing the Pearson chi-squares across the strata. However, the latter statistic requires a large sample size in each stratum to support the resulting chi-square distribution with $q(R-1)(C-1)$ degrees of freedom. The CMH statistic requires only a large overall sample size because it has only $(R-1)(C-1)$ degrees of freedom.

See Cochran (1954); Mantel and Haenszel (1959); Mantel (1963); Birch (1965); and Landis, Heyman, and Koch (1978).
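For stratified 2 x 2 tables, each score matrix has a single row, so the correlation, ANOVA, and general association statistics coincide in one chi-square with one degree of freedom. The following Python sketch illustrates that special case only; it is not PROC FREQ code. The function name `cmh_2x2` and the tuple layout `(n11, n12, n21, n22)` are assumptions made for this example.

```python
def cmh_2x2(strata):
    """CMH chi-square (1 df) for a list of 2x2 strata.

    Each stratum is a tuple (n11, n12, n21, n22); this layout is an
    assumption of the sketch, not a PROC FREQ data format.
    """
    diff = 0.0  # sum over strata of observed minus expected n11
    var = 0.0   # sum over strata of the hypergeometric variance of n11
    for n11, n12, n21, n22 in strata:
        n = n11 + n12 + n21 + n22
        if n < 2:
            continue  # the variance term divides by n - 1
        r1, r2 = n11 + n12, n21 + n22
        c1, c2 = n11 + n21, n12 + n22
        diff += n11 - r1 * c1 / n
        var += r1 * r2 * c1 * c2 / (n * n * (n - 1))
    if var == 0:
        return None  # singular case: the statistic is undefined
    return diff * diff / var


# With a single stratum the result equals the Pearson chi-square
# multiplied by (n - 1) / n.
q = cmh_2x2([(10, 20, 30, 40)])
```
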

Mantel-Fleiss Criterion

If you specify the MF option in parentheses following the CMH option in the TABLES statement, PROC FREQ computes the Mantel-Fleiss criterion for stratified $2 \times 2$ tables. The Mantel-Fleiss criterion can be used to assess the validity of the chi-square approximation for the distribution of the Mantel-Haenszel statistic for $2 \times 2$ tables. See Mantel and Fleiss (1980), Mantel and Haenszel (1959), Stokes, Davis, and Koch (2000), and Dmitrienko et al. (2005) for details.

The Mantel-Fleiss criterion is computed as

$MF = \min \left( \left[ \sum_h m_{h11} - \sum_h (n_{h11})_L \right], \; \left[ \sum_h (n_{h11})_U - \sum_h m_{h11} \right] \right)$

where $m_{h11}$ is the expected value of $n_{h11}$ under the hypothesis of no association between the row and column variables in table $h$, $(n_{h11})_L$ is the minimum possible value of the table cell frequency, and $(n_{h11})_U$ is the maximum possible value,

$\begin{aligned}
m_{h11} &= n_{h1\cdot} \, n_{h\cdot 1} / n_h \\
(n_{h11})_L &= \max(0, \; n_{h1\cdot} - n_{h\cdot 2}) \\
(n_{h11})_U &= \min(n_{h\cdot 1}, \; n_{h1\cdot})
\end{aligned}$

The Mantel-Fleiss guideline accepts the validity of the Mantel-Haenszel approximation when the value of the criterion is at least 5. When the criterion is less than 5, PROC FREQ displays a warning.
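As a minimal sketch of this criterion, the following Python function sums the expected value and the smallest and largest attainable values of cell (1,1) over the strata and returns the smaller of the two gaps. The function name and the `(n11, n12, n21, n22)` tuple layout are assumptions made for illustration.

```python
def mantel_fleiss(strata):
    """Mantel-Fleiss criterion for a list of 2x2 strata.

    Each stratum is (n11, n12, n21, n22); hypothetical layout.
    """
    sum_m = sum_lo = sum_hi = 0.0
    for n11, n12, n21, n22 in strata:
        n = n11 + n12 + n21 + n22
        r1 = n11 + n12          # first row total
        c1 = n11 + n21          # first column total
        c2 = n12 + n22          # second column total
        sum_m += r1 * c1 / n    # expected n11 under no association
        sum_lo += max(0, r1 - c2)   # smallest possible n11
        sum_hi += min(c1, r1)       # largest possible n11
    return min(sum_m - sum_lo, sum_hi - sum_m)


# A value of at least 5 supports the chi-square approximation.
mf = mantel_fleiss([(10, 20, 30, 40), (5, 5, 5, 5)])
```
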

Adjusted Odds Ratio and Relative Risk Estimates

The CMH option provides adjusted odds ratio and relative risk estimates for stratified $2 \times 2$ tables. For each of these measures, PROC FREQ computes a Mantel-Haenszel estimate and a logit estimate. These estimates apply to n-way table requests in the TABLES statement, when the row and column variables both have two levels.

For example, for the table request A*B*C*D, if the row and column variables C and D both have two levels, PROC FREQ provides odds ratio and relative risk estimates, adjusting for the confounding variables A and B.

The choice of an appropriate measure depends on the study design. For case-control (retrospective) studies, the odds ratio is appropriate. For cohort (prospective) or cross-sectional studies, the relative risk is appropriate. See the section Odds Ratio and Relative Risks for 2 x 2 Tables for more information on these measures.

Throughout this section, $z_{\alpha/2}$ denotes the $100(1 - \alpha/2)$th percentile of the standard normal distribution.

Odds Ratio, Case-Control Studies

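PROC FREQ provides Mantel-Haenszel and logit estimates for the common odds ratio for stratified $2 \times 2$ tables.

The Mantel-Haenszel estimate of the common odds ratio is computed as

$\mathrm{OR}_{MH} = \left( \sum_h n_{h11} \, n_{h22} / n_h \right) \Big/ \left( \sum_h n_{h12} \, n_{h21} / n_h \right)$

It is always computed unless the denominator is zero. See Mantel and Haenszel (1959) and Agresti (2002) for details.

To compute confidence limits for the common odds ratio, PROC FREQ uses the Greenland and Robins (1985) variance estimate for $\ln(\mathrm{OR}_{MH})$. The $100(1-\alpha)\%$ confidence limits for the common odds ratio are

$\left( \mathrm{OR}_{MH} \times \exp(-z_{\alpha/2} \, \hat{\sigma}), \;\; \mathrm{OR}_{MH} \times \exp(z_{\alpha/2} \, \hat{\sigma}) \right)$

where

$\begin{aligned}
\hat{\sigma}^2 &= \widehat{\mathrm{var}}[\ln(\mathrm{OR}_{MH})] \\
&= \frac{\sum_h (n_{h11} + n_{h22})(n_{h11} \, n_{h22}) / n_h^2}{2 \left( \sum_h n_{h11} \, n_{h22} / n_h \right)^2} \\
&\quad + \frac{\sum_h \left[ (n_{h11} + n_{h22})(n_{h12} \, n_{h21}) + (n_{h12} + n_{h21})(n_{h11} \, n_{h22}) \right] / n_h^2}{2 \left( \sum_h n_{h11} \, n_{h22} / n_h \right) \left( \sum_h n_{h12} \, n_{h21} / n_h \right)} \\
&\quad + \frac{\sum_h (n_{h12} + n_{h21})(n_{h12} \, n_{h21}) / n_h^2}{2 \left( \sum_h n_{h12} \, n_{h21} / n_h \right)^2}
\end{aligned}$

Note that the Mantel-Haenszel odds ratio estimator is less sensitive to small $n_h$ than the logit estimator.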
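The Mantel-Haenszel estimate and its Greenland-Robins confidence limits can be sketched in Python as follows. This is an illustration under stated assumptions rather than PROC FREQ's implementation: the function name, the `(n11, n12, n21, n22)` tuple layout, and the choice to return `None` for limits that cannot be computed are all hypothetical.

```python
import math
from statistics import NormalDist


def mh_odds_ratio(strata, alpha=0.05):
    """Return (OR_MH, lower, upper) for a list of 2x2 strata."""
    s_r = s_s = 0.0                  # sums of n11*n22/n and n12*n21/n
    s_pr = s_psqr = s_qs = 0.0       # numerators of the three variance terms
    for a, b, c, d in strata:
        n = a + b + c + d
        r, s = a * d / n, b * c / n
        p, q = (a + d) / n, (b + c) / n
        s_r += r
        s_s += s
        s_pr += p * r
        s_psqr += p * s + q * r
        s_qs += q * s
    if s_s == 0:
        return None                  # denominator is zero: no estimate
    or_mh = s_r / s_s
    if s_r == 0:
        return or_mh, None, None     # ln(OR) is undefined, so no limits
    var = (s_pr / (2 * s_r ** 2)
           + s_psqr / (2 * s_r * s_s)
           + s_qs / (2 * s_s ** 2))
    half = NormalDist().inv_cdf(1 - alpha / 2) * math.sqrt(var)
    return or_mh, or_mh * math.exp(-half), or_mh * math.exp(half)


est, lower, upper = mh_odds_ratio([(10, 20, 30, 40), (12, 18, 25, 45)])
```

With a single stratum, the variance estimate reduces to the familiar sum of reciprocal cell counts, which is a convenient check on the arithmetic.
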

The adjusted logit estimate of the common odds ratio (Woolf 1955) is computed as

$\mathrm{OR}_L = \exp \left( \sum_h w_h \ln(\mathrm{OR}_h) \Big/ \sum_h w_h \right)$

and the corresponding $100(1-\alpha)\%$ confidence limits are

$\left( \mathrm{OR}_L \times \exp \left( -z_{\alpha/2} \Big/ \sqrt{\textstyle\sum_h w_h} \right), \;\; \mathrm{OR}_L \times \exp \left( z_{\alpha/2} \Big/ \sqrt{\textstyle\sum_h w_h} \right) \right)$

where $\mathrm{OR}_h$ is the odds ratio for stratum $h$, and

$w_h = 1 / \mathrm{var}(\ln(\mathrm{OR}_h))$

If any table cell frequency in a stratum $h$ is zero, PROC FREQ adds $0.5$ to each cell of the stratum before computing $\mathrm{OR}_h$ and $w_h$ (Haldane 1955) for the logit estimate. The procedure prints a warning when this occurs.
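A hedged Python sketch of the logit estimate follows. It assumes that the variance of the log odds ratio in each stratum is estimated by the sum of reciprocal cell counts, the usual Woolf form; the function name and the `(n11, n12, n21, n22)` layout are illustrative choices, not part of PROC FREQ.

```python
import math
from statistics import NormalDist


def logit_odds_ratio(strata, alpha=0.05):
    """Return (OR_L, lower, upper) for a list of 2x2 strata."""
    sum_w = sum_w_log = 0.0
    for cells in strata:
        if 0 in cells:
            # Zero cell: add 0.5 to every cell of this stratum only.
            cells = tuple(x + 0.5 for x in cells)
        a, b, c, d = cells
        # Assumed variance estimate of ln(OR_h): sum of reciprocals.
        w = 1.0 / (1 / a + 1 / b + 1 / c + 1 / d)
        sum_w += w
        sum_w_log += w * math.log(a * d / (b * c))
    or_l = math.exp(sum_w_log / sum_w)
    half = NormalDist().inv_cdf(1 - alpha / 2) / math.sqrt(sum_w)
    return or_l, or_l * math.exp(-half), or_l * math.exp(half)


est, lower, upper = logit_odds_ratio([(10, 20, 30, 40), (0, 5, 5, 5)])
```
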

Relative Risks, Cohort Studies

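PROC FREQ provides Mantel-Haenszel and logit estimates of the common relative risks for stratified $2 \times 2$ tables.

The Mantel-Haenszel estimate of the common relative risk for column 1 is computed as

$\mathrm{RR}_{MH} = \left( \sum_h n_{h11} \, n_{h2\cdot} / n_h \right) \Big/ \left( \sum_h n_{h21} \, n_{h1\cdot} / n_h \right)$

It is always computed unless the denominator is zero. See Mantel and Haenszel (1959) and Agresti (2002) for more information.

To compute confidence limits for the common relative risk, PROC FREQ uses the Greenland and Robins (1985) variance estimate for $\ln(\mathrm{RR}_{MH})$. The $100(1-\alpha)\%$ confidence limits for the common relative risk are

$\left( \mathrm{RR}_{MH} \times \exp(-z_{\alpha/2} \, \hat{\sigma}), \;\; \mathrm{RR}_{MH} \times \exp(z_{\alpha/2} \, \hat{\sigma}) \right)$

where

$\hat{\sigma}^2 = \widehat{\mathrm{var}}[\ln(\mathrm{RR}_{MH})] = \frac{\sum_h \left( n_{h1\cdot} \, n_{h2\cdot} \, n_{h\cdot 1} - n_{h11} \, n_{h21} \, n_h \right) / n_h^2}{\left( \sum_h n_{h11} \, n_{h2\cdot} / n_h \right) \left( \sum_h n_{h21} \, n_{h1\cdot} / n_h \right)}$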
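The column 1 relative risk computation can be sketched the same way. The Python below is an illustration only; the function name, the `(n11, n12, n21, n22)` tuple layout, and the `None` return conventions are assumptions made for this example.

```python
import math
from statistics import NormalDist


def mh_relative_risk(strata, alpha=0.05):
    """Return (RR_MH, lower, upper) for column 1 of stratified 2x2 tables."""
    num = den = var_num = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        r1, r2, c1 = a + b, c + d, a + c
        num += a * r2 / n                   # n11 * (row 2 total) / n
        den += c * r1 / n                   # n21 * (row 1 total) / n
        var_num += (r1 * r2 * c1 - a * c * n) / n ** 2
    if den == 0:
        return None                         # estimate is undefined
    rr = num / den
    if num == 0:
        return rr, None, None               # ln(RR) is undefined
    half = NormalDist().inv_cdf(1 - alpha / 2) * math.sqrt(var_num / (num * den))
    return rr, rr * math.exp(-half), rr * math.exp(half)


est, lower, upper = mh_relative_risk([(10, 20, 30, 40), (12, 18, 25, 45)])
```
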

The adjusted logit estimate of the common relative risk for column 1 is computed as

$\mathrm{RR}_L = \exp \left( \sum_h w_h \ln(\mathrm{RR}_h) \Big/ \sum_h w_h \right)$

and the corresponding $100(1-\alpha)\%$ confidence limits are

$\left( \mathrm{RR}_L \times \exp \left( -z_{\alpha/2} \Big/ \sqrt{\textstyle\sum_h w_h} \right), \;\; \mathrm{RR}_L \times \exp \left( z_{\alpha/2} \Big/ \sqrt{\textstyle\sum_h w_h} \right) \right)$

where $\mathrm{RR}_h$ is the column 1 relative risk estimate for stratum $h$ and

$w_h = 1 / \mathrm{var}(\ln(\mathrm{RR}_h))$

If $n_{h11}$ or $n_{h21}$ is zero, then PROC FREQ adds $0.5$ to each cell of the stratum before computing $\mathrm{RR}_h$ and $w_h$ for the logit estimate. The procedure prints a warning when this occurs. See Kleinbaum, Kupper, and Morgenstern (1982, Sections 17.4 and 17.5) for details.

Breslow-Day Test for Homogeneity of the Odds Ratios

When you specify the CMH option, PROC FREQ computes the Breslow-Day test for stratified $2 \times 2$ tables. It tests the null hypothesis that the odds ratios for the $q$ strata are equal. When the null hypothesis is true, the statistic has approximately a chi-square distribution with $q-1$ degrees of freedom. See Breslow and Day (1980) and Agresti (2007) for more information.

The Breslow-Day statistic is computed as

$Q_{BD} = \sum_h \frac{\left( n_{h11} - \mathrm{E}(n_{h11} \mid \mathrm{OR}_{MH}) \right)^2}{\mathrm{var}(n_{h11} \mid \mathrm{OR}_{MH})}$

where $\mathrm{E}$ and $\mathrm{var}$ denote expected value and variance, respectively. The summation does not include any table with a zero row or column. If $\mathrm{OR}_{MH}$ equals zero or if it is undefined, then PROC FREQ does not compute the statistic and prints a warning message.

For the Breslow-Day test to be valid, the sample size should be relatively large in each stratum, and at least 80% of the expected cell counts should be greater than 5. Note that this is a stricter sample size requirement than the requirement for the Cochran-Mantel-Haenszel test for $q \times 2 \times 2$ tables, in that each stratum sample size (not just the overall sample size) must be relatively large. Even when the Breslow-Day test is valid, it might not be very powerful against certain alternatives, as discussed in Breslow and Day (1980).

If you specify the BDT option, PROC FREQ computes the Breslow-Day test with Tarone’s adjustment, which subtracts an adjustment factor from $Q_{BD}$ to make the resulting statistic asymptotically chi-square. The Breslow-Day-Tarone statistic is computed as

$Q_{BDT} = Q_{BD} - \frac{\left( \sum_h \left( n_{h11} - \mathrm{E}(n_{h11} \mid \mathrm{OR}_{MH}) \right) \right)^2}{\sum_h \mathrm{var}(n_{h11} \mid \mathrm{OR}_{MH})}$

See Tarone (1985), Jones et al. (1989), and Breslow (1996) for more information.

Zelen’s Exact Test for Equal Odds Ratios

If you specify the EQOR option in the EXACT statement, PROC FREQ computes Zelen’s exact test for equal odds ratios for stratified $2 \times 2$ tables. Zelen’s test is an exact counterpart to the Breslow-Day asymptotic test for equal odds ratios. The reference set for Zelen’s test includes all possible $q \times 2 \times 2$ tables with the same row, column, and stratum totals as the observed multiway table and with the same sum of cell $(1,1)$ frequencies as the observed table. The test statistic is the probability of the observed $q \times 2 \times 2$ table conditional on the fixed margins, which is a product of hypergeometric probabilities.

The p-value for Zelen’s test is the sum of all table probabilities that are less than or equal to the observed table probability, where the sum is computed over all tables in the reference set determined by the fixed margins and the observed sum of cell $(1,1)$ frequencies. This test is similar to Fisher’s exact test for two-way tables. See Zelen (1971), Hirji (2006), and Agresti (1992) for more information. PROC FREQ computes Zelen’s exact test by using the polynomial multiplication algorithm of Hirji et al. (1996).
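For small tables, the definition of the p-value can be checked by direct enumeration. The Python sketch below is a brute-force illustration and deliberately not the polynomial multiplication algorithm that PROC FREQ uses; it assumes the table probabilities are normalized over the reference set, and the function name and `(n11, n12, n21, n22)` layout are hypothetical. Integer arithmetic keeps the tie comparison exact.

```python
from fractions import Fraction
from itertools import product
from math import comb


def zelen_exact_test(strata):
    """Exact p-value for equal odds ratios, by brute-force enumeration."""
    ranges, coefs, observed = [], [], []
    for a, b, c, d in strata:
        r1, r2, c1 = a + b, c + d, a + c
        lo, hi = max(0, c1 - r2), min(r1, c1)
        ranges.append(range(lo, hi + 1))
        # Unnormalized hypergeometric weight for each feasible n11.
        coefs.append({x: comb(r1, x) * comb(r2, c1 - x)
                      for x in range(lo, hi + 1)})
        observed.append(a)
    s0 = sum(observed)

    def weight(cells):
        w = 1
        for coef, x in zip(coefs, cells):
            w *= coef[x]
        return w

    w_obs = weight(observed)
    total = extreme = 0
    for cells in product(*ranges):
        if sum(cells) != s0:
            continue                 # outside the reference set
        w = weight(cells)
        total += w
        if w <= w_obs:
            extreme += w             # as or less probable than observed
    return Fraction(extreme, total)


p_value = zelen_exact_test([(2, 0, 0, 2), (0, 2, 2, 0)])
```

The enumeration grows exponentially with the number of strata, so this form is suitable only for checking small examples.
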

Exact Confidence Limits for the Common Odds Ratio

If you specify the COMOR option in the EXACT statement, PROC FREQ computes exact confidence limits for the common odds ratio for stratified $2 \times 2$ tables. This computation assumes that the odds ratio is constant over all the $2 \times 2$ tables. Exact confidence limits are constructed from the distribution of $S = \sum_h n_{h11}$, conditional on the marginal totals of the $2 \times 2$ tables.

Because this is a discrete problem, the confidence coefficient for these exact confidence limits is not exactly ($1 - \alpha$) but is at least ($1 - \alpha$). Thus, these confidence limits are conservative. See Agresti (1992) for more information.

PROC FREQ computes exact confidence limits for the common odds ratio by using an algorithm based on Vollset, Hirji, and Elashoff (1991). See also Mehta, Patel, and Gray (1985).

Conditional on the marginal totals of $2 \times 2$ table $h$, let the random variable $S_h$ denote the frequency of table cell $(1,1)$. Given the row totals $n_{h1\cdot}$ and $n_{h2\cdot}$ and column totals $n_{h\cdot 1}$ and $n_{h\cdot 2}$, the lower and upper bounds for $S_h$ are $l_h$ and $u_h$,

$\begin{aligned}
l_h &= \max(0, \; n_{h1\cdot} - n_{h\cdot 2}) \\
u_h &= \min(n_{h1\cdot}, \; n_{h\cdot 1})
\end{aligned}$

Let $C_{s_h}$ denote the hypergeometric coefficient,

$C_{s_h} = \binom{n_{h\cdot 1}}{s_h} \binom{n_{h\cdot 2}}{n_{h1\cdot} - s_h}$

and let $\phi$ denote the common odds ratio. Then the conditional distribution of $S_h$ is

$P(S_h = s_h \mid n_{h1\cdot}, n_{h\cdot 1}, n_{h\cdot 2}) = C_{s_h} \, \phi^{s_h} \Big/ \sum_{x = l_h}^{u_h} C_x \, \phi^x$

Summing over all the $2 \times 2$ tables, $S = \sum_h S_h$, and the lower and upper bounds of $S$ are $l$ and $u$,

$l = \sum_h l_h \quad \text{and} \quad u = \sum_h u_h$

The conditional distribution of the sum $S$ is

$P(S = s \mid n_{h1\cdot}, n_{h\cdot 1}, n_{h\cdot 2}; \; h = 1, \ldots, q) = C_s \, \phi^s \Big/ \sum_{x = l}^{u} C_x \, \phi^x$

where

$C_s = \sum_{s_1 + \cdots + s_q = s} \left( \prod_h C_{s_h} \right)$

Let $s_0$ denote the observed sum of cell (1,1) frequencies over the $q$ tables. The following two equations are solved iteratively for lower and upper confidence limits for the common odds ratio, $\phi_1$ and $\phi_2$:

$\begin{aligned}
\sum_{x = s_0}^{u} C_x \, \phi_1^x \Big/ \sum_{x = l}^{u} C_x \, \phi_1^x &= \alpha / 2 \\
\sum_{x = l}^{s_0} C_x \, \phi_2^x \Big/ \sum_{x = l}^{u} C_x \, \phi_2^x &= \alpha / 2
\end{aligned}$

When the observed sum $s_0$ equals the lower bound $l$, PROC FREQ sets the lower confidence limit to zero and determines the upper limit with level $\alpha$. Similarly, when the observed sum $s_0$ equals the upper bound $u$, PROC FREQ sets the upper confidence limit to infinity and determines the lower limit with level $\alpha$.

When you specify the COMOR option in the EXACT statement, PROC FREQ also computes the exact test that the common odds ratio equals one. Setting $\phi = 1$, the conditional distribution of the sum $S$ under the null hypothesis becomes

$P_0(S = s \mid n_{h1\cdot}, n_{h\cdot 1}, n_{h\cdot 2}; \; h = 1, \ldots, q) = C_s \Big/ \sum_{x = l}^{u} C_x$

The point probability for this exact test is the probability of the observed sum $s_0$ under the null hypothesis, conditional on the marginals of the stratified $2 \times 2$ tables, and is denoted by $P_0(s_0)$. The expected value of $S$ under the null hypothesis is

$\mathrm{E}_0(S) = \sum_{x = l}^{u} x \, C_x \Big/ \sum_{x = l}^{u} C_x$

The one-sided exact p-value is computed from the conditional distribution as $P_0(S \geq s_0)$ or $P_0(S \leq s_0)$, depending on whether the observed sum $s_0$ is greater or less than $\mathrm{E}_0(S)$,

$\begin{aligned}
P_1 &= P_0(S \geq s_0) = \sum_{x = s_0}^{u} C_x \Big/ \sum_{x = l}^{u} C_x \quad \text{if } s_0 > \mathrm{E}_0(S) \\
P_1 &= P_0(S \leq s_0) = \sum_{x = l}^{s_0} C_x \Big/ \sum_{x = l}^{u} C_x \quad \text{if } s_0 \leq \mathrm{E}_0(S)
\end{aligned}$

PROC FREQ computes two-sided p-values for this test according to three different definitions. A two-sided p-value is computed as twice the one-sided p-value, setting the result equal to one if it exceeds one,

$P_2^a = \min(2 \times P_1, \; 1)$

Additionally, a two-sided p-value is computed as the sum of all probabilities less than or equal to the point probability of the observed sum $s_0$, summing over all possible values of $s$, $l \leq s \leq u$,

$P_2^b = \sum_{l \leq s \leq u \, : \, P_0(s) \leq P_0(s_0)} P_0(s)$

Also, a two-sided p-value is computed as the sum of the one-sided p-value and the corresponding area in the opposite tail of the distribution, equidistant from the expected value,

$P_2^c = P_0 \left( |S - \mathrm{E}_0(S)| \geq |s_0 - \mathrm{E}_0(S)| \right)$
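The one-sided p-value and the three two-sided definitions can be illustrated with exact rational arithmetic. The Python sketch below builds the null distribution of the sum of cell (1,1) frequencies by multiplying the per-stratum coefficient polynomials; the function name, the dictionary keys, and the `(n11, n12, n21, n22)` layout are assumptions made for this example.

```python
from fractions import Fraction
from math import comb


def exact_common_or_test(strata):
    """Exact test that the common odds ratio equals one."""
    l, coefs = 0, [1]
    for a, b, c, d in strata:
        r1, c1, c2 = a + b, a + c, b + d
        lo, hi = max(0, r1 - c2), min(r1, c1)
        ch = [comb(c1, x) * comb(c2, r1 - x) for x in range(lo, hi + 1)]
        new = [0] * (len(coefs) + len(ch) - 1)
        for i, u in enumerate(coefs):        # polynomial multiplication
            for j, v in enumerate(ch):
                new[i + j] += u * v
        l, coefs = l + lo, new
    total = sum(coefs)
    probs = [Fraction(c, total) for c in coefs]   # null distribution of S - l
    k0 = sum(t[0] for t in strata) - l            # observed sum, offset by l
    mean = sum(k * p for k, p in enumerate(probs))
    upper_tail = sum(probs[k0:])
    lower_tail = sum(probs[:k0 + 1])
    one_sided = upper_tail if k0 > mean else lower_tail
    dist = abs(k0 - mean)
    return {
        "point": probs[k0],
        "one_sided": one_sided,
        "two_sided_doubled": min(Fraction(1), 2 * one_sided),
        "two_sided_prob": sum(p for p in probs if p <= probs[k0]),
        "two_sided_dist": sum(p for k, p in enumerate(probs)
                              if abs(k - mean) >= dist),
    }


result = exact_common_or_test([(2, 0, 0, 2)])
```

With a single stratum, the one-sided value agrees with Fisher's exact test for that table.
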
