The CHISQ option provides chi-square tests of homogeneity or independence and measures of association that are based on the chi-square statistic. When you specify the CHISQ option in the TABLES statement, PROC FREQ computes the following chi-square tests for each two-way table: Pearson chi-square, likelihood ratio chi-square, and Mantel-Haenszel chi-square tests. PROC FREQ provides the following measures of association that are based on the Pearson chi-square statistic: phi coefficient, contingency coefficient, and Cramér’s V. For $2 \times 2$ tables, the CHISQ option also provides Fisher’s exact test and the continuity-adjusted chi-square statistic. You can request Fisher’s exact test for general $R \times C$ tables by specifying the FISHER option in the TABLES or EXACT statement.
If you specify the CHISQ option for one-way tables, PROC FREQ provides a one-way Pearson chi-square goodness-of-fit test. If you specify the CHISQ(LRCHISQ) option for one-way tables, PROC FREQ also provides a one-way likelihood ratio chi-square test. The other tests and statistics that the CHISQ option produces are available only for two-way tables.
For two-way tables, the null hypothesis for the chi-square tests is no association between the row variable and the column variable. When the sample size $n$ is large, the test statistics have asymptotic chi-square distributions under the null hypothesis. When the sample size is not large, or when the data set is sparse or heavily tied, exact tests might be more appropriate than asymptotic tests. PROC FREQ provides exact p-values for the Pearson chi-square, likelihood ratio chi-square, and Mantel-Haenszel chi-square tests, in addition to Fisher’s exact test. For one-way tables, PROC FREQ provides exact p-values for the Pearson and likelihood ratio chi-square goodness-of-fit tests. You can request these exact tests by specifying the corresponding options in the EXACT statement. See the section Exact Statistics for more information.
The Mantel-Haenszel chi-square statistic is appropriate only when both variables lie on an ordinal scale. The other chi-square tests and statistics in this section are appropriate for either nominal or ordinal variables. The following sections give the formulas that PROC FREQ uses to compute the chi-square tests and statistics. For more information about these statistics, see Agresti (2007) and Stokes, Davis, and Koch (2012), and the other references cited.
For one-way frequency tables, the CHISQ option in the TABLES statement provides a chi-square goodness-of-fit test. Let $C$ denote the number of classes, or levels, in the one-way table. Let $f_i$ denote the frequency of class $i$ (or the number of observations in class $i$) for $i = 1, 2, \ldots, C$. Then PROC FREQ computes the one-way chi-square statistic as

$$Q_P = \sum_{i=1}^{C} \frac{(f_i - e_i)^2}{e_i}$$

where $e_i$ is the expected frequency for class $i$ under the null hypothesis.
In the test for equal proportions, which is the default for the CHISQ option, the null hypothesis specifies equal proportions of the total sample size for each class. Under this null hypothesis, the expected frequency for each class equals the total sample size $n$ divided by the number of classes,

$$e_i = \frac{n}{C} \quad \text{for } i = 1, 2, \ldots, C$$
In the test for specified frequencies, which PROC FREQ computes when you input null hypothesis frequencies by using the TESTF= option, the expected frequencies are the TESTF= values that you specify. In the test for specified proportions, which PROC FREQ computes when you input null hypothesis proportions by using the TESTP= option, the expected frequencies are determined from the specified TESTP= proportions $p_i$ as

$$e_i = p_i \times n \quad \text{for } i = 1, 2, \ldots, C$$
Under the null hypothesis (of equal proportions, specified frequencies, or specified proportions), $Q_P$ has an asymptotic chi-square distribution with $C - 1$ degrees of freedom.
In addition to the asymptotic test, you can request an exact one-way chi-square test by specifying the CHISQ option in the EXACT statement. See the section Exact Statistics for more information.
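The one-way goodness-of-fit computation described above can be illustrated with a minimal Python sketch. This is not PROC FREQ's implementation; the function name `one_way_chisq` and its `test_p` and `test_f` arguments are hypothetical stand-ins for the TESTP= and TESTF= values, and `test_p` is assumed to hold proportions that sum to 1.

```python
def one_way_chisq(freqs, test_p=None, test_f=None):
    """Return (Q_P, degrees of freedom) for a one-way table of class frequencies."""
    n = sum(freqs)   # total sample size
    c = len(freqs)   # number of classes C
    if test_f is not None:
        # Specified frequencies: use them directly as expected frequencies.
        expected = list(test_f)
    elif test_p is not None:
        # Specified proportions (assumed to sum to 1): e_i = p_i * n.
        expected = [p * n for p in test_p]
    else:
        # Default test for equal proportions: e_i = n / C.
        expected = [n / c] * c
    q_p = sum((f - e) ** 2 / e for f, e in zip(freqs, expected))
    return q_p, c - 1


# Equal proportions: expected frequencies are 20, 20, 20.
q_equal, df_equal = one_way_chisq([10, 20, 30])
# Specified proportions: expected frequencies are 15, 15, 30.
q_spec, df_spec = one_way_chisq([10, 20, 30], test_p=[0.25, 0.25, 0.5])
```

For the frequencies 10, 20, 30, the equal-proportions statistic is 10 on 2 degrees of freedom; supplying null proportions changes only the expected frequencies, not the form of the statistic.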
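The Pearson chi-square for two-way tables involves the differences between the observed and expected frequencies, where the expected frequencies are computed under the null hypothesis of independence. The Pearson chi-square statistic is computed as

$$Q_P = \sum_{i} \sum_{j} \frac{(n_{ij} - e_{ij})^2}{e_{ij}}$$

where $n_{ij}$ is the observed frequency in table cell $(i, j)$ and $e_{ij}$ is the expected frequency for table cell $(i, j)$. The expected frequency is computed under the null hypothesis that the row and column variables are independent,

$$e_{ij} = \frac{n_{i\cdot} \, n_{\cdot j}}{n}$$

where $n_{i\cdot}$ is the total for row $i$, $n_{\cdot j}$ is the total for column $j$, and $n$ is the overall total.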
When the row and column variables are independent, $Q_P$ has an asymptotic chi-square distribution with $(R-1)(C-1)$ degrees of freedom. For large values of $Q_P$, this test rejects the null hypothesis in favor of the alternative hypothesis of general association.
In addition to the asymptotic test, you can request an exact Pearson chi-square test by specifying the PCHI or CHISQ option in the EXACT statement. See the section Exact Statistics for more information.
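As an illustration of the two-way Pearson statistic, the following Python sketch computes the expected frequencies from the row and column totals and accumulates the statistic cell by cell. The function name `pearson_chisq` is hypothetical, the table is assumed to be a list of rows with no zero row or column totals, and the code is a sketch rather than PROC FREQ's implementation.

```python
def pearson_chisq(table):
    """Return (Q_P, degrees of freedom) for an R x C table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    q_p = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            # Expected frequency under independence: (row total * column total) / n.
            e_ij = row_totals[i] * col_totals[j] / n
            q_p += (n_ij - e_ij) ** 2 / e_ij
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return q_p, df


q_p, df = pearson_chisq([[10, 20], [20, 10]])
```

In this example every expected frequency is 15, so each of the four cells contributes 25/15 and the statistic is 20/3 on 1 degree of freedom.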
For $2 \times 2$ tables, the Pearson chi-square is also appropriate for testing the equality of two binomial proportions. For $R \times 2$ and $2 \times C$ tables, the Pearson chi-square tests the homogeneity of proportions. See Fienberg (1980) for details.
When you specify the CROSSLIST(STDRES) option in the TABLES statement for two-way or multiway tables, PROC FREQ displays the standardized residuals in the CROSSLIST table.
The standardized residual of a crosstabulation table cell is the ratio of (frequency – expected) to its standard error, where frequency is the table cell frequency and expected is the estimated expected cell frequency. The expected frequency is computed under the null hypothesis that the row and column variables are independent. See the section Pearson Chi-Square Test for Two-Way Tables for more information.
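PROC FREQ computes the standardized residual of table cell $(i, j)$ as

$$\frac{n_{ij} - e_{ij}}{\sqrt{e_{ij} \, (1 - p_{i\cdot}) \, (1 - p_{\cdot j})}}$$

where $n_{ij}$ is the observed frequency of table cell $(i, j)$, $e_{ij}$ is the expected frequency of the table cell, $p_{i\cdot}$ is the proportion in row $i$ ($n_{i\cdot} / n$), and $p_{\cdot j}$ is the proportion in column $j$ ($n_{\cdot j} / n$). The expected frequency of table cell $(i, j)$ is computed as

$$e_{ij} = \frac{n_{i\cdot} \, n_{\cdot j}}{n}$$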
Under the null hypothesis of independence, each standardized residual has an asymptotic standard normal distribution. See Section 2.4.5 of Agresti (2007) for more information.
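The standardized residual computation can be sketched in Python as follows. The function name `standardized_residuals` is hypothetical, the table is assumed to be a list of rows in which no single row or column holds all of the observations, and the code is an illustration rather than PROC FREQ's implementation.

```python
import math


def standardized_residuals(table):
    """Return an R x C list of standardized residuals for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, n_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n   # expected under independence
            p_row = row_totals[i] / n                  # proportion in row i
            p_col = col_totals[j] / n                  # proportion in column j
            std_err = math.sqrt(e_ij * (1 - p_row) * (1 - p_col))
            res_row.append((n_ij - e_ij) / std_err)
        residuals.append(res_row)
    return residuals


res = standardized_residuals([[10, 20], [20, 10]])
```

Residuals with large absolute values (compared with standard normal quantiles) point to the cells that depart most from independence. In a 2 × 2 table all four residuals have the same magnitude, and its square equals the Pearson chi-square.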
For one-way frequency tables, the CHISQ(LRCHISQ) option in the TABLES statement provides a likelihood ratio chi-square goodness-of-fit test. By default, the likelihood ratio test is based on the null hypothesis of equal proportions in the $C$ classes (levels) of the one-way table. If you specify null hypothesis proportions or frequencies by using the CHISQ(TESTP=) or CHISQ(TESTF=) option, respectively, the likelihood ratio test is based on the null hypothesis values that you specify.
PROC FREQ computes the one-way likelihood ratio test as

$$G^2 = 2 \sum_{i=1}^{C} f_i \, \ln\!\left(\frac{f_i}{e_i}\right)$$

where $f_i$ is the observed frequency of class $i$, and $e_i$ is the expected frequency of class $i$ under the null hypothesis.
For the null hypothesis of equal proportions, the expected frequency of each class equals the total sample size $n$ divided by the number of classes,

$$e_i = \frac{n}{C} \quad \text{for } i = 1, 2, \ldots, C$$
If you provide null hypothesis frequencies by specifying the CHISQ(TESTF=) option in the TABLES statement, the expected frequencies are the TESTF= values that you specify. If you provide null hypothesis proportions by specifying the CHISQ(TESTP=) option in the TABLES statement, PROC FREQ computes the expected frequencies as

$$e_i = p_i \times n \quad \text{for } i = 1, 2, \ldots, C$$

where the proportions $p_i$ are the TESTP= values that you specify.
Under the null hypothesis (of equal proportions, specified frequencies, or specified proportions), the likelihood ratio statistic has an asymptotic chi-square distribution with $C - 1$ degrees of freedom.
In addition to the asymptotic test, you can request an exact one-way likelihood ratio chi-square test by specifying the LRCHISQ option in the EXACT statement. See the section Exact Statistics for more information.
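A minimal Python sketch of the one-way likelihood ratio statistic follows. The function name `one_way_lr_chisq` and its `expected` argument are hypothetical; classes with a zero observed frequency are assumed to contribute zero to the sum (the usual limiting convention), and this is an illustration rather than PROC FREQ's implementation.

```python
import math


def one_way_lr_chisq(freqs, expected=None):
    """Return (G^2, degrees of freedom) for a one-way table of class frequencies."""
    n = sum(freqs)
    c = len(freqs)
    if expected is None:
        # Default null hypothesis of equal proportions: e_i = n / C.
        expected = [n / c] * c
    # Assumption: a class with f_i = 0 contributes 0 to the sum.
    g2 = 2.0 * sum(f * math.log(f / e) for f, e in zip(freqs, expected) if f > 0)
    return g2, c - 1


g2, df = one_way_lr_chisq([10, 20, 30])
```

For the frequencies 10, 20, 30 with equal expected frequencies of 20, the statistic is approximately 10.465 on 2 degrees of freedom, close to but not equal to the Pearson value of 10 for the same data.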
The likelihood ratio chi-square involves the ratios between the observed and expected frequencies. The likelihood ratio chi-square statistic is computed as

$$G^2 = 2 \sum_{i} \sum_{j} n_{ij} \, \ln\!\left(\frac{n_{ij}}{e_{ij}}\right)$$

where $n_{ij}$ is the observed frequency in table cell $(i, j)$ and $e_{ij}$ is the expected frequency for table cell $(i, j)$.
When the row and column variables are independent, $G^2$ has an asymptotic chi-square distribution with $(R-1)(C-1)$ degrees of freedom.
In addition to the asymptotic test, you can request an exact likelihood ratio chi-square test by specifying the LRCHI or CHISQ option in the EXACT statement. See the section Exact Statistics for more information.
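The two-way likelihood ratio statistic can be sketched in Python in the same way. The function name `lr_chisq` is hypothetical, empty cells are assumed to contribute zero to the sum, and the code is an illustration rather than PROC FREQ's implementation.

```python
import math


def lr_chisq(table):
    """Return (G^2, degrees of freedom) for an R x C table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            if n_ij > 0:  # assumption: empty cells contribute 0
                e_ij = row_totals[i] * col_totals[j] / n
                total += n_ij * math.log(n_ij / e_ij)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return 2.0 * total, df


g2, df = lr_chisq([[10, 20], [20, 10]])
```

For this table the statistic is approximately 6.796 on 1 degree of freedom.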
The continuity-adjusted chi-square for $2 \times 2$ tables is similar to the Pearson chi-square, but it is adjusted for the continuity of the chi-square distribution. The continuity-adjusted chi-square is most useful for small sample sizes. The use of the continuity adjustment is somewhat controversial; this chi-square test is more conservative (and more like Fisher’s exact test) when the sample size is small. As the sample size increases, the continuity-adjusted chi-square becomes more like the Pearson chi-square.
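The continuity-adjusted chi-square statistic is computed as

$$Q_C = \sum_{i} \sum_{j} \frac{\left( \max\left(0, \; |n_{ij} - e_{ij}| - 0.5 \right) \right)^2}{e_{ij}}$$

Under the null hypothesis of independence, $Q_C$ has an asymptotic chi-square distribution with $(R-1)(C-1)$ degrees of freedom.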
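A short Python sketch of the continuity adjustment follows: each absolute difference between observed and expected frequency is reduced by 0.5, but never below zero, before it is squared. The function name `continuity_adj_chisq` is hypothetical, and the code is an illustration rather than PROC FREQ's implementation.

```python
def continuity_adj_chisq(table):
    """Return the continuity-adjusted chi-square for a 2 x 2 table (list of two rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    q_c = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n
            # Shrink |observed - expected| by 0.5, truncating at zero.
            adjusted = max(0.0, abs(n_ij - e_ij) - 0.5)
            q_c += adjusted ** 2 / e_ij
    return q_c


q_c = continuity_adj_chisq([[10, 20], [20, 10]])
```

For this table the adjusted statistic is 5.4, compared with an unadjusted Pearson chi-square of about 6.667, which shows the more conservative behavior described above.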
The Mantel-Haenszel chi-square statistic tests the alternative hypothesis that there is a linear association between the row variable and the column variable. Both variables must lie on an ordinal scale. The Mantel-Haenszel chi-square statistic is computed as

$$Q_{MH} = (n - 1) \, r^2$$

where $r$ is the Pearson correlation between the row variable and the column variable. For a description of the Pearson correlation, see the section Pearson Correlation Coefficient. The Pearson correlation and thus the Mantel-Haenszel chi-square statistic use the scores that you specify in the SCORES= option in the TABLES statement. See Mantel and Haenszel (1959) and Landis, Heyman, and Koch (1978) for more information.
Under the null hypothesis of no association, $Q_{MH}$ has an asymptotic chi-square distribution with one degree of freedom.
In addition to the asymptotic test, you can request an exact Mantel-Haenszel chi-square test by specifying the MHCHI or CHISQ option in the EXACT statement. See the section Exact Statistics for more information.
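The Mantel-Haenszel statistic can be illustrated with the following Python sketch, which computes the frequency-weighted Pearson correlation of the row and column scores and then forms $(n-1) r^2$. The function name `mantel_haenszel_chisq` is hypothetical. When no scores are supplied, the sketch assumes integer scores 1, 2, ... for the rows and columns, which is an assumption of this example and not necessarily what the SCORES= option would use for your data. The asymptotic p-value uses the fact that a chi-square variable with one degree of freedom is a squared standard normal variable.

```python
import math


def mantel_haenszel_chisq(table, row_scores=None, col_scores=None):
    """Return (Q_MH, r, asymptotic p-value) for an R x C table given as a list of rows."""
    n_rows, n_cols = len(table), len(table[0])
    # Assumption for this sketch: default scores are 1..R and 1..C.
    if row_scores is None:
        row_scores = list(range(1, n_rows + 1))
    if col_scores is None:
        col_scores = list(range(1, n_cols + 1))
    cells = [(row_scores[i], col_scores[j], table[i][j])
             for i in range(n_rows) for j in range(n_cols)]
    n = sum(w for _, _, w in cells)
    mean_r = sum(x * w for x, _, w in cells) / n
    mean_c = sum(y * w for _, y, w in cells) / n
    ss_rc = sum(w * (x - mean_r) * (y - mean_c) for x, y, w in cells)
    ss_r = sum(w * (x - mean_r) ** 2 for x, _, w in cells)
    ss_c = sum(w * (y - mean_c) ** 2 for _, y, w in cells)
    r = ss_rc / math.sqrt(ss_r * ss_c)   # requires both variables to vary
    q_mh = (n - 1) * r * r
    # Upper-tail probability of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(q_mh / 2.0))
    return q_mh, r, p_value


q_mh, r, p_value = mantel_haenszel_chisq([[10, 20], [20, 10]])
```

For this table the correlation is -1/3, so the statistic is 59/9 (about 6.556). In a 2 × 2 table the statistic equals the Pearson chi-square multiplied by (n − 1)/n.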
Fisher’s exact test is another test of association between the row and column variables. This test assumes that the row and column totals are fixed, and then uses the hypergeometric distribution to compute probabilities of possible tables conditional on the observed row and column totals. Fisher’s exact test does not depend on any large-sample distribution assumptions, and so it is appropriate even for small sample sizes and for sparse tables.
For $2 \times 2$ tables, PROC FREQ gives the following information for Fisher’s exact test: table probability, two-sided p-value, left-sided p-value, and right-sided p-value. The table probability equals the hypergeometric probability of the observed table, and is in fact the value of the test statistic for Fisher’s exact test.
Where $p$ is the hypergeometric probability of a specific table with the observed row and column totals, Fisher’s exact p-values are computed by summing probabilities $p$ over defined sets of tables,

$$\mathrm{PROB} = \sum_{A} p$$

The two-sided p-value is the sum of all possible table probabilities (conditional on the observed row and column totals) that are less than or equal to the observed table probability. For the two-sided p-value, the set $A$ includes all possible tables with hypergeometric probabilities less than or equal to the probability of the observed table. A small two-sided p-value supports the alternative hypothesis of association between the row and column variables.
For $2 \times 2$ tables, one-sided p-values for Fisher’s exact test are defined in terms of the frequency of the cell in the first row and first column of the table, the (1,1) cell. Denoting the observed (1,1) cell frequency by $n_{11}$, the left-sided p-value for Fisher’s exact test is the probability that the (1,1) cell frequency is less than or equal to $n_{11}$. For the left-sided p-value, the set $A$ includes those tables with a (1,1) cell frequency less than or equal to $n_{11}$. A small left-sided p-value supports the alternative hypothesis that the probability of an observation being in the first cell is actually less than expected under the null hypothesis of independent row and column variables.
Similarly, for a right-sided alternative hypothesis, $A$ is the set of tables where the frequency of the (1,1) cell is greater than or equal to that in the observed table. A small right-sided p-value supports the alternative that the probability of the first cell is actually greater than that expected under the null hypothesis.
Because the (1,1) cell frequency completely determines the $2 \times 2$ table when the marginal row and column sums are fixed, these one-sided alternatives can be stated equivalently in terms of other cell probabilities or ratios of cell probabilities. The left-sided alternative is equivalent to an odds ratio less than 1, where the odds ratio equals ($n_{11} n_{22} / n_{12} n_{21}$). Additionally, the left-sided alternative is equivalent to the column 1 risk for row 1 being less than the column 1 risk for row 2, $p_{1|1} < p_{1|2}$. Similarly, the right-sided alternative is equivalent to the column 1 risk for row 1 being greater than the column 1 risk for row 2, $p_{1|1} > p_{1|2}$. See Agresti (2007) for details.
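The p-value definitions above for a $2 \times 2$ table can be sketched by direct enumeration in Python. With the margins fixed, each possible table is indexed by its (1,1) cell frequency, and its probability is hypergeometric. The function name `fisher_exact_2x2` is hypothetical, and exact rational arithmetic is used here so that ties in table probability are compared exactly; this is an illustration by enumeration, not the algorithm that PROC FREQ uses.

```python
from fractions import Fraction
from math import comb


def fisher_exact_2x2(table):
    """Return table probability and two-sided, left-sided, right-sided p-values."""
    (n11, n12), (n21, n22) = table
    r1, r2 = n11 + n12, n21 + n22      # fixed row totals
    c1 = n11 + n21                     # fixed first-column total
    n = r1 + r2
    lo, hi = max(0, c1 - r2), min(r1, c1)   # feasible (1,1) cell frequencies
    denom = comb(n, c1)
    # Hypergeometric probability of each table with the observed margins.
    prob = {k: Fraction(comb(r1, k) * comb(r2, c1 - k), denom)
            for k in range(lo, hi + 1)}
    p_obs = prob[n11]
    return {
        "table_prob": float(p_obs),
        "two_sided": float(sum(p for p in prob.values() if p <= p_obs)),
        "left_sided": float(sum(p for k, p in prob.items() if k <= n11)),
        "right_sided": float(sum(p for k, p in prob.items() if k >= n11)),
    }


result = fisher_exact_2x2([[3, 1], [1, 3]])
```

For this table, the possible (1,1) frequencies 0 through 4 have probabilities 1/70, 16/70, 36/70, 16/70, and 1/70. The observed table has probability 16/70, so the two-sided p-value is 34/70, the left-sided p-value is 69/70, and the right-sided p-value is 17/70.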
Fisher’s exact test was extended to general $R \times C$ tables by Freeman and Halton (1951), and this test is also known as the Freeman-Halton test. For $R \times C$ tables, the two-sided p-value definition is the same as for $2 \times 2$ tables. The set $A$ contains all tables with $p$ less than or equal to the probability of the observed table. A small p-value supports the alternative hypothesis of association between the row and column variables. For $R \times C$ tables, Fisher’s exact test is inherently two-sided. The alternative hypothesis is defined only in terms of general, and not linear, association. Therefore, Fisher’s exact test does not have right-sided or left-sided p-values for general $R \times C$ tables.
For $R \times C$ tables, PROC FREQ computes Fisher’s exact test by using the network algorithm of Mehta and Patel (1983), which provides a faster and more efficient solution than direct enumeration. See the section Exact Statistics for more details.
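The phi coefficient is a measure of association derived from the Pearson chi-square. The range of the phi coefficient is $-1 \le \phi \le 1$ for $2 \times 2$ tables. For tables larger than $2 \times 2$, the range is $0 \le \phi \le \min(\sqrt{R-1}, \sqrt{C-1})$ (Liebetrau, 1983). The phi coefficient is computed as

$$\phi = \frac{n_{11} n_{22} - n_{12} n_{21}}{\sqrt{n_{1\cdot} \, n_{2\cdot} \, n_{\cdot 1} \, n_{\cdot 2}}} \quad \text{for } 2 \times 2 \text{ tables}$$

$$\phi = \sqrt{Q_P / n} \quad \text{otherwise}$$

See Fleiss, Levin, and Paik (2003, pp. 98–99) for more information.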
The contingency coefficient is a measure of association derived from the Pearson chi-square. The range of the contingency coefficient is $0 \le P \le \sqrt{(m-1)/m}$, where $m = \min(R, C)$ (Liebetrau, 1983). The contingency coefficient is computed as

$$P = \sqrt{\frac{Q_P}{Q_P + n}}$$

See Kendall and Stuart (1979, pp. 587–588) for more information.
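Cramér’s V is a measure of association derived from the Pearson chi-square. It is designed so that the attainable upper bound is always 1. The range of Cramér’s V is $-1 \le V \le 1$ for $2 \times 2$ tables; for tables larger than $2 \times 2$, the range is $0 \le V \le 1$. Cramér’s V is computed as

$$V = \phi \quad \text{for } 2 \times 2 \text{ tables}$$

$$V = \sqrt{\frac{Q_P / n}{\min(R-1, \, C-1)}} \quad \text{otherwise}$$

See Kendall and Stuart (1979, p. 588) for more information.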
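The three measures derived from the Pearson chi-square can be computed together, as in the following Python sketch. The function name `chisq_association_measures` is hypothetical, the table is assumed to be a list of rows with no zero row or column totals, and the code is an illustration rather than PROC FREQ's implementation.

```python
import math


def chisq_association_measures(table):
    """Return phi, the contingency coefficient, and Cramer's V for an R x C table."""
    n_rows, n_cols = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Pearson chi-square with expected frequencies (row total * column total) / n.
    q_p = sum(
        (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(n_rows) for j in range(n_cols)
    )
    if n_rows == 2 and n_cols == 2:
        # Signed form for 2 x 2 tables; Cramer's V equals phi in this case.
        (n11, n12), (n21, n22) = table
        phi = (n11 * n22 - n12 * n21) / math.sqrt(
            row_totals[0] * row_totals[1] * col_totals[0] * col_totals[1])
        cramers_v = phi
    else:
        phi = math.sqrt(q_p / n)
        cramers_v = math.sqrt((q_p / n) / min(n_rows - 1, n_cols - 1))
    contingency = math.sqrt(q_p / (q_p + n))
    return {"phi": phi, "contingency": contingency, "cramers_v": cramers_v}


m_2x2 = chisq_association_measures([[10, 20], [20, 10]])
m_3x3 = chisq_association_measures([[10, 0, 0], [0, 10, 0], [0, 0, 10]])
```

The $3 \times 3$ example is a perfectly associated table, so each measure reaches its upper bound: phi equals $\sqrt{2}$, the contingency coefficient equals $\sqrt{2/3}$, and Cramér’s V equals 1. Only Cramér’s V has the same upper bound regardless of table size.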