This information is updated from David Schlotzhauer (1996), "Comparing Two Proportions: The Chi-Square Test, Power, and Sample Size," Observations: The Technical Journal for SAS Software Users 5(4): 59-62.
A test for comparing proportions from two independent samples can be performed by using the CHISQ option in the FREQ procedure (see note 1). Just think of your data as a 2x2 ("two-by-two") table. For example, suppose you ask 50 men and 50 women if they voted on election day. You find that 12 men voted (12/50 = 24%) and 18 women voted (18/50 = 36%), a difference of 12 percentage points. You want to test whether there is a significant difference in the probabilities of men and women voting in the population from which you sampled. Here is how this data can be arranged in a 2x2 table:
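Table of gender by vote (each cell shows Frequency and, in parentheses, Row Pct)

gender | vote=no | vote=yes | Total |
---|---|---|---|
female | 32 (64.00%) | 18 (36.00%) | 50 |
male | 38 (76.00%) | 12 (24.00%) | 50 |
Total | 70 | 30 | 100 |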
The following SAS statements created the table above. Note the use of the WEIGHT statement to enter the cell counts of the table. If you have raw data (one observation per person that contains values for GENDER and VOTE) instead of cell counts, then simply omit the WEIGHT statement.
    data vote;
       input gender $ vote $ count;
       datalines;
    male   yes 12
    male   no  38
    female yes 18
    female no  32
    ;
    proc freq data=vote;
       weight count;
       table gender*vote;
    run;
The Row Pct values in the vote=yes column are the proportions of men and women who voted in your sample. Pearson's chi-square statistic can be used to test the hypothesis that the corresponding population proportions are equal. This test can be requested with the CHISQ option, which you specify in the TABLE statement of PROC FREQ:
table gender*vote / chisq;
The following table of statistics is produced by the CHISQ option.
Statistic | DF | Value | Prob |
---|---|---|---|
Chi-Square | 1 | 1.7143 | 0.1904 |
Likelihood Ratio Chi-Square | 1 | 1.7230 | 0.1893 |
Continuity Adj. Chi-Square | 1 | 1.1905 | 0.2752 |
Mantel-Haenszel Chi-Square | 1 | 1.6971 | 0.1927 |
Phi Coefficient | | -0.1309 | |
Contingency Coefficient | | 0.1298 | |
Cramer's V | | -0.1309 | |
The first statistic, labeled Chi-Square, is Pearson's chi-square statistic, which has a value of 1.714 in this table. Large values of the chi-square statistic indicate inequality of the population proportions. The value in the Prob column enables you to assess whether the chi-square value is large. For this table, Prob=0.190 means that the probability of obtaining a chi-square value at least as large as 1.714 when there is really no difference in population proportions is 0.190.
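As a check on the Chi-Square row, Pearson's statistic for a 2x2 table can be computed directly from the four cell counts with the shortcut formula n(ad - bc)² divided by the product of the four marginal totals. The DATA step below is a minimal sketch and not part of the original note; the variable names a, b, c, d, chisq, and prob are chosen here for illustration.

```sas
/* Sketch: Pearson chi-square for the voting table computed by hand.   */
/* Variable names (a, b, c, d, chisq, prob) are illustrative only.     */
data _null_;
   a = 32; b = 18;    /* female: vote=no, vote=yes */
   c = 38; d = 12;    /* male:   vote=no, vote=yes */
   n = a + b + c + d;
   /* shortcut formula for a 2x2 table: n(ad-bc)^2 / product of margins */
   chisq = n * (a*d - b*c)**2 / ((a+b) * (c+d) * (a+c) * (b+d));
   /* upper-tail probability of a chi-square with 1 degree of freedom */
   prob = 1 - probchi(chisq, 1);
   put chisq= 8.4 prob= 8.4;
run;
```

The two values written to the log should agree with the Value and Prob entries in the Chi-Square row of the table above.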
Based on this information, should you reject the hypothesis of equality? Should you accept it? Can a conclusion even be made about the equality of the population proportions? First, consider that there are two types of errors that you can make based on your test. You make a type 1 error by rejecting equality when the population proportions really are equal. The chance of a type 1 error is denoted by alpha (α). It is also called the size or significance level of the test. You make a type 2 error by accepting equality when the population proportions are not equal. The chance of a type 2 error is denoted by beta (β) for a given size difference between the population proportions. Discussion of the basic concepts that underlie statistical hypothesis testing can be found in many statistical theory texts, such as Lindgren (1976).
If you reject equality in the voting example, then your chance of making a type 1 error (α) is .19. If this chance were smaller (say, .05 or less), then most analysts would conclude that observing a value of Pearson's statistic as large as 1.714 must indicate that there really is a difference in population proportions. They would then reject the hypothesis of equality and accept the .05 chance of a type 1 error. However, in this case, the .19 chance indicates that such large statistic values can occur reasonably often when the population proportions are equal. Because you might have merely observed such a case with this data, you cannot conclude that the population proportions differ.
But should you conclude that men and women really vote in equal proportions? There was, after all, a difference of 12% in the observed proportions. Just as you needed to know the chance of a type 1 error when deciding whether to reject equality, you now need to know the probability of making a type 2 error if you accept the hypothesis of equality. The power of this test is the probability of rejecting equality when the population proportions differ by a given amount. Beta, the probability of a type 2 error, is the probability of accepting equality given this amount of difference and is simply 1 - power.
The %POWER2x2 macro provides power and beta of the Pearson chi-square statistic when computed for 2x2 tables such as this voting example. Some details of the power computations that are used by the macro are given in note 2.
Power depends on sample size, the significance level of the test, and the unknown population proportions. For each of these, supply values at which you are interested in obtaining power. It's a good idea to compute power for several settings close to what you expect the true proportions to be. For this example, assume that the population proportions really differ by the 12% observed. Setting the significance level of the test (chance of a type 1 error) at .05 and both sample sizes at 50 will provide the power of the test that was performed above.
%power2x2(p1=.36, p2=.24, n1=50, n2=50)
This program generates the following output:
Power for comparing two independent proportions
p1=.36, p2=.24, level=.05

Total Sample Size | Power | Beta |
---|---|---|
100 | 0.25630 | 0.74370 |
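The macro source is not reproduced here, but the computation described in note 2 can be tried for a single scenario in a short DATA step. The following is a sketch under stated assumptions, not the %POWER2x2 macro itself: it hard-codes the example's proportions and sample sizes and assumes level=.05, the level shown in the output title.

```sas
/* Sketch of the note 2 computations for one scenario.                */
/* Not the %POWER2x2 macro; level=.05 is assumed from the output title. */
data _null_;
   p1 = .36; p2 = .24;      /* assumed population proportions */
   n1 = 50;  n2 = 50;       /* group sample sizes             */
   level = .05;             /* two-sided significance level   */
   diff  = p1 - p2;
   ph0   = (n1*p1 + n2*p2) / (n1 + n2);          /* pooled proportion under H0 */
   stdh0 = sqrt(ph0*(1-ph0)*(1/n1 + 1/n2));      /* standard error under H0    */
   stdha = sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2);    /* standard error under Ha    */
   power = 1 - probnorm(-probit(level/2)*stdh0/stdha - diff/stdha)
             + probnorm( probit(level/2)*stdh0/stdha - diff/stdha);
   beta  = 1 - power;
   put power= 7.5 beta= 7.5;
run;
```

The values written to the log should be close to the Power and Beta columns shown above.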
Beginning with SAS 9, the power for this test can also be computed by using PROC POWER. The TWOSAMPLEFREQ statement provides power calculations for tests that compare two proportions. The TEST=PCHI option focuses the calculations on the Pearson chi-square test. Calculations for the likelihood ratio chi-square test and Fisher's exact test are also available. The following statements compute the power of the Pearson test for the voting example:
    proc power;
       twosamplefreq test=pchi
          groupproportions = (.36 .24)
          nullpdiff = 0
          npergroup = 50
          power = .;
    run;
Fixed Scenario Elements | |
---|---|
Distribution | Asymptotic normal |
Method | Normal approximation |
Null Proportion Difference | 0 |
Group 1 Proportion | 0.36 |
Group 2 Proportion | 0.24 |
Sample Size Per Group | 50 |
Number of Sides | 2 |
Alpha | 0.05 |
Computed Power |
---|
Power |
0.256 |
Based on the test, if the population proportions really differ by 12%, then your chance of incorrectly accepting equality is almost .75. Stated another way, your chance of detecting a 12% difference is only about .25. Of course, your chance of detecting larger differences will increase. But for the sample size, significance level, and difference in population proportions that are assumed above, the resulting risk is large and probably unacceptable. It seems the data that you've collected leaves you in a gray zone: you can neither accept nor reject the hypothesis of equality without incurring an unacceptably large risk of error.
Actually, the problem is that the data provides insufficient evidence to accept or reject equality. If you had gotten the same proportions from samples of 1000 men and 1000 women, then there would not have been this ambiguity. Power is affected by sample size: as sample size increases, power goes up and beta goes down. You can avoid inconclusive results, following the expense and effort of data collection, by first selecting a sample size that yields adequate power and acceptable beta.
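To see the effect of sample size on the test itself, the earlier PROC FREQ step can be rerun on a hypothetical table with the same observed proportions (24% and 36%) but 1000 people per group. The data set name vote1000 and the cell counts are invented for illustration; they are not data from the study.

```sas
/* Hypothetical data: same observed proportions, 1000 per group. */
data vote1000;
   input gender $ vote $ count;
   datalines;
male   yes 240
male   no  760
female yes 360
female no  640
;
proc freq data=vote1000;
   weight count;
   table gender*vote / chisq;
run;
```

Because every cell count is 20 times the original, Pearson's statistic is also 20 times larger (about 34.29 on 1 degree of freedom), so equality would be rejected at any conventional significance level.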
Suppose at the next election you plan to do another study of gender and voting and you would like to pick a sample size to avoid inconclusive results. Assuming equal-sized samples, you'd like to examine power and beta for total sample sizes that range from 10 to 1000. To compute the power for rejecting the null hypothesis of equal probabilities, you must specify the expected voting probabilities for men and women at which to calculate the power and the alpha level that you are willing to accept. Suppose you can tolerate a .05 probability of a type 1 error. From your previous study, a 30% overall voter turnout can be expected. The %POWER2x2 macro enables you to find a sample size that will detect a 12% difference in population proportions with reasonably high probability. Again, because the population proportions are unknown, you might want to try several reasonable settings of the difference.
%power2x2(p1=.36, p2=.24, nmin=10, nmax=1000)
The following output is generated:
Power for comparing two independent proportions
p1=.36, p2=.24, level=.05

Total Sample Size | Power | Beta |
---|---|---|
10 | 0.06778 | 0.93222 |
50 | 0.15025 | 0.84975 |
100 | 0.25630 | 0.74370 |
150 | 0.35978 | 0.64022 |
200 | 0.45656 | 0.54344 |
250 | 0.54429 | 0.45571 |
300 | 0.62192 | 0.37808 |
350 | 0.68927 | 0.31073 |
400 | 0.74678 | 0.25322 |
450 | 0.79520 | 0.20480 |
500 | 0.83550 | 0.16450 |
550 | 0.86870 | 0.13130 |
600 | 0.89580 | 0.10420 |
650 | 0.91775 | 0.08225 |
700 | 0.93539 | 0.06461 |
750 | 0.94948 | 0.05052 |
800 | 0.96066 | 0.03934 |
850 | 0.96949 | 0.03051 |
900 | 0.97643 | 0.02357 |
950 | 0.98185 | 0.01815 |
1000 | 0.98607 | 0.01393 |
Again, PROC POWER can be used to explore the power for a range of sample sizes:
    proc power;
       twosamplefreq test=pchi
          groupproportions = (.36 .24)
          nullpdiff = 0
          ntotal = 10, 50 to 1000 by 50
          power = .;
    run;
Fixed Scenario Elements | |
---|---|
Distribution | Asymptotic normal |
Method | Normal approximation |
Null Proportion Difference | 0 |
Group 1 Proportion | 0.36 |
Group 2 Proportion | 0.24 |
Number of Sides | 2 |
Alpha | 0.05 |
Group 1 Weight | 1 |
Group 2 Weight | 1 |
Computed Power | ||
---|---|---|
Index | N Total | Power |
1 | 10 | 0.068 |
2 | 50 | 0.150 |
3 | 100 | 0.256 |
4 | 150 | 0.360 |
5 | 200 | 0.457 |
6 | 250 | 0.544 |
7 | 300 | 0.622 |
8 | 350 | 0.689 |
9 | 400 | 0.747 |
10 | 450 | 0.795 |
11 | 500 | 0.836 |
12 | 550 | 0.869 |
13 | 600 | 0.896 |
14 | 650 | 0.918 |
15 | 700 | 0.935 |
16 | 750 | 0.949 |
17 | 800 | 0.961 |
18 | 850 | 0.969 |
19 | 900 | 0.976 |
20 | 950 | 0.982 |
21 | 1000 | 0.986 |
Unfortunately, these results show that it will take a lot more data to avoid inconclusive results. If you want roughly a .90 chance of detecting a difference of 12% between the population proportions, then you'll need to sample about 300 men and 300 women: at a total sample size of 600, the power is .896 and the chance of making a type 2 error falls to about .10. Reaching a power of .90 or more requires a total sample size between 600 and 650.
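Instead of reading the sample size off a table of powers, PROC POWER can solve for it: give the target power and mark the sample size as missing. The step below is a sketch that reuses the options from the earlier PROC POWER runs; the target of .90 is the value discussed above.

```sas
/* Sketch: let PROC POWER solve for the per-group sample size. */
proc power;
   twosamplefreq test=pchi
      groupproportions = (.36 .24)
      nullpdiff        = 0
      power            = .90
      npergroup        = .;   /* missing value marks the quantity to solve for */
run;
```

Given the powers tabulated for total sample sizes of 600 and 650, the reported per-group size should fall between 300 and 325.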
The sample sizes that are shown are estimates based on your guesses of the voting probabilities. As the probabilities become either very small or very large, the variance of the difference in proportions decreases, causing the power for a given sample size to increase. The sample size estimates are also affected by the difference that you want to detect and the type 1 error rate that you choose. If you decide that you want to detect a smaller difference in the population proportions, then more data will be required. If you decide that you are willing to accept only a .01 type 1 error probability, instead of .05, then you'll again need more data. Experiment with different settings and notice the results.
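One way to run such an experiment is to give PROC POWER several values for a setting; it then reports power for each combination. The sketch below varies the significance level and the difference to detect while holding the sample size at 300 per group. The reference proportion of .24 and the particular differences and alpha values are illustrative choices, not recommendations from the original note.

```sas
/* Sketch: power for several differences and significance levels. */
proc power;
   twosamplefreq test=pchi
      refproportion  = .24           /* group 1 proportion (illustrative)   */
      proportiondiff = .08 .12 .16   /* group 2 minus group 1 (illustrative) */
      alpha          = .01 .05
      npergroup      = 300
      power          = .;
run;
```

Smaller differences and the stricter .01 level should both show lower power at this sample size, in line with the paragraph above.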
Berry, J. J., and G. I. Hurtado. 1994. "Comparing Non-independent Proportions." Observations: The Technical Journal for SAS Software Users.
Lindgren, B. W. 1976. Statistical Theory. 3d ed. New York: Macmillan Publishing Co.
Note 2. Power computations used by the %POWER2x2 macro:

    diff  = p1 - p2;
    ph0   = (n1*p1 + n2*p2) / (n1 + n2);
    stdh0 = sqrt(ph0*(1-ph0)*(1/n1 + 1/n2));
    stdha = sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2);
    power = 1 - probnorm(-probit(level/2)*stdh0/stdha - diff/stdha)
              + probnorm( probit(level/2)*stdh0/stdha - diff/stdha);
    beta  = 1 - power;
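In conventional notation, the statements above can be read as the following normal-approximation formulas. The symbols are chosen here for illustration and do not appear in the original note: δ stands for diff, p̄ for ph0, σ₀ and σ_A for stdh0 and stdha, α for level, Φ for the standard normal distribution function (PROBNORM), and z₁₋α/₂ for -probit(level/2).

```latex
\begin{aligned}
\delta &= p_1 - p_2, \qquad
\bar{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}, \\
\sigma_0 &= \sqrt{\bar{p}\,(1-\bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}, \qquad
\sigma_A = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}, \\
\text{power} &= 1 - \Phi\!\left(\frac{z_{1-\alpha/2}\,\sigma_0 - \delta}{\sigma_A}\right)
              + \Phi\!\left(\frac{-z_{1-\alpha/2}\,\sigma_0 - \delta}{\sigma_A}\right), \qquad
\beta = 1 - \text{power}.
\end{aligned}
```

The two Φ terms are the probabilities of rejecting in the upper and lower tails of the two-sided test, evaluated under the assumed difference δ.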
Type: | Usage Note |
Date Modified: | 2019-05-02 16:40:54 |
Date Created: | 2005-06-09 11:20:48 |