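There are no multiple comparison methods directly available in PROC FREQ, but there are a number of approaches to consider: examining the cell contributions to the chi-square statistic, plotting a correspondence analysis, running pairwise tests and adjusting their p-values in PROC MULTTEST, testing one category against the others merged together, and testing contrasts in a Poisson loglinear model fit with PROC GENMOD.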
Note that the methods illustrated in this note are appropriate for comparing proportions from independent samples, such as from multiple, independent groups of subjects. See this note for comparing dependent proportions, such as the proportions of a multinomial variable observed on a single sample of subjects.
These methods are illustrated in the following example. Note the significant chi-square test that is produced by the CHISQ option in PROC FREQ, indicating lack of independence between TYPE and SITE.
data melanoma;
input type:$13. site:$11. count;
cards;
Hutchinson's Head 22
Hutchinson's Trunk 2
Hutchinson's Extremities 10
Superficial Head 16
Superficial Trunk 54
Superficial Extremities 115
Nodular Head 19
Nodular Trunk 33
Nodular Extremities 73
Indeterminate Head 11
Indeterminate Trunk 17
Indeterminate Extremities 28
;
proc freq data=melanoma;
weight count;
tables type*site / chisq cellchi2 nopercent;
run;
[Output table: Statistics for Table of type by site]
The CELLCHI2 option displays the contributions of the cells to the overall chi-square statistic. Note that the Hutchinson's row accounts for the vast majority of the chi-square statistic, indicating that this row differs from the others. Similarly, the Head column accounts for much more of the chi-square statistic than the other two columns, indicating it differs from the others. These difference patterns can also be seen by comparing the patterns of row percentages across the rows and comparing the patterns of column percentages across the columns. A correspondence analysis plot makes the differences visually obvious. Note that Hutchinson's and Head are well separated from the other rows and columns, further confirming that this is the nature of the nonindependence within the table.
ods graphics on;
proc corresp data=melanoma;
weight count;
tables type, site;
run;
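Returning to the cell contributions reported by CELLCHI2: each contribution is (observed - expected)**2 / expected, where the expected count under independence is (row total * column total) / table total. For example, from the counts in the DATA step, the Hutchinson's/Head cell has an expected count of 34 * 68 / 400 = 5.78 and contributes (22 - 5.78)**2 / 5.78, or about 45.5. As a minimal sketch, the following step adds the EXPECTED and DEVIATION options so that the observed count, expected count, their difference, and the contribution appear together in each cell. The NOROW and NOCOL options are used here only to keep the table compact and are not part of the original example.

```sas
/* Sketch: show observed, expected, deviation, and cell chi-square together */
proc freq data=melanoma;
   weight count;
   /* EXPECTED  = expected count under independence          */
   /* DEVIATION = observed minus expected                    */
   /* CELLCHI2  = deviation squared divided by expected      */
   tables type*site / expected deviation cellchi2 norow nocol nopercent;
run;
```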
Pairwise comparisons among the types can be done by running PROC FREQ on each pair of types, with a WHERE statement in each PROC FREQ step to select the desired pair. The results of these tests appear to confirm that Hutchinson's differs from the other types. Pairwise comparisons among the columns could be done in a similar fashion.
ods output chisq(persist)=chisq(where=(statistic="Chi-Square")
rename=(Prob=Raw_P));
proc freq data=melanoma;
where type in ("Hutchinson's", "Superficial");
weight count;
tables type*site / chisq cellchi2 nopercent;
run;
proc freq data=melanoma;
where type in ("Hutchinson's", "Indeterminate");
weight count;
tables type*site / chisq cellchi2 nopercent;
run;
proc freq data=melanoma;
where type in ("Hutchinson's", "Nodular");
weight count;
tables type*site / chisq cellchi2 nopercent;
run;
proc freq data=melanoma;
where type in ("Indeterminate", "Nodular");
weight count;
tables type*site / chisq cellchi2 nopercent;
run;
proc freq data=melanoma;
where type in ("Indeterminate", "Superficial");
weight count;
tables type*site / chisq cellchi2 nopercent;
run;
proc freq data=melanoma;
where type in ("Nodular", "Superficial");
weight count;
tables type*site / chisq cellchi2 nopercent;
run;
ods output clear;
proc print noobs;
var value raw_p;
run;
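The six nearly identical PROC FREQ steps above could also be generated by a small macro. The following is only a sketch: the macro name %PAIRTEST and its parameters A and B are hypothetical and not part of the original note. To avoid passing the apostrophe in Hutchinson's as a macro argument, it selects types by a prefix comparison (the =: operator), which assumes that each prefix identifies exactly one type.

```sas
/* Hypothetical macro: run the type*site test for one pair of types. */
/* A and B are prefixes of the TYPE values (an assumption made here). */
%macro pairtest(a, b);
   proc freq data=melanoma;
      /* =: compares only the leading characters, so "Hut" matches Hutchinson's */
      where type =: "&a" or type =: "&b";
      weight count;
      tables type*site / chisq cellchi2 nopercent;
   run;
%mend pairtest;

/* Collect the Pearson chi-square row from every step, as in the steps above */
ods output chisq(persist)=chisq(where=(statistic="Chi-Square")
                                rename=(Prob=Raw_P));
%pairtest(Hut, Sup)
%pairtest(Hut, Ind)
%pairtest(Hut, Nod)
%pairtest(Ind, Nod)
%pairtest(Ind, Sup)
%pairtest(Nod, Sup)
ods output clear;

proc print data=chisq noobs;
   var value raw_p;
run;
```

The calls are made in the same order as the explicit steps, so the rows of the CHISQ data set line up with the pairs in the same way.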
Adjustment of p-values for the problem of multiple testing can be done in PROC MULTTEST. The ODS OUTPUT statements at the beginning and end of the preceding set of pairwise PROC FREQ steps create a data set called CHISQ that contains the Pearson chi-square results from all of the subtables. The PDATA= option in PROC MULTTEST enables you to read a data set that contains a set of p-values to be adjusted. Since MULTTEST requires that the p-values be in a variable named RAW_P, the RENAME= option is used in the first ODS OUTPUT statement above. Even using the fairly conservative Bonferroni adjustment, the first three tests that compare Hutchinson's with the other types are still significant. The other comparisons show no significant differences.
proc multtest pdata=chisq bon;
run;
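Bonferroni is only one of the adjustments that PROC MULTTEST offers. As a sketch, the same CHISQ data set could be submitted with additional adjustment options, such as the step-down Bonferroni (HOLM), Hochberg (HOC), and false discovery rate (FDR) methods. Which adjustment is appropriate depends on the analysis, and the conclusions stated above refer only to the Bonferroni results.

```sas
/* Same p-value data set as above; each option adds a column of adjusted p-values */
proc multtest pdata=chisq bon holm hoc fdr;
run;
```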
A test of Hutchinson's against the other three types merged together can be done by creating and using a format that groups the three types.
proc format;
value $typefmt H-I="Hutchinson's" I-T="Others";
run;
proc freq data=melanoma;
weight count;
tables type*site / chisq cellchi2 nopercent;
format type $typefmt.;
run;
[Output table: Statistics for Table of type by site]
The same tests of independence as provided by the CHISQ option in PROC FREQ can be obtained by fitting a Poisson loglinear model in PROC GENMOD. The following statements fit the independence model, which includes only the main effects. Nonindependence means that there is some interaction between the row and column variables. The Pearson chi-square statistic that is reported by PROC GENMOD is a test of the interaction and is identical to the chi-square from PROC FREQ run on the full table. Similarly, the Deviance matches the likelihood ratio chi-square from PROC FREQ.
proc genmod data=melanoma;
class type site;
model count=type site / dist=poisson;
run;
Adding the TYPE*SITE interaction to the model enables you to test the nature of the nonindependence by using CONTRAST statements. In the following statements, the TYPE3 option provides likelihood ratio tests of the main effects and interaction. Notice that the test of the interaction matches the Deviance in the previous independence model and the likelihood ratio chi-square obtained from PROC FREQ run on the full table.
You can add CONTRAST statements to test the hypothesis that Hutchinson's row differs from the other rows. For the pattern in Hutchinson's row to be the same as in another row, the differences between Extremities and Head and between Extremities and Trunk should be the same as in the other row. The first three CONTRAST statements that follow jointly test these differences between Hutchinson's and each of the other rows. To determine the coefficients that are needed in the CONTRAST statement, it is necessary to write out the form of the model that you are fitting in PROC GENMOD, and then write out the hypothesis to be tested in terms of the model parameters and simplify. This process applies to CONTRAST and ESTIMATE statements in all modeling procedures and is discussed and illustrated in detail in Examples of Writing CONTRAST and ESTIMATE Statements. The likelihood ratio tests of the contrasts match the likelihood ratio chi-squares that are obtained from the previous pairwise runs of PROC FREQ (not shown above) and confirm the difference of Hutchinson's row from each of the others. Contrasts could similarly be constructed to compare the columns.
The final CONTRAST statement is a joint test of the three pairwise contrasts and demonstrates that the pairwise contrasts decompose the interaction. Notice that the test of this final contrast matches the TYPE3 test of the full interaction.
proc genmod data=melanoma;
class type site;
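model count=type|site / dist=poisson type3;
/* type*site parameters are ordered by type (Hutchinson's, Indeterminate, Nodular, Superficial) and, within type, by site (Extremities, Head, Trunk); trailing zero coefficients can be omitted */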
contrast "Hut vs Ind" type*site 1 -1 0 -1 1 0,
type*site 1 0 -1 -1 0 1;
contrast "Hut vs Nod" type*site 1 -1 0 0 0 0 -1 1 0,
type*site 1 0 -1 0 0 0 -1 0 1;
contrast "Hut vs Sup" type*site 1 -1 0 0 0 0 0 0 0 -1 1 0,
type*site 1 0 -1 0 0 0 0 0 0 -1 0 1;
contrast "Hut vs All" type*site 1 -1 0 -1 1 0,
type*site 1 0 -1 -1 0 1,
type*site 1 -1 0 0 0 0 -1 1 0,
type*site 1 0 -1 0 0 0 -1 0 1,
type*site 1 -1 0 0 0 0 0 0 0 -1 1 0,
type*site 1 0 -1 0 0 0 0 0 0 -1 0 1;
run;
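As an illustration of the derivation described above, the first row of the "Hut vs Ind" contrast can be worked out as follows. The notation is introduced here for illustration and is not from the original note: $\mu_{ij}$ is the expected count for type $i$ and site $j$, and the $\lambda$ terms are the intercept, main effect, and interaction parameters of the loglinear model.

```latex
% Saturated loglinear model fit by the MODEL statement
\[
\log \mu_{ij} = \lambda + \lambda^{T}_{i} + \lambda^{S}_{j} + \lambda^{TS}_{ij}
\]
% Hypothesis: the Extremities-versus-Head difference is the same
% for Hutchinson's (Hu) and Indeterminate (In)
\[
\left(\log \mu_{Hu,Ex} - \log \mu_{Hu,He}\right)
 - \left(\log \mu_{In,Ex} - \log \mu_{In,He}\right) = 0
\]
% The intercept and type effects cancel within each difference,
% and the site effects cancel between the two differences, leaving
\[
\lambda^{TS}_{Hu,Ex} - \lambda^{TS}_{Hu,He}
 - \lambda^{TS}_{In,Ex} + \lambda^{TS}_{In,He} = 0
\]
```

Reading the coefficients in parameter order (Hu,Ex), (Hu,He), (Hu,Tr), (In,Ex), (In,He), (In,Tr) gives 1 -1 0 -1 1 0, which is the first row of the "Hut vs Ind" contrast. Replacing Head with Trunk gives the second row, 1 0 -1 -1 0 1.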
| Product Family | Product | System | SAS Release Reported | SAS Release Fixed* |
| SAS System | SAS/STAT | All | n/a | |
| Type: | Usage Note |
| Priority: | low |
| Topic: | SAS Reference ==> Procedures ==> MULTTEST; SAS Reference ==> Procedures ==> CORRESP; Analytics ==> Psychometrics; SAS Reference ==> Procedures ==> FREQ; SAS Reference ==> Procedures ==> GENMOD; Analytics ==> Categorical Data Analysis; Analytics ==> Descriptive Statistics |
| Date Modified: | 2011-03-21 15:28:39 |
| Date Created: | 2002-12-16 10:56:38 |



