Example 60.4 Fisher Test with Permutation Resampling

The following data, from Brown and Fears (1981), are the results of an 80-week carcinogenesis bioassay with female mice. Six tissue sites are examined at necropsy; 1 indicates the presence of a tumor and 0 the absence. A frequency variable Freq is included. A control and four different doses of a drug (in parts per milliliter) make up the levels of the grouping variable Dose.

data a;
   input Liver Lung Lymph Cardio Pitui Ovary Freq Dose$ @@;
   datalines;
1 0 0 0 0 0 8  CTRL   0 1 0 0 0 0 7  CTRL   0 0 1 0 0 0 6  CTRL
0 0 0 1 0 0 1  CTRL   0 0 0 0 0 1 2  CTRL   1 1 0 0 0 0 4  CTRL
1 0 1 0 0 0 1  CTRL   1 0 0 0 0 1 1  CTRL   0 1 1 0 0 0 1  CTRL
0 0 0 0 0 0 18 CTRL
1 0 0 0 0 0 9  4PPM   0 1 0 0 0 0 4  4PPM   0 0 1 0 0 0 7  4PPM
0 0 0 1 0 0 1  4PPM   0 0 0 0 1 0 2  4PPM   0 0 0 0 0 1 1  4PPM
1 1 0 0 0 0 4  4PPM   1 0 1 0 0 0 3  4PPM   1 0 0 0 1 0 1  4PPM
0 1 1 0 0 0 1  4PPM   0 1 0 1 0 0 1  4PPM   1 0 1 1 0 0 1  4PPM
0 0 0 0 0 0 15 4PPM
1 0 0 0 0 0 8  8PPM   0 1 0 0 0 0 3  8PPM   0 0 1 0 0 0 6  8PPM
0 0 0 1 0 0 3  8PPM   1 1 0 0 0 0 1  8PPM   1 0 1 0 0 0 2  8PPM
1 0 0 1 0 0 1  8PPM   1 0 0 0 1 0 1  8PPM   1 1 0 1 0 0 2  8PPM
1 1 0 0 0 1 2  8PPM   0 0 0 0 0 0 19 8PPM
1 0 0 0 0 0 4  16PPM  0 1 0 0 0 0 2  16PPM  0 0 1 0 0 0 9  16PPM
0 0 0 0 1 0 1  16PPM  0 0 0 0 0 1 1  16PPM  1 1 0 0 0 0 4  16PPM
1 0 1 0 0 0 1  16PPM  0 1 1 0 0 0 1  16PPM  0 1 0 1 0 0 1  16PPM
0 1 0 0 0 1 1  16PPM  0 0 1 1 0 0 1  16PPM  0 0 1 0 1 0 1  16PPM
1 1 1 0 0 0 2  16PPM  0 0 0 0 0 0 14 16PPM  
1 0 0 0 0 0 8  50PPM  0 1 0 0 0 0 4  50PPM  0 0 1 0 0 0 8  50PPM
0 0 0 1 0 0 1  50PPM  0 0 0 0 0 1 4  50PPM  1 1 0 0 0 0 3  50PPM
1 0 1 0 0 0 1  50PPM  0 1 1 0 0 0 1  50PPM  0 1 0 0 1 1 1  50PPM
0 0 0 0 0 0 19 50PPM
;
proc multtest data=a order=data notables out=p 
              permutation nsample=1000 seed=764511;
   test fisher(Liver Lung Lymph Cardio Pitui Ovary /
               lowertailed);
   class Dose;
   freq Freq;
run;
proc print data=p;
run;

In the PROC MULTTEST statement, the ORDER=DATA option is required to keep the levels of Dose in the order in which they appear in the data set. Without this option, the levels are sorted by their formatted values, resulting in an alphabetic ordering. The NOTABLES option suppresses the display of summary statistics, and the OUT= option produces an output data set p containing the p-values. The PERMUTATION option specifies permutation resampling, the NSAMPLE=1000 option requests 1000 samples, and the SEED=764511 option provides a starting value for the random number generator. Specify a seed whenever you need to reproduce resampling results.
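The effect of omitting ORDER=DATA can be illustrated outside SAS. The following Python sketch assumes that a plain ASCII string sort approximates the ordering of formatted values; the actual collating sequence depends on the host system, so treat the result as indicative only.

```python
# Levels of Dose in the order they appear in the DATA step.
data_order = ["CTRL", "4PPM", "8PPM", "16PPM", "50PPM"]

# Assumption: an ASCII string sort stands in for sorting by formatted value.
# Strings compare character by character, and digits sort before letters.
sorted_order = sorted(data_order)

print("Data order:  ", data_order)
print("Sorted order:", sorted_order)
print("First level in data order:  ", data_order[0])
print("First level in sorted order:", sorted_order[0])
```

Under this assumption the control is no longer the first level, and the doses are not in increasing order, so comparisons against the first level would no longer be comparisons against the control.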

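To test for higher rates of tumor occurrence in the treatment groups compared to the control group, the LOWERTAILED option is specified in the FISHER option of the TEST statement. This produces a lower-tailed Fisher exact test for each of the six tissue sites. The lower tail is the relevant one because each default contrast is coded as control minus treatment (see Output 60.4.2), so a higher tumor rate in a treatment group corresponds to a small tumor count in the control group. The Fisher test is appropriate for comparing a treatment and a control, but with 24 tests (six sites by four contrasts) multiple testing can be a problem. Brown and Fears (1981) use a multivariate permutation approach to evaluate the entire collection of tests. PROC MULTTEST adjusts the p-values by simulation.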
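The idea of adjusting p-values by permutation can be sketched in Python. This is not the PROC MULTTEST implementation. It assumes a single-step minimum-p adjustment: whole tumor vectors are shuffled across animals, all 24 lower-tailed Fisher p-values are recomputed, and each adjusted p-value is the fraction of resamples whose smallest p-value is at most the observed raw p-value. All function and variable names are hypothetical; the data are transcribed from the DATA step above as pattern:frequency pairs.

```python
import random
from functools import lru_cache
from math import comb

SITES = ["Liver", "Lung", "Lymph", "Cardio", "Pitui", "Ovary"]
# "pattern:freq" pairs from the DATA step; pattern digits follow SITES.
DATA = {
    "CTRL": "100000:8 010000:7 001000:6 000100:1 000001:2 110000:4 "
            "101000:1 100001:1 011000:1 000000:18",
    "4PPM": "100000:9 010000:4 001000:7 000100:1 000010:2 000001:1 110000:4 "
            "101000:3 100010:1 011000:1 010100:1 101100:1 000000:15",
    "8PPM": "100000:8 010000:3 001000:6 000100:3 110000:1 101000:2 "
            "100100:1 100010:1 110100:2 110001:2 000000:19",
    "16PPM": "100000:4 010000:2 001000:9 000010:1 000001:1 110000:4 101000:1 "
             "011000:1 010100:1 010001:1 001100:1 001010:1 111000:2 000000:14",
    "50PPM": "100000:8 010000:4 001000:8 000100:1 000001:4 110000:3 "
             "101000:1 011000:1 010011:1 000000:19",
}
DOSES = list(DATA)  # data order, control first


def expand(data):
    """Return one (dose, tumor vector) pair per animal."""
    animals = []
    for dose, spec in data.items():
        for item in spec.split():
            pattern, freq = item.split(":")
            animals += [(dose, tuple(int(c) for c in pattern))] * int(freq)
    return animals


@lru_cache(maxsize=None)
def lower_tail_p(x, m, y, n):
    """Hypergeometric P(X <= x): x of m controls and y of n treated have tumors."""
    total = x + y
    ways = sum(comb(m, k) * comb(n, total - k)
               for k in range(max(0, total - n), x + 1))
    return ways / comb(m + n, total)


def raw_p_values(labels, vectors):
    """Lower-tailed Fisher p-value for every site and control-vs-dose contrast."""
    counts = {d: [0] * len(SITES) for d in DOSES}
    sizes = {d: 0 for d in DOSES}
    for dose, vec in zip(labels, vectors):
        sizes[dose] += 1
        for j, v in enumerate(vec):
            counts[dose][j] += v
    ctrl = DOSES[0]
    return {(site, dose): lower_tail_p(counts[ctrl][j], sizes[ctrl],
                                       counts[dose][j], sizes[dose])
            for j, site in enumerate(SITES) for dose in DOSES[1:]}


def permutation_adjust(nsample=1000, seed=764511):
    """Single-step min-p adjustment (an assumed scheme, see the lead-in)."""
    animals = expand(DATA)
    labels = [d for d, _ in animals]
    vectors = [v for _, v in animals]
    raw = raw_p_values(labels, vectors)
    rng = random.Random(seed)
    hits = dict.fromkeys(raw, 0)
    for _ in range(nsample):
        rng.shuffle(vectors)  # reassign whole tumor vectors to animals
        min_p = min(raw_p_values(labels, vectors).values())
        for key, p in raw.items():
            if min_p <= p:
                hits[key] += 1
    return raw, {key: h / nsample for key, h in hits.items()}


if __name__ == "__main__":
    raw, adjusted = permutation_adjust()
    for (site, dose), p in raw.items():
        print(f"{site:<8}CTRL vs. {dose:<7}{p:.4f}  {adjusted[(site, dose)]:.4f}")
```

Shuffling whole vectors rather than individual sites keeps the correlation among tissue sites intact in every resample. Because Python's random number generator differs from the one in SAS, the adjusted values from this sketch are not expected to match Output 60.4.3 digit for digit, even with the same seed.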

The treatments make up the levels of the grouping variable Dose, listed in the CLASS statement. Because no CONTRAST statement is specified, PROC MULTTEST uses the default pairwise contrasts with the first level of Dose. The FREQ statement is needed because these are summary data: each record describes one tumor pattern, and Freq gives the number of animals that share it.
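The structure of the default pairwise contrasts can be sketched in a few lines of Python. The function name `default_contrasts` is hypothetical, and the sketch only mirrors the coefficient pattern described here (first level versus each later level); it is not how PROC MULTTEST builds them internally.

```python
# Levels of Dose in data order, with the control first.
levels = ["CTRL", "4PPM", "8PPM", "16PPM", "50PPM"]


def default_contrasts(levels):
    """Pairwise contrasts of the first level against each later level."""
    first = levels[0]
    contrasts = {}
    for i, other in enumerate(levels[1:], start=1):
        coeffs = [0] * len(levels)
        coeffs[0] = 1    # first level (the control)
        coeffs[i] = -1   # level being compared with the control
        contrasts[f"{first} vs. {other}"] = coeffs
    return contrasts


for label, coeffs in default_contrasts(levels).items():
    print(f"{label:<16}", coeffs)
```

With five levels this yields four contrasts, each with a coefficient of 1 for the control and -1 for one dose.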

The results from this analysis are listed in Output 60.4.1 through Output 60.4.4. First, the PROC MULTTEST specifications are displayed in Output 60.4.1.

Output 60.4.1 Fisher Test with Permutation Resampling
The Multtest Procedure

Model Information
Test for discrete variables   Fisher
Tails for discrete tests      Lower-tailed
Strata weights                None
P-value adjustment            Permutation
Number of resamples           1000
Seed                          764511

The default contrasts for the Fisher test are displayed in Output 60.4.2. Note that each dose is compared with the control.

Output 60.4.2 Default Contrast Coefficients
Contrast Coefficients
                  Dose
Contrast          CTRL  4PPM  8PPM  16PPM  50PPM
CTRL vs. 4PPM     1     -1    0     0      0
CTRL vs. 8PPM     1     0     -1    0      0
CTRL vs. 16PPM    1     0     0     -1     0
CTRL vs. 50PPM    1     0     0     0      -1

The "p-Values" table in Output 60.4.3 displays p-values for the Fisher exact tests and their permutation-based adjustments.

Output 60.4.3 p-Values
p-Values
Variable  Contrast          Raw       Permutation
Liver     CTRL vs. 4PPM     0.2828    0.9610
Liver     CTRL vs. 8PPM     0.3069    0.9670
Liver     CTRL vs. 16PPM    0.7102    1.0000
Liver     CTRL vs. 50PPM    0.7718    1.0000
Lung      CTRL vs. 4PPM     0.7818    1.0000
Lung      CTRL vs. 8PPM     0.8858    1.0000
Lung      CTRL vs. 16PPM    0.5469    0.9990
Lung      CTRL vs. 50PPM    0.8498    1.0000
Lymph     CTRL vs. 4PPM     0.2423    0.9280
Lymph     CTRL vs. 8PPM     0.5898    1.0000
Lymph     CTRL vs. 16PPM    0.0350    0.2680
Lymph     CTRL vs. 50PPM    0.4161    0.9930
Cardio    CTRL vs. 4PPM     0.3163    0.9710
Cardio    CTRL vs. 8PPM     0.0525    0.3710
Cardio    CTRL vs. 16PPM    0.4506    0.9960
Cardio    CTRL vs. 50PPM    0.7576    1.0000
Pitui     CTRL vs. 4PPM     0.1250    0.7540
Pitui     CTRL vs. 8PPM     0.4948    0.9970
Pitui     CTRL vs. 16PPM    0.2157    0.9080
Pitui     CTRL vs. 50PPM    0.5051    0.9970
Ovary     CTRL vs. 4PPM     0.9437    1.0000
Ovary     CTRL vs. 8PPM     0.8126    1.0000
Ovary     CTRL vs. 16PPM    0.7760    1.0000
Ovary     CTRL vs. 50PPM    0.3689    0.9930

As noted by Brown and Fears, only one of the 24 tests is significant at the 5% level (Lymph, CTRL vs. 16PPM, with a raw p-value of 0.0350). Brown and Fears report a 12% chance of observing at least one significant raw p-value for 16PPM and a 9% chance of observing at least one significant raw p-value for Lymph (both at the 5% level). Adjusted p-values are much less prone to such false significances. In this example, none of the adjusted p-values is close to significance; the smallest is 0.2680, for the same Lymph contrast.

The OUT= data set is displayed in Output 60.4.4.

Output 60.4.4 OUT= Data Set
Obs  _test_  _var_   _contrast_      _xval_  _mval_  _yval_  _nval_  raw_p    perm_p  sim_se
1    FISHER  Liver   CTRL vs. 4PPM   14      49      18      50      0.28282  0.961   0.006122
2    FISHER  Liver   CTRL vs. 8PPM   14      49      17      48      0.30688  0.967   0.005649
3    FISHER  Liver   CTRL vs. 16PPM  14      49      11      43      0.71022  1.000   0.000000
4    FISHER  Liver   CTRL vs. 50PPM  14      49      12      50      0.77175  1.000   0.000000
5    FISHER  Lung    CTRL vs. 4PPM   12      49      10      50      0.78180  1.000   0.000000
6    FISHER  Lung    CTRL vs. 8PPM   12      49      8       48      0.88581  1.000   0.000000
7    FISHER  Lung    CTRL vs. 16PPM  12      49      11      43      0.54685  0.999   0.000999
8    FISHER  Lung    CTRL vs. 50PPM  12      49      9       50      0.84978  1.000   0.000000
9    FISHER  Lymph   CTRL vs. 4PPM   8       49      12      50      0.24228  0.928   0.008174
10   FISHER  Lymph   CTRL vs. 8PPM   8       49      8       48      0.58977  1.000   0.000000
11   FISHER  Lymph   CTRL vs. 16PPM  8       49      15      43      0.03498  0.268   0.014006
12   FISHER  Lymph   CTRL vs. 50PPM  8       49      10      50      0.41607  0.993   0.002636
13   FISHER  Cardio  CTRL vs. 4PPM   1       49      3       50      0.31631  0.971   0.005307
14   FISHER  Cardio  CTRL vs. 8PPM   1       49      6       48      0.05254  0.371   0.015276
15   FISHER  Cardio  CTRL vs. 16PPM  1       49      2       43      0.45061  0.996   0.001996
16   FISHER  Cardio  CTRL vs. 50PPM  1       49      1       50      0.75758  1.000   0.000000
17   FISHER  Pitui   CTRL vs. 4PPM   0       49      3       50      0.12496  0.754   0.013619
18   FISHER  Pitui   CTRL vs. 8PPM   0       49      1       48      0.49485  0.997   0.001729
19   FISHER  Pitui   CTRL vs. 16PPM  0       49      2       43      0.21572  0.908   0.009140
20   FISHER  Pitui   CTRL vs. 50PPM  0       49      1       50      0.50505  0.997   0.001729
21   FISHER  Ovary   CTRL vs. 4PPM   3       49      1       50      0.94372  1.000   0.000000
22   FISHER  Ovary   CTRL vs. 8PPM   3       49      2       48      0.81260  1.000   0.000000
23   FISHER  Ovary   CTRL vs. 16PPM  3       49      2       43      0.77596  1.000   0.000000
24   FISHER  Ovary   CTRL vs. 50PPM  3       49      5       50      0.36889  0.993   0.002636

The _test_, _var_, and _contrast_ variables provide the TEST name, TEST variable, and CONTRAST label, respectively. The _xval_, _mval_, _yval_, and _nval_ variables contain the components used to compute the Fisher exact tests from the hypergeometric distribution. Checked against the input data, _xval_ and _mval_ are the number of animals with a tumor and the total number of animals in the control group, and _yval_ and _nval_ are the corresponding counts for the treatment group. The raw_p variable contains the p-values from the Fisher exact tests, and the perm_p variable contains their permutation-based adjustments. The variable sim_se is the simulation standard error from the permutation resampling.
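As a cross-check on how these columns relate, the following Python sketch recomputes raw_p and sim_se for three rows of the OUT= data set. It rests on two assumptions, both chosen because they agree with the displayed values: the lower-tailed p-value is the hypergeometric probability of seeing _xval_ or fewer tumors in the control group, and sim_se is the binomial standard error sqrt(p(1 - p)/NSAMPLE) of perm_p. The function names are hypothetical.

```python
from math import comb, sqrt


def lower_tail_fisher(xval, mval, yval, nval):
    """P(X <= xval), where X counts the tumors that fall in the control group
    when xval + yval tumors are spread at random over mval + nval animals."""
    total = xval + yval
    lowest = max(0, total - nval)  # smallest feasible control count
    ways = sum(comb(mval, k) * comb(nval, total - k)
               for k in range(lowest, xval + 1))
    return ways / comb(mval + nval, total)


def simulation_se(perm_p, nsample=1000):
    """Binomial standard error of a proportion estimated from nsample resamples."""
    return sqrt(perm_p * (1.0 - perm_p) / nsample)


# (variable, contrast, _xval_, _mval_, _yval_, _nval_, perm_p) from the OUT= data set
rows = [
    ("Lymph", "CTRL vs. 16PPM", 8, 49, 15, 43, 0.268),
    ("Cardio", "CTRL vs. 8PPM", 1, 49, 6, 48, 0.371),
    ("Pitui", "CTRL vs. 4PPM", 0, 49, 3, 50, 0.754),
]
for var, contrast, x, m, y, n, perm_p in rows:
    print(f"{var:<8}{contrast:<16}"
          f"raw_p={lower_tail_fisher(x, m, y, n):.5f}  "
          f"sim_se={simulation_se(perm_p):.6f}")
```

The simplest row to verify by hand is Pitui, CTRL vs. 4PPM: with no tumors in the control group, the p-value reduces to C(50,3)/C(99,3), which is about 0.12496. A perm_p of 1.000 gives a simulation standard error of exactly zero, which explains the 0.000000 entries in the sim_se column.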