The following data, from Brown and Fears (1981), are the results of an 80-week carcinogenesis bioassay with female mice. Six tissue sites are examined at necropsy; 1 indicates the presence of a tumor and 0 the absence. A frequency variable Freq is included. A control and four different doses of a drug (in parts per million) make up the levels of the grouping variable Dose.
    data a;
       input Liver Lung Lymph Cardio Pitui Ovary Freq Dose$ @@;
       datalines;
    1 0 0 0 0 0 8 CTRL   0 1 0 0 0 0 7 CTRL   0 0 1 0 0 0 6 CTRL
    0 0 0 1 0 0 1 CTRL   0 0 0 0 0 1 2 CTRL   1 1 0 0 0 0 4 CTRL
    1 0 1 0 0 0 1 CTRL   1 0 0 0 0 1 1 CTRL   0 1 1 0 0 0 1 CTRL
    0 0 0 0 0 0 18 CTRL
    1 0 0 0 0 0 9 4PPM   0 1 0 0 0 0 4 4PPM   0 0 1 0 0 0 7 4PPM
    0 0 0 1 0 0 1 4PPM   0 0 0 0 1 0 2 4PPM   0 0 0 0 0 1 1 4PPM
    1 1 0 0 0 0 4 4PPM   1 0 1 0 0 0 3 4PPM   1 0 0 0 1 0 1 4PPM
    0 1 1 0 0 0 1 4PPM   0 1 0 1 0 0 1 4PPM   1 0 1 1 0 0 1 4PPM
    0 0 0 0 0 0 15 4PPM
    1 0 0 0 0 0 8 8PPM   0 1 0 0 0 0 3 8PPM   0 0 1 0 0 0 6 8PPM
    0 0 0 1 0 0 3 8PPM   1 1 0 0 0 0 1 8PPM   1 0 1 0 0 0 2 8PPM
    1 0 0 1 0 0 1 8PPM   1 0 0 0 1 0 1 8PPM   1 1 0 1 0 0 2 8PPM
    1 1 0 0 0 1 2 8PPM   0 0 0 0 0 0 19 8PPM
    1 0 0 0 0 0 4 16PPM   0 1 0 0 0 0 2 16PPM   0 0 1 0 0 0 9 16PPM
    0 0 0 0 1 0 1 16PPM   0 0 0 0 0 1 1 16PPM   1 1 0 0 0 0 4 16PPM
    1 0 1 0 0 0 1 16PPM   0 1 1 0 0 0 1 16PPM   0 1 0 1 0 0 1 16PPM
    0 1 0 0 0 1 1 16PPM   0 0 1 1 0 0 1 16PPM   0 0 1 0 1 0 1 16PPM
    1 1 1 0 0 0 2 16PPM   0 0 0 0 0 0 14 16PPM
    1 0 0 0 0 0 8 50PPM   0 1 0 0 0 0 4 50PPM   0 0 1 0 0 0 8 50PPM
    0 0 0 1 0 0 1 50PPM   0 0 0 0 0 1 4 50PPM   1 1 0 0 0 0 3 50PPM
    1 0 1 0 0 0 1 50PPM   0 1 1 0 0 0 1 50PPM   0 1 0 0 1 1 1 50PPM
    0 0 0 0 0 0 19 50PPM
    ;
    proc multtest data=a order=data notables out=p
                  permutation nsample=1000 seed=764511;
       test fisher(Liver Lung Lymph Cardio Pitui Ovary / lowertailed);
       class Dose;
       freq Freq;
    run;

    proc print data=p;
    run;
In the PROC MULTTEST statement, the ORDER=DATA option is required to keep the levels of Dose in the order in which they appear in the data set. Without this option, the levels are sorted by their formatted values, resulting in an alphabetic ordering. The NOTABLES option suppresses the display of summary statistics, and the OUT= option produces an output data set p containing the p-values. The PERMUTATION option specifies permutation resampling, the NSAMPLE=1000 option requests 1000 samples, and the SEED=764511 option provides a starting value for the random number generator. You should specify a seed if you need to duplicate resampling results.
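To see why ORDER=DATA matters for these particular level names, the following Python sketch (an illustration outside SAS, with hypothetical variable names) sorts the five Dose values as character strings. It assumes an ASCII collating sequence, in which digits sort before letters.

```python
# Dose levels in the order they appear in the data set (what ORDER=DATA keeps).
data_order = ["CTRL", "4PPM", "8PPM", "16PPM", "50PPM"]

# Assumption: character values are compared one character at a time using
# ASCII codes, so "16PPM" sorts before "4PPM" and every dose sorts before "CTRL".
alphabetic_order = sorted(data_order)

print(alphabetic_order)  # ['16PPM', '4PPM', '50PPM', '8PPM', 'CTRL']
```

Under such an ordering the control group would no longer be the first level of Dose, and the dose levels themselves would be out of numeric sequence.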
To test for higher rates of tumor occurrence in the treatment groups compared to the control group, the LOWERTAILED option is specified in the FISHER option of the TEST statement to produce a lower-tailed Fisher exact test for each of the six tissue sites. The Fisher test is appropriate for comparing a treatment and a control, but with six sites and four doses there are 24 tests, so multiple testing can be a problem. Brown and Fears (1981) use a multivariate permutation to evaluate the entire collection of tests. PROC MULTTEST adjusts the p-values by simulation.
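As a rough check on what a lower-tailed Fisher exact test computes, the following Python sketch evaluates a hypergeometric lower-tail probability for one control-versus-dose comparison. The function name `fisher_lower_tail` and its argument names are hypothetical, and the sketch assumes the lower tail is the probability that the control group has at most its observed number of tumors, given fixed margins. The counts come from the data above: pituitary tumors occur in 0 of 49 control animals and in 3 of 50 animals at 4PPM.

```python
from math import comb

def fisher_lower_tail(x, m, y, n):
    """P(X <= x), where X is hypergeometric: of the x + y tumors observed
    among m + n animals, X is the number falling in the control group (size m)."""
    total_tumors = x + y
    total_animals = m + n
    denom = comb(total_animals, m)
    # Sum hypergeometric probabilities for control tumor counts 0..x.
    numer = sum(
        comb(total_tumors, k) * comb(total_animals - total_tumors, m - k)
        for k in range(x + 1)
    )
    return numer / denom

# Pituitary site, CTRL (0 of 49) versus 4PPM (3 of 50).
p_pitui_4ppm = fisher_lower_tail(0, 49, 3, 50)
print(round(p_pitui_4ppm, 5))  # 0.12496
```

A small lower-tail probability means the control group has fewer tumors than would be expected by chance, which corresponds to a higher tumor rate in the treated group.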
The treatments make up the levels of the grouping variable Dose, listed in the CLASS statement. Since no CONTRAST statement is specified, PROC MULTTEST uses the default pairwise contrasts with the first level of Dose. The FREQ statement is used since these are summary data containing frequency counts of occurrences.
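The effect of treating Freq as a frequency count can be illustrated with a short Python sketch (not SAS; the names `ctrl_rows`, `n_animals`, and `tumor_counts` are hypothetical). Each summary row stands for Freq animals with the same tumor pattern, so per-site tumor counts are frequency-weighted sums. Only the control rows from the DATA step are used here.

```python
# Control-group rows copied from the DATA step:
# (Liver, Lung, Lymph, Cardio, Pitui, Ovary, Freq)
ctrl_rows = [
    (1, 0, 0, 0, 0, 0, 8),
    (0, 1, 0, 0, 0, 0, 7),
    (0, 0, 1, 0, 0, 0, 6),
    (0, 0, 0, 1, 0, 0, 1),
    (0, 0, 0, 0, 0, 1, 2),
    (1, 1, 0, 0, 0, 0, 4),
    (1, 0, 1, 0, 0, 0, 1),
    (1, 0, 0, 0, 0, 1, 1),
    (0, 1, 1, 0, 0, 0, 1),
    (0, 0, 0, 0, 0, 0, 18),
]
sites = ["Liver", "Lung", "Lymph", "Cardio", "Pitui", "Ovary"]

# Each row represents row[6] animals, so weight every indicator by Freq.
n_animals = sum(row[6] for row in ctrl_rows)
tumor_counts = {
    site: sum(row[i] * row[6] for row in ctrl_rows)
    for i, site in enumerate(sites)
}

print(n_animals)     # 49
print(tumor_counts)  # {'Liver': 14, 'Lung': 12, 'Lymph': 8, 'Cardio': 1, 'Pitui': 0, 'Ovary': 3}
```

These totals agree with the control-group counts and sample size that appear in the OUT= data set in Output 60.4.4.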
The results from this analysis are listed in Output 60.4.1 through Output 60.4.4. First, the PROC MULTTEST specifications are displayed in Output 60.4.1.
Model Information | |
---|---|
Test for discrete variables | Fisher |
Tails for discrete tests | Lower-tailed |
Strata weights | None |
P-value adjustment | Permutation |
Number of resamples | 1000 |
Seed | 764511 |
The default contrasts for the Fisher test are displayed in Output 60.4.2. Note that each dose is compared with the control.
Contrast Coefficients | | | | | |
---|---|---|---|---|---|
Contrast | Dose: CTRL | Dose: 4PPM | Dose: 8PPM | Dose: 16PPM | Dose: 50PPM |
CTRL vs. 4PPM | 1 | -1 | 0 | 0 | 0 |
CTRL vs. 8PPM | 1 | 0 | -1 | 0 | 0 |
CTRL vs. 16PPM | 1 | 0 | 0 | -1 | 0 |
CTRL vs. 50PPM | 1 | 0 | 0 | 0 | -1 |
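The pattern in this table can be generated mechanically. The following Python sketch (hypothetical function `default_contrasts`, not a SAS facility) builds one contrast per non-first level, with +1 on the first level and -1 on the level being compared, assuming the levels are in data order.

```python
levels = ["CTRL", "4PPM", "8PPM", "16PPM", "50PPM"]

def default_contrasts(levels):
    """Return {label: coefficients} comparing the first level with each
    of the remaining levels in turn."""
    first = levels[0]
    contrasts = {}
    for j in range(1, len(levels)):
        coeffs = [0] * len(levels)
        coeffs[0] = 1    # first level (the control here)
        coeffs[j] = -1   # level being compared with the first
        contrasts[f"{first} vs. {levels[j]}"] = coeffs
    return contrasts

contrasts = default_contrasts(levels)
for label, coeffs in contrasts.items():
    print(label, coeffs)
```

Each coefficient vector sums to zero, and with six tissue sites the four contrasts account for the 24 tests reported in the analysis.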
The "p-Values" table in Output 60.4.3 displays p-values for the Fisher exact tests and their permutation-based adjustments.
p-Values | |||
---|---|---|---|
Variable | Contrast | Raw | Permutation |
Liver | CTRL vs. 4PPM | 0.2828 | 0.9610 |
Liver | CTRL vs. 8PPM | 0.3069 | 0.9670 |
Liver | CTRL vs. 16PPM | 0.7102 | 1.0000 |
Liver | CTRL vs. 50PPM | 0.7718 | 1.0000 |
Lung | CTRL vs. 4PPM | 0.7818 | 1.0000 |
Lung | CTRL vs. 8PPM | 0.8858 | 1.0000 |
Lung | CTRL vs. 16PPM | 0.5469 | 0.9990 |
Lung | CTRL vs. 50PPM | 0.8498 | 1.0000 |
Lymph | CTRL vs. 4PPM | 0.2423 | 0.9280 |
Lymph | CTRL vs. 8PPM | 0.5898 | 1.0000 |
Lymph | CTRL vs. 16PPM | 0.0350 | 0.2680 |
Lymph | CTRL vs. 50PPM | 0.4161 | 0.9930 |
Cardio | CTRL vs. 4PPM | 0.3163 | 0.9710 |
Cardio | CTRL vs. 8PPM | 0.0525 | 0.3710 |
Cardio | CTRL vs. 16PPM | 0.4506 | 0.9960 |
Cardio | CTRL vs. 50PPM | 0.7576 | 1.0000 |
Pitui | CTRL vs. 4PPM | 0.1250 | 0.7540 |
Pitui | CTRL vs. 8PPM | 0.4948 | 0.9970 |
Pitui | CTRL vs. 16PPM | 0.2157 | 0.9080 |
Pitui | CTRL vs. 50PPM | 0.5051 | 0.9970 |
Ovary | CTRL vs. 4PPM | 0.9437 | 1.0000 |
Ovary | CTRL vs. 8PPM | 0.8126 | 1.0000 |
Ovary | CTRL vs. 16PPM | 0.7760 | 1.0000 |
Ovary | CTRL vs. 50PPM | 0.3689 | 0.9930 |
As noted by Brown and Fears, only one of the 24 tests is significant at the 5% level (Lymph, CTRL vs. 16PPM). Brown and Fears report a 12% chance of observing at least one significant raw p-value for 16PPM and a 9% chance of observing at least one significant raw p-value for Lymph (both at the 5% level). The adjusted p-values have a much lower chance of producing false significances. In this example, none of the adjusted p-values is close to significant; the smallest is 0.2680 (Lymph, CTRL vs. 16PPM).
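The counts in this paragraph can be confirmed directly from Output 60.4.3. The Python sketch below (hypothetical variable names) tallies raw and adjusted p-values below 0.05. For context only, it also computes a Bonferroni bound for the smallest raw p-value; Bonferroni is not part of the analysis shown here and is included merely as a familiar point of comparison.

```python
# (raw, permutation-adjusted) p-values copied from Output 60.4.3, in table order.
p_values = [
    (0.2828, 0.9610), (0.3069, 0.9670), (0.7102, 1.0000), (0.7718, 1.0000),  # Liver
    (0.7818, 1.0000), (0.8858, 1.0000), (0.5469, 0.9990), (0.8498, 1.0000),  # Lung
    (0.2423, 0.9280), (0.5898, 1.0000), (0.0350, 0.2680), (0.4161, 0.9930),  # Lymph
    (0.3163, 0.9710), (0.0525, 0.3710), (0.4506, 0.9960), (0.7576, 1.0000),  # Cardio
    (0.1250, 0.7540), (0.4948, 0.9970), (0.2157, 0.9080), (0.5051, 0.9970),  # Pitui
    (0.9437, 1.0000), (0.8126, 1.0000), (0.7760, 1.0000), (0.3689, 0.9930),  # Ovary
]
alpha = 0.05

n_raw_sig = sum(raw < alpha for raw, _ in p_values)
n_adj_sig = sum(adj < alpha for _, adj in p_values)

# Tuples compare on the raw p-value first, so min() picks the smallest raw test.
smallest_raw, its_adjustment = min(p_values)

# Comparison only (assumption: plain Bonferroni, number of tests times raw p, capped at 1).
bonferroni = min(1.0, len(p_values) * smallest_raw)

print(n_raw_sig, n_adj_sig)          # 1 0
print(smallest_raw, its_adjustment)  # 0.035 0.268
```

For the single raw-significant test, the Bonferroni bound is about 0.84, whereas the permutation adjustment reported in the output is 0.2680.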
The OUT= data set is displayed in Output 60.4.4.
Obs | _test_ | _var_ | _contrast_ | _xval_ | _mval_ | _yval_ | _nval_ | raw_p | perm_p | sim_se |
---|---|---|---|---|---|---|---|---|---|---|
1 | FISHER | Liver | CTRL vs. 4PPM | 14 | 49 | 18 | 50 | 0.28282 | 0.961 | 0.006122 |
2 | FISHER | Liver | CTRL vs. 8PPM | 14 | 49 | 17 | 48 | 0.30688 | 0.967 | 0.005649 |
3 | FISHER | Liver | CTRL vs. 16PPM | 14 | 49 | 11 | 43 | 0.71022 | 1.000 | 0.000000 |
4 | FISHER | Liver | CTRL vs. 50PPM | 14 | 49 | 12 | 50 | 0.77175 | 1.000 | 0.000000 |
5 | FISHER | Lung | CTRL vs. 4PPM | 12 | 49 | 10 | 50 | 0.78180 | 1.000 | 0.000000 |
6 | FISHER | Lung | CTRL vs. 8PPM | 12 | 49 | 8 | 48 | 0.88581 | 1.000 | 0.000000 |
7 | FISHER | Lung | CTRL vs. 16PPM | 12 | 49 | 11 | 43 | 0.54685 | 0.999 | 0.000999 |
8 | FISHER | Lung | CTRL vs. 50PPM | 12 | 49 | 9 | 50 | 0.84978 | 1.000 | 0.000000 |
9 | FISHER | Lymph | CTRL vs. 4PPM | 8 | 49 | 12 | 50 | 0.24228 | 0.928 | 0.008174 |
10 | FISHER | Lymph | CTRL vs. 8PPM | 8 | 49 | 8 | 48 | 0.58977 | 1.000 | 0.000000 |
11 | FISHER | Lymph | CTRL vs. 16PPM | 8 | 49 | 15 | 43 | 0.03498 | 0.268 | 0.014006 |
12 | FISHER | Lymph | CTRL vs. 50PPM | 8 | 49 | 10 | 50 | 0.41607 | 0.993 | 0.002636 |
13 | FISHER | Cardio | CTRL vs. 4PPM | 1 | 49 | 3 | 50 | 0.31631 | 0.971 | 0.005307 |
14 | FISHER | Cardio | CTRL vs. 8PPM | 1 | 49 | 6 | 48 | 0.05254 | 0.371 | 0.015276 |
15 | FISHER | Cardio | CTRL vs. 16PPM | 1 | 49 | 2 | 43 | 0.45061 | 0.996 | 0.001996 |
16 | FISHER | Cardio | CTRL vs. 50PPM | 1 | 49 | 1 | 50 | 0.75758 | 1.000 | 0.000000 |
17 | FISHER | Pitui | CTRL vs. 4PPM | 0 | 49 | 3 | 50 | 0.12496 | 0.754 | 0.013619 |
18 | FISHER | Pitui | CTRL vs. 8PPM | 0 | 49 | 1 | 48 | 0.49485 | 0.997 | 0.001729 |
19 | FISHER | Pitui | CTRL vs. 16PPM | 0 | 49 | 2 | 43 | 0.21572 | 0.908 | 0.009140 |
20 | FISHER | Pitui | CTRL vs. 50PPM | 0 | 49 | 1 | 50 | 0.50505 | 0.997 | 0.001729 |
21 | FISHER | Ovary | CTRL vs. 4PPM | 3 | 49 | 1 | 50 | 0.94372 | 1.000 | 0.000000 |
22 | FISHER | Ovary | CTRL vs. 8PPM | 3 | 49 | 2 | 48 | 0.81260 | 1.000 | 0.000000 |
23 | FISHER | Ovary | CTRL vs. 16PPM | 3 | 49 | 2 | 43 | 0.77596 | 1.000 | 0.000000 |
24 | FISHER | Ovary | CTRL vs. 50PPM | 3 | 49 | 5 | 50 | 0.36889 | 0.993 | 0.002636 |
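The _test_, _var_, and _contrast_ variables provide the TEST name, TEST variable, and CONTRAST label, respectively. The _xval_, _mval_, _yval_, and _nval_ variables contain the components used to compute the Fisher exact tests from the hypergeometric distribution. In this output, _xval_ and _mval_ are the number of animals with a tumor and the sample size in the control group, and _yval_ and _nval_ are the corresponding values for the dose group in the contrast (for example, 14 of 49 control animals and 18 of 50 animals at 4PPM have liver tumors). The raw_p variable contains the p-values from the Fisher exact tests, and the perm_p variable contains their permutation-based adjustments. The variable sim_se is the simulation standard error from the permutation resampling.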
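The sim_se values shown in Output 60.4.4 are numerically consistent with the usual binomial standard error of a proportion estimated from the resamples. The Python sketch below (hypothetical function `simulation_se`) is an inference from the displayed numbers, not a description of PROC MULTTEST internals.

```python
from math import sqrt

def simulation_se(p_adj, nsample):
    """Binomial standard error of an adjusted p-value estimated as a
    proportion over nsample resamples: sqrt(p * (1 - p) / nsample)."""
    return sqrt(p_adj * (1.0 - p_adj) / nsample)

# Adjusted p-values from Output 60.4.4, with NSAMPLE=1000.
for perm_p in (0.961, 0.268, 1.000):
    print(perm_p, round(simulation_se(perm_p, 1000), 6))
```

An adjusted p-value of exactly 1 gives a standard error of 0, which matches the rows where perm_p is 1.000. Increasing NSAMPLE shrinks the standard error in proportion to the square root of the number of resamples.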