The PSMOOTH Procedure

Example

Suppose you want to test the 16 markers represented in the following data for association with a disease by using the genotype case-control and trend tests in PROC CASECONTROL. Because testing many markers raises a multiple hypothesis testing issue, you also want to run PROC PSMOOTH on the output data set from PROC CASECONTROL in order to reduce the number of false positives that would be found by using the individual $p$-values from the marker-trait association tests.

data in;
   input affected (m1-m16) ($);
   datalines;
1 1/2 2/2 2/2 2/2 1/1 2/2 1/2 1/2 1/1 1/2 1/2 2/2 2/2 2/2 2/2 1/2
1 1/2 1/1 1/2 1/2 1/1 1/1 1/2 1/1 1/2 1/2 1/1 2/2 1/1 1/2 1/1 1/2
1 1/1 2/2 1/2 1/2 1/1 1/2 1/1 1/2 1/2 2/2 2/2 1/2 1/2 1/2 2/2 1/2
1 1/1 1/2 2/2 1/2 1/2 1/1 1/2 1/2 1/2 1/1 1/1 1/2 2/2 1/2 1/1 1/1
1 1/2 1/1 1/1 1/2 2/2 1/1 1/1 1/2 1/1 2/2 1/2 2/2 2/2 2/2 1/2 1/1
1 1/2 1/1 1/2 2/2 2/2 1/1 1/1 1/2 1/2 1/2 2/2 2/2 1/1 2/2 2/2 1/1

   ... more lines ...   

0 2/2 1/1 1/2 1/1 1/2 1/2 1/2 2/2 1/1 1/2 1/1 1/1 1/1 2/2 1/1 1/2
;

Note that the columns m1-m16 each contain the genotype at one marker, so the GENOCOL option must be used in PROC CASECONTROL as follows to read in the data correctly.

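/* Genotype and trend tests; GENOCOL reads one genotype column per marker */
proc casecontrol data=in outstat=cc_tests genotype trend genocol;
   trait affected;
   var m1-m16;
run;

/* Smooth both sets of p-values with a bandwidth of 2, then adjust them */
proc psmooth data=cc_tests simes fisher tpm bw=2 adjust=sidak
             out=adj_p;
   var ProbGenotype ProbTrend;
   id Locus;
run;

/* Display the original and modified p-values */
proc print data=adj_p heading=h;
run;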

This code modifies the $p$-values contained in the output data set from PROC CASECONTROL in two steps. It first smooths the $p$-values by using Simes’ method, Fisher’s method, and the truncated product method (TPM), each with a bandwidth of 2. It then applies Šidák’s multiple testing adjustment to the smoothed $p$-values.
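
To make the two steps concrete, the following sketch writes out the standard forms of Simes’ method, Fisher’s method, and Šidák’s adjustment. The symbols $L$ (the number of $p$-values in the smoothing window), $p_{(j)}$ (the $j$th smallest of them), and $R$ (the number of tests adjusted for) are notation introduced here. Taking $R = 16$, the number of markers, is an assumption that agrees with the output in Figure 9.1; it is not stated in this example.

$$
p_{\text{Simes}} = \min_{1 \le j \le L} \frac{L \, p_{(j)}}{j},
\qquad
-2 \sum_{i=1}^{L} \ln p_i \sim \chi^2_{2L} \quad \text{(Fisher)},
\qquad
p_{\text{adj}} = 1 - (1 - p)^{R} \quad \text{(\v{S}id\'ak)}
$$

For the genotype test at m12, a bandwidth of 2 gives the window m10 through m14, whose raw $p$-values in Figure 9.1 are 0.59452, 0.44085, 0.00076, 0.00911, and 0.00160, so $L = 5$:

$$
p_{\text{Simes}} = \min\left( \frac{5(0.00076)}{1}, \frac{5(0.00160)}{2}, \frac{5(0.00911)}{3}, \frac{5(0.44085)}{4}, \frac{5(0.59452)}{5} \right) = 0.0038,
\qquad
1 - (1 - 0.0038)^{16} \approx 0.0591
$$

This is close to the value 0.05946 shown as ProbGenotype_S2 for m12. The small gap is plausibly due to starting from raw $p$-values rounded to five decimal places.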

Figure 9.1: PROC PSMOOTH Output Data Set

Obs Locus ProbGenotype ProbGenotype_S2 ProbGenotype_F2 ProbGenotype_T2 ProbTrend ProbTrend_S2 ProbTrend_F2 ProbTrend_T2
1 m1 0.61481 0.84871 0.96719 0.83858 0.32699 1.00000 1.00000 0.91474
2 m2 0.03711 0.92355 0.97753 0.91260 0.84733 1.00000 1.00000 0.96248
3 m3 0.57096 0.96252 0.98449 0.95280 0.57628 1.00000 0.99986 0.98348
4 m4 0.34059 0.96252 0.80318 0.95280 0.23932 1.00000 1.00000 0.98348
5 m5 0.35600 0.99999 0.99858 0.98348 0.16135 0.99998 0.99979 0.98348
6 m6 0.12375 0.99999 0.99861 0.98348 0.85742 0.99981 0.99807 0.98348
7 m7 0.41529 1.00000 0.99962 0.98348 0.29694 0.99994 0.99961 0.98348
8 m8 0.57360 1.00000 0.99997 0.98348 0.33141 0.99994 0.99999 0.98348
9 m9 0.47332 1.00000 1.00000 0.98348 0.36231 0.99925 0.99902 0.98348
10 m10 0.59452 0.05946 0.41423 0.25944 0.31242 0.01520 0.06303 0.11769
11 m11 0.44085 0.05946 0.02931 0.01550 0.35299 0.01520 0.01179 0.01454
12 m12 0.00076 0.05946 0.00036 0.00017 0.00019 0.01345 0.00005 0.00005
13 m13 0.00911 0.05946 0.00052 0.00017 0.03301 0.01345 0.00011 0.00005
14 m14 0.00160 0.05946 0.00008 0.00002 0.00034 0.01345 0.00001 0.00000
15 m15 0.94287 0.09744 0.00570 0.00138 0.86176 0.02144 0.00153 0.00044
16 m16 0.04264 0.07395 0.05720 0.01902 0.01207 0.01612 0.00519 0.00223


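Figure 9.1 displays the original and modified $p$-values. The suffixes _S2, _F2, and _T2 mark the values smoothed with Simes’ method, Fisher’s method, and the TPM, respectively, at a bandwidth of 2.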
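
One possible follow-up is to list only the markers whose adjusted $p$-values fall below a chosen threshold. The following is a minimal sketch, not part of the original example: the 0.05 threshold, the data set name signif, and the choice of the ProbTrend_S2 column are all illustrative assumptions.

```sas
/* Hypothetical follow-up: keep markers whose Simes-smoothed,       */
/* Sidak-adjusted trend p-value is below an assumed 0.05 threshold. */
data signif;
   set adj_p;
   where ProbTrend_S2 < 0.05;
run;

/* Show the raw and adjusted trend p-values for the retained markers */
proc print data=signif;
   var Locus ProbTrend ProbTrend_S2;
run;
```

With the values shown in Figure 9.1, this filter would keep markers m10 through m16.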