PROC CASECONTROL: Example

The CASECONTROL Procedure

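The following sample SNP data can be used to perform the three case-control tests with PROC CASECONTROL. The variable affected holds the binary trait (A for affected individuals, N otherwise), and each consecutive pair of the variables m1-m16 holds the two alleles at one of eight markers: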

   data cc;
      input affected $ m1-m16;
      datalines;
    N  1 1 2 2 2 2 2 1 2 1 2 2 1 1 2 2 
    N  1 1 1 1 2 2 1 1 2 1 2 1 1 1 1 1 
    N  2 1 1 1 2 1 1 1 2 2 1 1 1 1 1 1 
    N  2 2 2 1 2 2 1 1 2 2 2 1 1 1 2 2 
    N  1 1 1 1 2 2 2 1 1 1 1 1 2 1 . . 
    N  2 1 1 1 2 1 1 1 2 1 2 1 1 1 2 1 
    N  1 1 1 1 2 2 1 1 2 2 2 2 2 1 2 2 
    N  2 2 1 1 2 1 2 1 2 2 2 1 1 1 2 1 
    N  2 1 1 1 2 2 2 1 2 1 . . 1 1 2 1 
    N  2 1 1 1 2 1 1 1 2 2 1 1 1 1 1 1 
    N  2 1 2 2 . . 1 1 2 1 1 1 1 1 1 1 
    N  2 2 . . 2 1 1 1 2 1 2 1 1 1 2 1 
    N  2 1 . . 2 2 1 1 2 2 1 1 1 1 2 1 
    N  2 1 . . 2 2 1 1 2 1 . . 2 1 1 1 
    N  2 2 . . 2 2 1 1 . . 2 1 1 1 2 1 
    N  1 1 . . 2 2 1 1 1 1 2 1 1 1 2 1 
    N  1 1 . . 2 2 1 1 1 1 . . 1 1 2 1 
    N  2 1 . . 2 2 1 1 1 1 . . 2 1 2 1 
    A  2 1 2 1 2 1 1 1 1 1 2 1 . . 2 1 
    A  2 1 2 1 2 2 1 1 2 1 1 1 . . 1 1 
    A  2 2 2 1 2 2 1 1 2 2 . . . . 2 1 
    A  2 1 2 2 2 1 1 1 2 1 2 1 . . 2 2 
    A  . . 2 2 2 1 . . 1 1 2 2 . . 2 1 
    A  1 1 1 1 2 1 1 1 2 1 1 1 . . 2 2 
    A  2 1 1 1 2 2 1 1 1 1 2 1 . . 2 1 
    A  2 1 2 2 2 2 1 1 2 2 . . . . 2 2 
    A  2 1 1 1 2 2 1 1 2 1 2 1 . . 1 1 
    A  2 1 2 2 2 1 1 1 2 1 2 1 . . 2 2 
    A  1 1 1 1 2 2 1 1 2 1 2 1 . . 2 2 
    A  2 1 2 1 2 1 1 1 2 1 2 2 . . 2 1 
    A  2 2 2 2 1 1 1 1 2 1 2 1 . . 2 2 
    A  1 1 1 1 2 1 . . 2 1 2 2 . . 2 2 
    A  1 1 2 1 2 1 1 1 2 1 2 1 . . 2 2 
    A  2 2 1 1 2 2 1 1 2 1 1 1 . . 2 1
    ;

The following SAS code can be used to perform the analysis:

   proc casecontrol data=cc prefix=Marker;
      var m1-m16;
      trait affected;
   run;

   proc print heading=h;
      format probgenotype proballele probtrend pvalue5.4;
      format chisqgenotype chisqallele chisqtrend 5.3;
   run;

All three case-control tests are performed by default. The output data set created by default appears in Figure 5.1.
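
If only a subset of the tests is needed, the tests can be requested individually. The following is a minimal sketch, assuming the TREND and OUTSTAT= options of the PROC CASECONTROL statement behave as described in the SAS/Genetics documentation; the data set name cc_trend is hypothetical and is used only for illustration:

   /* Sketch: request only the trend test and name the output data set. */
   /* cc_trend is a hypothetical name chosen for this illustration.     */
   proc casecontrol data=cc prefix=Marker trend outstat=cc_trend;
      var m1-m16;
      trait affected;
   run;

   proc print data=cc_trend heading=h;
      format probtrend pvalue5.4;
      format chisqtrend 5.3;
   run;

With a single test requested, the output data set would be expected to contain only the statistic, degrees of freedom, and $p$-value columns for that test.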

Figure 5.1 Statistics for Case-Control Tests

Obs	Locus	NumTraitA	NumTraitN	ChiSqGenotype	ChiSqAllele	ChiSqTrend	dfGenotype	dfAllele	dfTrend	ProbGenotype	ProbAllele	ProbTrend
1	Marker1	15	18	0.272	0.033	0.032	2	1	1	0.873	0.857	0.858
2	Marker2	16	11	3.430	3.260	2.140	2	1	1	0.180	0.071	0.144
3	Marker3	16	17	2.981	2.569	2.925	2	1	1	0.225	0.109	0.087
4	Marker4	14	18	3.556	3.319	3.556	1	1	1	0.059	0.069	0.059
5	Marker5	16	17	3.004	0.535	0.590	2	1	1	0.223	0.464	0.443
6	Marker6	14	14	0.767	0.650	0.710	2	1	1	0.682	0.420	0.399
7	Marker7	0	18	0.000	0.000	0.000	0	0	0	.	.	.
8	Marker8	16	17	4.132	4.061	3.769	2	1	1	0.127	0.044	0.052

Figure 5.1 displays the statistics for the three tests. The genotype case-control statistic generally has more degrees of freedom than the other two because it tests for both dominance genotypic effects and additive allelic effects, while the other statistics test for additive allelic effects alone. Marker4 is an exception: only two genotypes are observed at that marker, so its genotype test has a single degree of freedom. At the standard significance level of 0.05, almost all of the $p$-values, shown in the last three columns, are above the threshold and would not be considered significant. The one exception is the allele test for Marker8 ($p = 0.044$); the genotype and trend tests for the same marker do not reach significance ($p = 0.127$ and $p = 0.052$). Thus, the evidence for association with the binary trait is weak at best: only Marker8 shows a nominally significant result, and only for one of the three tests. The $p$-values for Marker7 are missing because the genotypes of all the affected individuals are missing at that marker.
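
As a cross-check on one entry of Figure 5.1, the allele test for Marker8 can be reproduced outside PROC CASECONTROL. The sketch below assumes that the allele statistic is the ordinary Pearson chi-square on the 2-by-2 table of trait status by allele. The allele counts were tallied by hand from the m15 and m16 columns of the cc data set, and the data set name allele8 is hypothetical:

   /* Allele counts at Marker8 (m15, m16), tallied from data set cc.    */
   /* allele8 is a hypothetical name chosen for this illustration.      */
   data allele8;
      input affected $ allele count;
      datalines;
   A 1 10
   A 2 22
   N 1 19
   N 2 15
   ;

   /* WEIGHT treats each row as a cell count; CHISQ requests the        */
   /* Pearson chi-square test for the 2-by-2 table.                     */
   proc freq data=allele8;
      weight count;
      tables affected*allele / chisq;
   run;

Worked by hand, the Pearson chi-square is $66\,(22 \cdot 19 - 10 \cdot 15)^2 / (32 \cdot 34 \cdot 37 \cdot 29) \approx 4.061$ with 1 degree of freedom, which agrees with the ChiSqAllele value reported for Marker8.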
