23127 - Estimating the odds ratio for matched pairs data with binary response

Usage Note 23127: Estimating the odds ratio for matched pairs data with binary response

For matched pairs data with a binary response (such as yes/no responses from husband and wife pairs), the AGREE option in PROC FREQ provides a test of equal probability of a yes response. This is McNemar's test of marginal homogeneity. However, as discussed by Fleiss (2003), an estimator other than the usual odds ratio estimator should be used for matched pairs data. An estimate of the odds ratio for matched pairs data can be obtained by using the CMH option with a stratified table specification in PROC FREQ, or using the STRATA statement in PROC LOGISTIC.

To estimate the difference in probabilities (risk difference) with matched pairs data, rather than the odds ratio, see this note. To estimate the risk difference between independent groups, rather than in matched pairs, see this note.

Example

Using the retrospective study example presented by Fleiss (2003) with matched case-control pairs, the following statements compute McNemar's test (without continuity correction), using the AGREE option, and the usual odds ratio estimate for unmatched data, using the RELRISK option:

    data a;
        do case = 'present','absent';
            do control = 'present','absent';
                input count @@;
                output;
            end;
        end;
    datalines;
    15 20
     5 60
    ;

    proc freq order=data;
        weight count;
        table case * control / agree relrisk;
    run;

McNemar's test statistic is significant at p=0.0027. However, the odds ratio estimate (9.0) from the RELRISK option does not account for the data being matched.

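Table of case by control
case	Statistic	control=present	control=absent	Total
present	Frequency	15	20	35
present	Percent	15.00	20.00	35.00
present	Row Pct	42.86	57.14	
present	Col Pct	75.00	25.00	
absent	Frequency	5	60	65
absent	Percent	5.00	60.00	65.00
absent	Row Pct	7.69	92.31	
absent	Col Pct	25.00	75.00	
Total	Frequency	20	80	100
Total	Percent	20.00	80.00	100.00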

Estimates of the Relative Risk (Row1/Row2)
Type of Study	Value	95% Confidence Limits
Case-Control (Odds Ratio)	9.0000	2.9027	27.9051
Cohort (Col1 Risk)	5.5714	2.2094	14.0497
Cohort (Col2 Risk)	0.6190	0.4607	0.8318

McNemar's Test
Statistic (S)	9.0000
DF	1
Pr > S	0.0027
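
As a hand check on the output above, the unmatched odds ratio and McNemar's statistic can be reproduced directly from the four cell counts. The following DATA step is a minimal sketch rather than part of the original example; the variable names (N11, N12, N21, N22, OR_UNMATCHED, S, P) are chosen here for illustration only.

```sas
data _null_;
    /* cell counts from the DATALINES above: rows = case, columns = control */
    n11 = 15;  n12 = 20;   /* case present: control present, control absent */
    n21 = 5;   n22 = 60;   /* case absent:  control present, control absent */

    /* usual cross-product odds ratio, which ignores the matching */
    or_unmatched = (n11*n22) / (n12*n21);

    /* McNemar's statistic (no continuity correction) uses only the discordant pairs */
    s = (n12 - n21)**2 / (n12 + n21);
    p = 1 - probchi(s, 1);   /* upper-tail chi-square probability with 1 DF */

    /* write the results to the SAS log */
    put or_unmatched= s= p=;
run;
```

With these counts the unmatched odds ratio and McNemar's statistic both happen to equal 9.0. That is a coincidence of this data set, not a general relationship between the two quantities.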

The steps below compute the appropriate odds ratio estimate and confidence interval for matched pairs data. The data must first be arranged in a stratified layout in which a variable identifies the pairs (strata) and another variable identifies the subject in each pair. In the following statements, an observation is created for each subject in each pair: RESPONSE='case' indicates the subject is a case; RESPONSE='control' indicates the subject is a control. The FACTOR variable indicates whether the predictive factor is present or absent. The stratifying variable, ID, has a unique value for each pair so that the members in each pair have the same value of ID.

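    data indiv;
       set a;
       length response $ 7;   /* without this, 'control' would be truncated to 'cont' */
       retain id 0;
       /* write one pair (two observations) for each of the COUNT pairs in this cell */
       do id=id+1 to id+count;
         factor=case; response='case'; output;        /* the case in the pair */
         factor=control; response='control'; output;  /* the matched control */
       end;
       keep id factor response;
    run;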

These statements display the observations for the first four pairs, all of which come from the present/present factor combination.

     proc print data=indiv (obs=8);
       id id;
       run;
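
Before running the stratified analyses, it can be worth confirming that the restructured data set has the expected shape. The following query is an illustrative sketch, not part of the original example; the column aliases N_PAIRS and N_OBS are hypothetical names. Because the four cell counts sum to 100 and each pair contributes two observations, it should report 100 distinct pairs and 200 observations.

```sas
proc sql;
    /* one stratum (ID value) per pair, two subjects per pair */
    select count(distinct id) as n_pairs,
           count(*)           as n_obs
    from indiv;
quit;
```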

The odds ratio can be computed via stratified analyses in the FREQ or LOGISTIC procedure. In PROC FREQ, specify a three-way table with the pair identifier, ID, as a stratifying variable and specify the CMH option. Note that in the TABLE statement of PROC FREQ, the last (rightmost) variable in a table specification is the column variable, the next variable to the left is the row variable, and all variables to the left of the row variable are stratifying variables. The NOPRINT option suppresses printing of the separate tables for the pairs (100 tables in this case). In PROC LOGISTIC, the STRATA statement specifies the stratification variable(s) and requests the appropriate conditional logistic model. PROC LOGISTIC provides point and confidence interval estimates of the odds ratio. The optional EXACT statement can be used to provide an exact conditional estimate and confidence interval of the odds ratio.

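    /* Mantel-Haenszel estimate: ID (the pair) is the stratifying variable */
    proc freq data=indiv order=data;
        table id*factor*response / cmh noprint;
    run;

    /* Conditional logistic regression, stratified by pair */
    proc logistic data=indiv;
        strata id;
        class factor (ref='absent') / param=ref;
        model response(event='case') = factor;
        exact factor / estimate=odds;
    run;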

The correct estimate of the odds ratio for these matched pairs data is 4.0, which is the ratio of the two discordant cell counts (20/5). It is provided by the Mantel-Haenszel estimate from the CMH option in PROC FREQ and by the asymptotic and exact odds ratio estimates from PROC LOGISTIC. Both procedures give an asymptotic 95% confidence interval for the odds ratio of (1.5, 10.7). PROC LOGISTIC also provides an exact 95% confidence interval of (1.46, 13.64).

Estimates of the Common Relative Risk (Row1/Row2)
Type of Study	Method	Value	95% Confidence Limits
Case-Control (Odds Ratio)	Mantel-Haenszel	4.0000	1.5013	10.6576
Case-Control (Odds Ratio)	Logit **	3.7372	1.5114	9.2406
Cohort (Col1 Risk)	Mantel-Haenszel	4.0000	1.5013	10.6576
Cohort (Col1 Risk)	Logit **	1.9332	1.1654	3.2067
Cohort (Col2 Risk)	Mantel-Haenszel	0.2500	0.0938	0.6661
Cohort (Col2 Risk)	Logit **	0.5173	0.3119	0.8580

Odds Ratio Estimates
Effect	Point Estimate	95% Wald Confidence Limits
factor present vs absent	4.000	1.501	10.658

Exact Odds Ratios
Parameter		Estimate	95% Confidence Limits		Two-sided p-Value
factor	present	4.000	1.457	13.639	0.0041
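
For matched pairs, the Mantel-Haenszel estimate reduces to the ratio of the discordant pair counts, and the asymptotic limits shown above are reproduced by the usual large-sample standard error of the log odds ratio, sqrt(1/b + 1/c). The DATA step below is a sketch of that arithmetic, not output from either procedure; B, C, and the other variable names are introduced here for illustration only.

```sas
data _null_;
    b = 20;   /* pairs with factor present in the case, absent in the control */
    c = 5;    /* pairs with factor absent in the case, present in the control */

    or_matched = b / c;              /* matched-pairs odds ratio */
    se_log_or  = sqrt(1/b + 1/c);    /* large-sample SE of log(odds ratio) */
    z          = probit(0.975);      /* normal quantile for a 95% interval */

    lower = exp(log(or_matched) - z*se_log_or);
    upper = exp(log(or_matched) + z*se_log_or);

    /* write the results to the SAS log */
    put or_matched= lower= upper=;
run;
```

The concordant pairs (15 present/present and 60 absent/absent) do not enter the calculation, which is why this estimate differs from the unmatched odds ratio of 9.0.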

______

Fleiss, J. L., Levin, B., and Paik, M. C. (2003), Statistical Methods for Rates and Proportions, 3d ed. New York: John Wiley & Sons, Inc.

Operating System and Release Information

Product Family	Product	System	SAS Release
			Reported	Fixed*
SAS System	SAS/STAT	All	n/a

* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.

Type:	Usage Note
Priority:	low
Topic:	SAS Reference ==> Procedures ==> FREQ
	SAS Reference ==> Procedures ==> LOGISTIC
	Analytics ==> Categorical Data Analysis
	Analytics ==> Exact Methods

Date Modified:	2005-01-04 13:02:48
Date Created:	2002-12-16 10:56:39
