Example 66.5 Conditional Logistic Regression for m:n Matching

Conditional logistic regression is used to investigate the relationship between an outcome and a set of prognostic factors in matched case-control studies. The outcome is whether the subject is a case or a control. If each matched set contains exactly one case and one control, the matching is 1:1. In m:n matching, the numbers of cases and controls can vary from one matched set to another. You can perform conditional logistic regression with the PHREG procedure by using the discrete logistic model and forming a stratum for each matched set. In addition, you need to create dummy survival times so that all the cases in a matched set have the same event time value, and the corresponding controls are censored at later times.
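
For reference, the conditional likelihood that the discrete logistic model maximizes can be written as follows. This is a sketch in notation that is not used elsewhere in this example: for matched set h, m_h and n_h are the numbers of cases and controls, x_{hj} is the covariate vector of subject j (with the cases indexed first), and R_h is the collection of all subsets of m_h subjects drawn from the m_h + n_h subjects in the set.

```latex
L(\beta) \;=\; \prod_{h}
  \frac{\exp\!\Big(\sum_{j=1}^{m_h} x_{hj}'\beta\Big)}
       {\sum_{S \in R_h} \exp\!\Big(\sum_{j \in S} x_{hj}'\beta\Big)}
```

Each factor is the probability that the observed cases are the cases, given that exactly m_h subjects in the matched set are cases. A matched set with no cases or no controls contributes a factor of 1 and therefore carries no information about the parameters.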

Consider the following low-birth-weight data, extracted from Appendix 1 of Hosmer and Lemeshow (1989). The full data represent 189 women, of whom 59 had low-birth-weight babies and 130 had normal-weight babies. Under investigation are the following risk factors: weight in pounds at the last menstrual period (LWT), presence of hypertension (HT), smoking status during pregnancy (Smoke), and presence of uterine irritability (UI). For HT, Smoke, and UI, a value of 1 indicates a "yes" and a value of 0 indicates a "no." The woman’s age (Age) is used as the matching variable. The SAS data set LBW contains the subset of the data for women between the ages of 16 and 32, which comprises 174 women (54 cases and 120 controls).

   data LBW;
      input id Age Low LWT Smoke HT UI @@;
      /* Dummy survival time: 1 for cases (Low=1), 2 for controls (Low=0) */
      Time=2-Low;
      datalines;
    25  16   1   130   0  0  0    143  16   0   110   0  0  0
   166  16   0   112   0  0  0    167  16   0   135   1  0  0
   189  16   0   135   1  0  0    206  16   0   170   0  0  0
   216  16   0    95   0  0  0     37  17   1   130   1  0  1
    45  17   1   110   1  0  0     68  17   1   120   1  0  0
    71  17   1   120   0  0  0     83  17   1   142   0  1  0
    93  17   0   103   0  0  0    113  17   0   122   1  0  0
   116  17   0   113   0  0  0    117  17   0   113   0  0  0
   147  17   0   119   0  0  0    148  17   0   119   0  0  0
   180  17   0   120   1  0  0     49  18   1   148   0  0  0
    50  18   1   110   1  0  0     89  18   0   107   1  0  1
   100  18   0   100   1  0  0    101  18   0   100   1  0  0
   132  18   0    90   1  0  1    133  18   0    90   1  0  1
   168  18   0   229   0  0  0    205  18   0   120   1  0  0
   208  18   0   120   0  0  0     23  19   1    91   1  0  1
    33  19   1   102   0  0  0     34  19   1   112   1  0  1
    85  19   0   182   0  0  1     96  19   0    95   0  0  0
    97  19   0   150   0  0  0    124  19   0   138   1  0  0
   129  19   0   189   0  0  0    135  19   0   132   0  0  0
   142  19   0   115   0  0  0    181  19   0   105   0  0  0
   187  19   0   235   1  1  0    192  19   0   147   1  0  0
   193  19   0   147   1  0  0    197  19   0   184   1  1  0
   224  19   0   120   1  0  0     27  20   1   150   1  0  0
    31  20   1   125   0  0  1     40  20   1   120   1  0  0
    44  20   1    80   1  0  1     47  20   1   109   0  0  0
    51  20   1   121   1  0  1     60  20   1   122   1  0  0
    76  20   1   105   0  0  0     87  20   0   105   1  0  0
   104  20   0   120   0  0  1    146  20   0   103   0  0  0
   155  20   0   169   0  0  1    160  20   0   141   0  0  1
   172  20   0   121   1  0  0    177  20   0   127   0  0  0
   201  20   0   120   0  0  0    211  20   0   170   1  0  0
   217  20   0   158   0  0  0     20  21   1   165   1  1  0
    28  21   1   200   0  0  1     30  21   1   103   0  0  0
    52  21   1   100   0  0  0     84  21   1   130   1  1  0
    88  21   0   108   1  0  1     91  21   0   124   0  0  0
   128  21   0   185   1  0  0    131  21   0   160   0  0  0
   144  21   0   110   1  0  1    186  21   0   134   0  0  0
   219  21   0   115   0  0  0     42  22   1   130   1  0  1
    67  22   1   130   1  0  0     92  22   0   118   0  0  0
    98  22   0    95   0  1  0    137  22   0    85   1  0  0
   138  22   0   120   0  1  0    140  22   0   130   1  0  0
   161  22   0   158   0  0  0    162  22   0   112   1  0  0
   174  22   0   131   0  0  0    184  22   0   125   0  0  0
   204  22   0   169   0  0  0    220  22   0   129   0  0  0
    17  23   1    97   0  0  1     59  23   1   187   1  0  0
    63  23   1   120   0  0  0     69  23   1   110   1  0  0
    82  23   1    94   1  0  0    130  23   0   130   0  0  0
   139  23   0   128   0  0  0    149  23   0   119   0  0  0
   164  23   0   115   1  0  0    173  23   0   190   0  0  0
   179  23   0   123   0  0  0    182  23   0   130   0  0  0
   200  23   0   110   0  0  0     18  24   1   128   0  0  0
    19  24   1   132   0  1  0     29  24   1   155   1  0  0
    36  24   1   138   0  0  0     61  24   1   105   1  0  0
   118  24   0    90   1  0  0    136  24   0   115   0  0  0
   150  24   0   110   0  0  0    156  24   0   115   0  0  0
   185  24   0   133   0  0  0    196  24   0   110   0  0  0
   199  24   0   110   0  0  0    225  24   0   116   0  0  0
    13  25   1   105   0  1  0     15  25   1    85   0  0  1
    24  25   1   115   0  0  0     26  25   1    92   1  0  0
    32  25   1    89   0  0  0     46  25   1   105   0  0  0
   103  25   0   118   1  0  0    111  25   0   120   0  0  1
   120  25   0   155   0  0  0    121  25   0   125   0  0  0
   169  25   0   140   0  0  0    188  25   0    95   1  0  1
   202  25   0   241   0  1  0    215  25   0   120   0  0  0
   221  25   0   130   0  0  0     35  26   1   117   1  0  0
    54  26   1    96   0  0  0     75  26   1   154   0  1  0
    77  26   1   190   1  0  0     95  26   0   113   1  0  0
   115  26   0   168   1  0  0    154  26   0   133   1  0  0
   218  26   0   160   0  0  0     16  27   1   150   0  0  0
    43  27   1   130   0  0  1    125  27   0   124   1  0  0
     4  28   1   120   1  0  1     79  28   1    95   1  0  0
   105  28   0   120   1  0  0    109  28   0   120   0  0  0
   112  28   0   167   0  0  0    151  28   0   140   0  0  0
   159  28   0   250   1  0  0    212  28   0   134   0  0  0
   214  28   0   130   0  0  0     10  29   1   130   0  0  1
    94  29   0   123   1  0  0    114  29   0   150   0  0  0
   123  29   0   140   1  0  0    190  29   0   135   0  0  0
   191  29   0   154   0  0  0    209  29   0   130   1  0  0
    65  30   1   142   1  0  0     99  30   0   107   0  0  1
   141  30   0    95   1  0  0    145  30   0   153   0  0  0
   176  30   0   110   0  0  0    195  30   0   137   0  0  0
   203  30   0   112   0  0  0     56  31   1   102   1  0  0
   107  31   0   100   0  0  1    126  31   0   215   1  0  0
   163  31   0   150   1  0  0    222  31   0   120   0  0  0
    22  32   1   105   1  0  0    106  32   0   121   0  0  0
   134  32   0   132   0  0  0    170  32   0   134   1  0  0
   175  32   0   170   0  0  0    207  32   0   186   0  0  0
   ;

The variable Low is used to determine whether the subject is a case (Low=1, low-birth-weight baby) or a control (Low=0, normal-weight baby). The dummy time variable Time takes the value 1 for cases and 2 for controls.

The following statements produce a conditional logistic regression analysis of the data. The variable Time is the response, and Low is the censoring variable. Note that the data set is created so that all the cases have the same event time and the controls have later censored times. The matching variable Age is used in the STRATA statement so that each unique age value defines a stratum. The variables LWT, Smoke, HT, and UI are specified as explanatory variables. The TIES=DISCRETE option requests the discrete logistic model.

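   proc phreg data=LBW;
      /* Low=0 identifies the censored observations (controls) */
      model Time*Low(0)= LWT Smoke HT UI / ties=discrete;
      /* each distinct Age value forms one matched set */
      strata Age;
   run;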
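
As a cross-check on the PHREG fit, the same conditional likelihood can also be requested through the STRATA statement of the LOGISTIC procedure, which needs no dummy time variable. The following is a minimal sketch rather than part of the original example; it assumes a SAS/STAT release in which PROC LOGISTIC supports the STRATA statement, and it reuses the LBW data set created earlier.

```sas
/* Sketch: conditional logistic regression with PROC LOGISTIC.      */
/* STRATA Age conditions on each matched set, so no intercept or    */
/* stratum effects are estimated.                                   */
proc logistic data=LBW;
   strata Age;
   model Low(event='1') = LWT Smoke HT UI;
run;
```

The parameter estimates are expected to agree with those from PROC PHREG with TIES=DISCRETE up to numerical tolerance, because both procedures maximize the same conditional likelihood.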

The procedure displays a summary of the number of event and censored observations for each stratum. These counts, shown in Output 66.5.1, are the numbers of cases and controls in each matched set.

Output 66.5.1 Summary of Number of Cases and Controls
The PHREG Procedure

Model Information
Data Set WORK.LBW
Dependent Variable Time
Censoring Variable Low
Censoring Value(s) 0
Ties Handling DISCRETE

Summary of the Number of Event and Censored Values
Stratum   Age   Total   Event   Censored   Percent Censored
      1    16       7       1          6              85.71
      2    17      12       5          7              58.33
      3    18      10       2          8              80.00
      4    19      16       3         13              81.25
      5    20      18       8         10              55.56
      6    21      12       5          7              58.33
      7    22      13       2         11              84.62
      8    23      13       5          8              61.54
      9    24      13       5          8              61.54
     10    25      15       6          9              60.00
     11    26       8       4          4              50.00
     12    27       3       2          1              33.33
     13    28       9       2          7              77.78
     14    29       7       1          6              85.71
     15    30       7       1          6              85.71
     16    31       5       1          4              80.00
     17    32       6       1          5              83.33
  Total           174      54        120              68.97
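
The stratum counts in Output 66.5.1 can be verified independently of PROC PHREG by cross-tabulating the case indicator within each age. The following sketch assumes only the LBW data set created earlier; the table options merely suppress percentages so that the cell counts are easy to compare.

```sas
/* Sketch: count cases (Low=1) and controls (Low=0) in each age stratum */
proc freq data=LBW;
   tables Age*Low / norow nocol nopercent;
run;
```

The Low=1 column should match the Event column of the summary, and the Low=0 column should match the Censored column.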

Results of the conditional logistic regression analysis are shown in Output 66.5.2. Based on the Wald tests for the individual variables, LWT, Smoke, and HT are statistically significant at the 0.05 level (p = 0.0339, 0.0281, and 0.0178, respectively), while UI is marginal (p = 0.0659).

The hazard ratios, computed by exponentiating the parameter estimates, are useful in interpreting the results of the analysis. If the hazard ratio of a prognostic factor is larger than 1, an increment in the factor increases the hazard rate. If the hazard ratio is less than 1, an increment in the factor decreases the hazard rate. The results indicate that women were more likely to have low-birth-weight babies if they weighed less at the last menstrual period, were hypertensive, smoked during pregnancy, or suffered uterine irritability.

Output 66.5.2 Conditional Logistic Regression Analysis for the Low-Birth-Weight Study
Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
Criterion   Without Covariates   With Covariates
-2 LOG L               159.069           141.108
AIC                    159.069           149.108
SBC                    159.069           157.064

Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 17.9613 4 0.0013
Score 17.3152 4 0.0017
Wald 15.5577 4 0.0037

Analysis of Maximum Likelihood Estimates
Parameter   DF   Parameter Estimate   Standard Error   Chi-Square   Pr > ChiSq   Hazard Ratio
LWT          1             -0.01498          0.00706       4.5001       0.0339          0.985
Smoke        1              0.80805          0.36797       4.8221       0.0281          2.244
HT           1              1.75143          0.73932       5.6120       0.0178          5.763
UI           1              0.88341          0.48032       3.3827       0.0659          2.419
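
The hazard ratio of 0.985 for LWT in Output 66.5.2 refers to a difference of one pound, which is a small unit for interpretation. For a 10-pound difference, the ratio is exp(10 × -0.01498), or approximately 0.861. The following sketch shows one way to have the procedure report this rescaled ratio with confidence limits; it assumes a SAS/STAT release in which PROC PHREG supports the HAZARDRATIO statement and its UNITS= option, and the label text is illustrative.

```sas
/* Sketch: same model as before, plus a hazard ratio for a 10-lb change in LWT */
proc phreg data=LBW;
   model Time*Low(0)= LWT Smoke HT UI / ties=discrete;
   strata Age;
   hazardratio 'LWT per 10 lb' LWT / units=10;
run;
```

The parameter estimates are unchanged by the HAZARDRATIO statement; it only adds a table with the customized ratio.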

For matched case-control studies with one case per matched set (1:n matching), the likelihood function for the conditional logistic regression reduces to that of the Cox model for the continuous time scale. For this situation, you can use the default TIES=BRESLOW.
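
A minimal sketch of the 1:n case follows. The data set Matched1N and the variables SetID, Case, X1, and X2 are hypothetical and do not appear in this example; Time is assumed to be constructed as 2-Case, in the same way as for the LBW data.

```sas
/* Sketch for 1:n matching: exactly one case (Case=1) per matched set. */
/* Hypothetical names: Matched1N, SetID, Case, X1, X2.                 */
proc phreg data=Matched1N;
   model Time*Case(0)= X1 X2;   /* TIES=BRESLOW is the default */
   strata SetID;
run;
```

With a single event in each stratum there are no tied event times, so the choice of tie-handling method does not affect the estimates.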