The GEE Procedure

Example 43.6 GEE for Nominal Multinomial Data

This example illustrates how you use the GEE procedure to analyze nominal multinomial data. A two-year study was conducted to assess the impact of access to Section 8 housing as a means of providing independent housing to the severely mentally ill homeless (Hurlbut, Wood, and Hough 1996). In this study, half of the 362 clients received Section 8 housing certificates. The assignment of Section 8 housing certificates is recorded in the variable Sec; 0 indicates clients who did not receive a certificate, and 1 indicates clients who received a certificate.

Every six months during the study, research staff interviewed all 362 clients, who provided data about their living arrangements in the previous 60 days. Clients’ living arrangements were also recorded during a baseline interview. The time of interviews is recorded in the variable Time, whose value is 0, 6, 12, or 24 (for the number of months since the study began). There were a total of 159 missed interviews. The variable Housing records the living arrangement of a client and is coded as 0 (street living), 1 (community living), or 2 (independent living). The following statements create the data set Housing:

data Housing;
   input ID Housing Time Sec;
   datalines;
1 1 0 1
1 2 6 1
1 2 12 1
1 2 24 1
2 1 0 1
2 2 6 1

   ... more lines ...   

362 1 0 0
362 1 6 0
362 1 12 0
362 1 24 0
;
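
Before fitting a model, it can help to check how many interviews were actually observed. The following statements are a minimal sketch, not part of the original analysis. They assume that a missed interview simply has no row in the data set Housing. Under that assumption, 362 × 4 = 1,448 scheduled interviews less 159 missed interviews would leave 1,289 observations. The first TABLES statement counts the observations at each interview time, and the second one displays the pooled distribution of living arrangements within each certificate group:

proc freq data=Housing;
   /* Number of observed interviews at each value of Time */
   tables Time;
   /* Row percentages: living arrangement within each Sec group */
   tables Sec*Housing / nocol nopercent;
run;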

The following SAS statements use PROC GEE to fit a model to nominal multinomial data:

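proc gee data=Housing;
   /* ID, Housing, Time, and Sec are all treated as classification variables */
   class ID Housing Time Sec;
   /* LINK=GLOGIT requests the generalized logit model for a nominal response */
   model Housing=Sec / dist=multinomial link=glogit;
   /* Repeated measurements are clustered by ID and ordered within subject by Time */
   repeated subject=ID / within=Time;
run;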

PROC GEE fits an ordinary GEE that has an independent working correlation structure, which is the only structure supported for data that have nominal multinomial responses. In the MODEL statement, you specify LINK=GLOGIT to indicate that the responses are nominal. The generalized logit model is expressed in terms of baseline-category logits. By default, the GEE procedure chooses the last response category as the baseline category; in this example, that category is Housing = 2 (independent living). If your nominal response has J categories, then the baseline logit for category j (j = 1, ..., J - 1) and subject i is

\[ \log (\mu _{ij} / \mu _{iJ})=\eta _{ij}=\mathbf{x}_{i}' \boldsymbol{\beta }_ j \]

and

\[ \mu _{ij}=\frac{\exp (\eta _{ij})}{\sum _{k=1}^ J \exp (\eta _{ik})} \]
\[ \eta _{iJ}=0 \]
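
The expression for the probabilities follows from the baseline logits in a few steps. The following short derivation uses only the definitions above. Exponentiating the baseline logits gives \(\mu _{ij}=\mu _{iJ}\exp (\eta _{ij})\) for every category, including j = J because \(\eta _{iJ}=0\). Because the probabilities sum to 1,

\[ 1=\sum _{k=1}^ J \mu _{ik}=\mu _{iJ}\sum _{k=1}^ J \exp (\eta _{ik}) \quad \Longrightarrow \quad \mu _{iJ}=\frac{1}{\sum _{k=1}^ J \exp (\eta _{ik})} \]

Substituting this value of \(\mu _{iJ}\) back into \(\mu _{ij}=\mu _{iJ}\exp (\eta _{ij})\) gives the expression for \(\mu _{ij}\).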

The results of fitting the model are displayed in Output 43.6.1.

Output 43.6.1: Results of Model Fitting

The GEE Procedure

Parameter Estimates for Response Model
with Empirical Standard Error Estimates
Parameter    Housing   Estimate   Standard Error   95% Confidence Limits        Z   Pr > |Z|
Intercept    0          -0.9532           0.1266    -1.2013     -0.7051     -7.53     <.0001
Intercept    1          -0.6562           0.1064    -0.8647     -0.4477     -6.17     <.0001
Sec 0        0           0.9226           0.1850     0.5599      1.2853      4.99     <.0001
Sec 0        1           1.2645           0.1642     0.9426      1.5863      7.70     <.0001
Sec 1        0           0.0000           0.0000     0.0000      0.0000       .        .
Sec 1        1           0.0000           0.0000     0.0000      0.0000       .        .



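The estimates for Sec = 0 (no certificate) are positive for both response categories, Housing = 0 and Housing = 1. Relative to clients who received a certificate, clients who did not receive one have higher odds of street living or community living versus independent living, which is the baseline category. Equivalently, the estimates indicate an increased probability that a client will live independently when given access to Section 8 housing. The model fit criteria are shown in Output 43.6.2.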

Output 43.6.2: Model Fit Criteria

GEE Fit Criteria
QIC 2675.2174
QICu 2671.4680
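
As a worked illustration of the generalized logit formulas, you can convert the estimates in Output 43.6.1 to fitted marginal probabilities by hand. The figures that follow are rounded hand calculations from the displayed estimates, not output from PROC GEE, and the subscripts denote the Housing code (0, 1, or 2) rather than the index j. For clients who received a certificate (Sec = 1, the reference level), the linear predictors are the intercepts alone:

\[ \eta _0=-0.9532, \quad \eta _1=-0.6562, \quad \eta _2=0 \]
\[ \mu _0 \approx \frac{0.3855}{0.3855+0.5188+1} \approx 0.202, \quad \mu _1 \approx \frac{0.5188}{1.9043} \approx 0.272, \quad \mu _2 \approx \frac{1}{1.9043} \approx 0.525 \]

For clients who did not receive a certificate (Sec = 0), the Sec estimates are added to the intercepts:

\[ \eta _0=-0.9532+0.9226=-0.0306, \quad \eta _1=-0.6562+1.2645=0.6083, \quad \eta _2=0 \]
\[ \mu _0 \approx \frac{0.9699}{0.9699+1.8373+1} \approx 0.255, \quad \mu _1 \approx \frac{1.8373}{3.8072} \approx 0.483, \quad \mu _2 \approx \frac{1}{3.8072} \approx 0.263 \]

Under this calculation, the estimated probability of independent living is roughly 0.53 for clients who have a certificate and roughly 0.26 for clients who do not. On the odds scale, the Sec = 0 estimates correspond to odds ratios of about exp(0.9226) ≈ 2.52 for street living and exp(1.2645) ≈ 3.54 for community living, each relative to independent living.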



For comparison, the following SAS statements treat the responses as ordinal and use PROC GEE to fit a marginal model by using an independent working correlation structure:

proc gee data=Housing;
   class ID Housing Time Sec;
   /* DIST=MULTINOMIAL without a LINK= option uses the default cumulative logit link */
   model Housing=Sec / dist=multinomial;
   repeated subject=ID / within=Time;
run;

The cumulative logit link function is the default option that is used to fit the model. Because the generalized logit link function is not specified, the responses are treated as ordinal multinomial data. The results for the model that is fit by treating the responses as ordinal are displayed in Output 43.6.3.

Output 43.6.3: Results of Model Fitting

The GEE Procedure

Parameter Estimates for Response Model
with Empirical Standard Error Estimates
Parameter     Estimate   Standard Error   95% Confidence Limits         Z   Pr > |Z|
Intercept1     -1.6917           0.1242    -1.9352     -1.4481     -13.62     <.0001
Intercept2      0.0112           0.0960    -0.1770      0.1994       0.12     0.9072
Sec 0           0.8224           0.1327     0.5624      1.0824       6.20     <.0001
Sec 1           0.0000           0.0000     0.0000      0.0000        .        .
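
As a hedged illustration, you can also convert the ordinal estimates in Output 43.6.3 to fitted probabilities by hand. The calculation assumes that the cumulative logits model Pr(Housing ≤ j) and that the linear predictor is the jth intercept plus the Sec effect. This parameterization is an assumption made for the illustration; it is not stated in this example. The values are rounded hand calculations, not output from PROC GEE. For clients who received a certificate (Sec = 1),

\[ \Pr (\mathrm{Housing} \le 0)=\frac{1}{1+e^{1.6917}} \approx 0.156, \quad \Pr (\mathrm{Housing} \le 1)=\frac{1}{1+e^{-0.0112}} \approx 0.503 \]

so the category probabilities are approximately 0.156, 0.347, and 0.497. For clients who did not receive a certificate (Sec = 0),

\[ \Pr (\mathrm{Housing} \le 0)=\frac{1}{1+e^{-(-1.6917+0.8224)}} \approx 0.295, \quad \Pr (\mathrm{Housing} \le 1)=\frac{1}{1+e^{-(0.0112+0.8224)}} \approx 0.697 \]

so the category probabilities are approximately 0.295, 0.402, and 0.303. Under the stated assumption, the single Sec estimate shifts both cumulative logits by the same amount, 0.8224.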



Treating the responses as ordinal results in a single parameter estimate for the classification variable Sec, rather than one estimate for each nonbaseline response category. The QIC for the model that is fit by treating the responses as nominal (shown in Output 43.6.2) is 2675.22, whereas the QIC for the model that is fit by treating the responses as ordinal (shown in Output 43.6.4) is 2710.50. The smaller QIC, lower by about 35.28, indicates a slightly better fit when the responses are treated as nominal.

Output 43.6.4: Model Fit Criteria

GEE Fit Criteria
QIC 2710.4971
QICu 2707.2983