The CATMOD Procedure

Example 1.2: Identifying an Inappropriate Model

Suppose you have the following data, and you want to use iterative proportional fitting (IPF) to fit the "no three-factor effect" model:

   data pathological; 
      input X Y Z count @@;
      datalines;
   1 1 1  0  1 1 2 15  
   1 2 1 15  1 2 2 24  
   2 1 1 17  2 1 2 14  
   2 2 1 16  2 2 2  0  
   ;

For this model, the zero counts n_{111} = n_{222} = 0 force the estimated cell frequencies \hat{m}_{111} = \hat{m}_{222} = 0. As a result, the table has only 6 degrees of freedom available (one for each cell that is not a structural zero), while the model requires 7 (one each for the mean, X, Y, Z, XY, XZ, and YZ). To analyze the data appropriately, drop these two cells from the table, treat them as structural zeros, and reduce the model. With PROC CATMOD you may be able to identify cases like this by observing convergence problems or by noting that the predicted frequency of a cell seems to be converging to zero:

   proc catmod data=pathological;
      weight count;
      model X*Y*Z=_response_ / ml=ipf zero=sampling;
      loglin X|Y|Z@2;
   run;

Output 1.2.1: ML=IPF with ZERO=SAMPLING
 
WARNING: The IPF algorithm failed to converge.
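
The two-way marginal totals that IPF fits give no hint of this problem. As a minimal sketch that is not part of the original example, the following PROC FREQ step prints the three two-way margins used by the no-three-factor model. The ZEROS option in the WEIGHT statement keeps the two zero-count observations in the analysis. For these data every two-way marginal count is positive, so the difficulty shows up only in the full three-way table:

   proc freq data=pathological;
      weight count / zeros;   /* include the observations with a zero count */
      tables X*Y X*Z Y*Z / norow nocol nopercent;   /* print cell counts only */
   run;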

When the ZERO=SAMPLING option is omitted, so that the zero cells are treated as structural zeros rather than sampling zeros, the adjusted degrees of freedom for the likelihood ratio are negative; this is another signal that the model is inappropriate for the data:

 
   proc catmod data=pathological;
      weight count;
      model X*Y*Z=_response_ / ml=ipf;
      loglin X|Y|Z@2;
   run;

Output 1.2.2: ML=IPF with Structural Zeros
 
The IPF algorithm converged.
 
Maximum Likelihood Analysis of Variance
Source             DF   Chi-Square   Pr > ChiSq
Likelihood Ratio    0            .            .

WARNING: Negative adjusted degrees of freedom were
calculated for the Likelihood Ratio test. The model may be inappropriate.
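
The negative value is consistent with the counts given earlier in this example. As a rough check, assume the adjusted degrees of freedom equal the number of cells that are not structural zeros minus the number of model parameters (the exact adjustment that PROC CATMOD applies is not shown here). The 2 x 2 x 2 table has 8 cells, 2 of them are zero, and the model has 7 parameters:

   \text{adjusted df} = (8 - 2) - 7 = -1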


With ML=NR (Newton-Raphson), PROC CATMOD issues a note that the model contains redundant parameters, and it may also issue messages about infinite parameters:

 
   proc catmod data=pathological;
      weight count;
      model X*Y*Z=_response_ / ml=nr;
      loglin X|Y|Z@2;
   run;

Output 1.2.3: ML=NR with Structural Zeros
 
Maximum likelihood computations converged.
 
Maximum Likelihood Analysis of Variance
Source             DF   Chi-Square   Pr > ChiSq
X                   1         0.00       0.9931
Y                   1         0.73       0.3930
X*Y                 1         1.23       0.2682
Z                   1         0.32       0.5723
X*Z                 1         1.85       0.1739
Y*Z                 0*           .            .
Likelihood Ratio    0            .            .

NOTE: Effects marked with '*' contain one or more
redundant or restricted parameters.
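
This example says that the model should be reduced, but it does not say which terms to drop. Purely as an illustration, the sketch below removes the Y*Z interaction, the effect flagged as redundant in Output 1.2.3. That choice is an assumption, and a different reduction may suit your analysis better. With six cells that are not structural zeros and six parameters (mean, X, Y, Z, X*Y, and X*Z), this reduced model uses every available degree of freedom, so none remain for a likelihood ratio test:

   proc catmod data=pathological;
      weight count;
      /* ZERO=SAMPLING is omitted, so the zero cells are structural zeros */
      model X*Y*Z=_response_ / ml=nr;
      /* hypothetical reduction: the no-three-factor model without Y*Z */
      loglin X Y Z X*Y X*Z;
   run;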


This example is discussed further in Bishop, Fienberg, and Holland (1975, p. 115), Agresti (1990, p. 245), and Christensen (1997, p. 292).

Copyright © 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.