Example 1.1: Log-Linear Independence Model with
Structural and Sampling Zeros
This example illustrates a log-linear model of independence, using
data that contain structural zero frequencies as well as sampling
(random) zero frequencies.
In a population of six squirrel monkeys, the joint distribution of
genital display with respect to active or passive role was observed.
The data are from Fienberg (1980, Table 8-2). The following DATA
step creates the SAS data set Display:
   title 'Behavior of Squirrel Monkeys';
   data Display;
      input Active $ Passive $ wt @@;
      datalines;
r r . r s 1 r t 5 r u 8 r v 9 r w 0
s r 29 s s . s t 14 s u 46 s v 4 s w 0
t r 0 t s 0 t t . t u 0 t v 0 t w 0
u r 2 u s 3 u t 1 u u . u v 38 u w 2
v r 0 v s 0 v t 0 v u 0 v v . v w 1
w r 9 w s 25 w t 4 w u 6 w v 13 w w .
;
In this data set, since a monkey cannot have both active and passive
roles in an interaction, the values on the diagonal are structural
zeros. Any off-diagonal zeros are sampling zeros. Since there are
two types of zeros in this data set, missing values are placed on
the diagonal to represent the structural zeros.
Suppose you are interested in studying the independence of the active
and passive roles. Since the diagonal cells are structural zeros,
you are actually fitting a quasi-independence model; refer to
Agresti (1990) for more information. Since monkey 't' never takes
the active role, every cell in its row has an observed frequency of
zero, and the frequencies predicted by an independence model for
these cells are also zero; the WHERE statement removes these cells
from the analysis.
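To make the two kinds of zeros concrete, the following Python sketch (an illustration only, not part of the SAS program) rebuilds the table from the DATALINES values, uses None for the missing diagonal cells, and drops the row for monkey 't'. The names counts, kept, cells, structural, and sampling are made up for this sketch.

```python
# Observed counts from the DATA step; None marks a structural zero (diagonal).
monkeys = ["r", "s", "t", "u", "v", "w"]
counts = {
    "r": [None, 1, 5, 8, 9, 0],
    "s": [29, None, 14, 46, 4, 0],
    "t": [0, 0, None, 0, 0, 0],
    "u": [2, 3, 1, None, 38, 2],
    "v": [0, 0, 0, 0, None, 1],
    "w": [9, 25, 4, 6, 13, None],
}

# Mimic "where Active ^= 't'": drop the active row whose counts are all zero.
kept = {a: row for a, row in counts.items() if a != "t"}

# Each remaining non-missing cell is one response level.
cells = [(a, p, n) for a, row in kept.items()
         for p, n in zip(monkeys, row) if n is not None]
structural = [(a, p) for a, row in kept.items()
              for p, n in zip(monkeys, row) if n is None]
sampling = [(a, p) for a, p, n in cells if n == 0]

print("response levels:", len(cells))
print("total frequency:", sum(n for _, _, n in cells))
print("structural zeros:", structural)
print("sampling zeros:", sampling)
```

The counts of response levels and the total frequency printed here can be compared with the "Data Summary" table in Output 1.1.1.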
The following statements produce the analysis that treats the
missing values on the diagonals as structural zeros (since the
MISSING=STRUCTURAL option is the default for one population).
The ZERO=SAMPLING option treats the remaining zeros as sampling
zeros.
   proc catmod data=Display;
      weight wt;
      where Active ^= 't';
      model Active*Passive=_response_
            / ml=ipf(parm) zero=sampling;
      loglin Active Passive;
   run;
Output 1.1.1: Data Summary and Population Profile
                 Behavior of Squirrel Monkeys

                        Data Summary
Response            Active*Passive    Response Levels     25
Weight Variable     wt                Populations          1
Data Set            DISPLAY           Total Frequency    220
Frequency Missing   0                 Observations        25

                     Population Profiles
                   Sample    Sample Size
                        1            220

The response profiles, shown in Output 1.1.2, include the
off-diagonal zero cells because of the ZERO=SAMPLING option.
Output 1.1.2: Response Profiles
       Response Profiles
Response    Active    Passive
       1    r         s
       2    r         t
       3    r         u
       4    r         v
       5    r         w
       6    s         r
       7    s         t
       8    s         u
       9    s         v
      10    s         w
      11    u         r
      12    u         s
      13    u         t
      14    u         v
      15    u         w
      16    v         r
      17    v         s
      18    v         t
      19    v         u
      20    v         w
      21    w         r
      22    w         s
      23    w         t
      24    w         u
      25    w         v

Because the PARM option is specified, a weighted least squares
analysis is performed on the IPF fitted data and the _Response_
Matrix is displayed (Output 1.1.3); this table can
be suppressed with the NORESPONSE option.
Output 1.1.3: _Response_ Matrix
                 _Response_ Matrix
        1    2    3    4    5    6    7    8    9
  1     1    0    0    0    0    1    0    0    0
  2     1    0    0    0    0    0    1    0    0
  3     1    0    0    0    0    0    0    1    0
  4     1    0    0    0    0    0    0    0    1
  5     1    0    0    0   -1   -1   -1   -1   -1
  6     0    1    0    0    1    0    0    0    0
  7     0    1    0    0    0    0    1    0    0
  8     0    1    0    0    0    0    0    1    0
  9     0    1    0    0    0    0    0    0    1
 10     0    1    0    0   -1   -1   -1   -1   -1
 11     0    0    1    0    1    0    0    0    0
 12     0    0    1    0    0    1    0    0    0
 13     0    0    1    0    0    0    1    0    0
 14     0    0    1    0    0    0    0    0    1
 15     0    0    1    0   -1   -1   -1   -1   -1
 16     0    0    0    1    1    0    0    0    0
 17     0    0    0    1    0    1    0    0    0
 18     0    0    0    1    0    0    1    0    0
 19     0    0    0    1    0    0    0    1    0
 20     0    0    0    1   -1   -1   -1   -1   -1
 21    -1   -1   -1   -1    1    0    0    0    0
 22    -1   -1   -1   -1    0    1    0    0    0
 23    -1   -1   -1   -1    0    0    1    0    0
 24    -1   -1   -1   -1    0    0    0    1    0
 25    -1   -1   -1   -1    0    0    0    0    1

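The pattern of 1, 0, and -1 entries in the _Response_ matrix is what effect (deviation-from-mean) coding of the two main effects would give, with the last level of each variable coded as -1 in every column. The following Python sketch rebuilds the matrix under that reading; the helper effect_code and the level lists are hypothetical names for this illustration, not anything PROC CATMOD exposes.

```python
active_levels = ["r", "s", "u", "v", "w"]        # 't' was removed by WHERE
passive_levels = ["r", "s", "t", "u", "v", "w"]

def effect_code(level, levels):
    """Effect coding: the last level is coded as -1 in every column."""
    k = len(levels) - 1
    i = levels.index(level)
    if i == k:
        return [-1] * k
    return [1 if j == i else 0 for j in range(k)]

# One row per off-diagonal cell, in the order of the response profiles:
# four Active columns followed by five Passive columns.
rows = [effect_code(a, active_levels) + effect_code(p, passive_levels)
        for a in active_levels for p in passive_levels if a != p]

for number, row in enumerate(rows, start=1):
    print(number, row)
```

Columns 1 through 4 then correspond to the Active parameters and columns 5 through 9 to the Passive parameters.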
The iteration history displays the value of the log likelihood and
the convergence criterion for the IPF method as discussed in
the "Computational Formulas" section.
Output 1.1.4: Iteration History
           Maximum Likelihood Analysis
Iteration    -2 Log Likelihood    Convergence Criterion
        0            1201.5105                   1.0000
        1            1198.5669                 0.002450
        2            1198.5604                5.4468E-6
        3            1198.5603                 7.702E-8
        4            1198.5603                1.6932E-9

            The IPF algorithm converged.

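For readers who want to see what iterative proportional fitting does with this table, here is a minimal Python sketch under stated assumptions. It is not the CATMOD implementation. It starts from 1 in every estimable cell and 0 in every structural-zero cell, then alternately rescales rows and columns to match the observed margins. The stopping rule (largest row-margin discrepancy below tol) is an assumption of this sketch and differs from the convergence criterion reported in Output 1.1.4.

```python
# Rows: active r, s, u, v, w; columns: passive r, s, t, u, v, w.
# None marks a structural zero.
observed = [
    [None, 1, 5, 8, 9, 0],
    [29, None, 14, 46, 4, 0],
    [2, 3, 1, None, 38, 2],
    [0, 0, 0, 0, None, 1],
    [9, 25, 4, 6, 13, None],
]

def ipf(table, max_iter=200, tol=1e-10):
    n_rows, n_cols = len(table), len(table[0])
    row_tot = [sum(x for x in row if x is not None) for row in table]
    col_tot = [sum(table[i][j] for i in range(n_rows)
                   if table[i][j] is not None) for j in range(n_cols)]
    # Start at 1 in every estimable cell and 0 in every structural-zero cell.
    fit = [[0.0 if x is None else 1.0 for x in row] for row in table]
    for _ in range(max_iter):
        for i in range(n_rows):            # rescale to match row totals
            s = sum(fit[i])
            fit[i] = [m * row_tot[i] / s for m in fit[i]]
        for j in range(n_cols):            # rescale to match column totals
            s = sum(fit[i][j] for i in range(n_rows))
            for i in range(n_rows):
                fit[i][j] *= col_tot[j] / s
        # Columns now match exactly; stop once the rows also match.
        gap = max(abs(sum(fit[i]) - row_tot[i]) for i in range(n_rows))
        if gap < tol:
            break
    return fit

fitted = ipf(observed)
for row in fitted:
    print([round(m, 4) for m in row])
```

The fitted values can be compared with the predicted frequencies in Output 1.1.8; small differences in the last digits are to be expected because the stopping rules differ.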
The "Response Functions and Design Matrix" table
(Output 1.1.5) is displayed when the PARM option is specified;
this table can be suppressed with the NODESIGN option. The logits
are computed from the IPF fitted values rather than the original
data.
Output 1.1.5: Response Functions, Design Matrix
                     Response Functions and Design Matrix
         Function   Response                  Design Matrix
Sample     Number   Function     1    2    3    4    5    6    7    8    9
     1          1   -0.97354     2    1    1    1    0    1    0    0   -1
                2   -1.72504     2    1    1    1    0    0    1    0   -1
                3   -0.52752     2    1    1    1    0    0    0    1   -1
                4   -0.73927     2    1    1    1    0    0    0    0    0
                5   -3.56052     2    1    1    1   -1   -1   -1   -1   -2
                6    0.32061     1    2    1    1    1    0    0    0   -1
                7   -0.29932     1    2    1    1    0    0    1    0   -1
                8    0.89820     1    2    1    1    0    0    0    1   -1
                9    0.68645     1    2    1    1    0    0    0    0    0
               10   -2.13480     1    2    1    1   -1   -1   -1   -1   -2
               11   -0.24152     1    1    2    1    1    0    0    0   -1
               12   -0.10995     1    1    2    1    0    1    0    0   -1
               13   -0.86145     1    1    2    1    0    0    1    0   -1
               14    0.12432     1    1    2    1    0    0    0    0    0
               15   -2.69693     1    1    2    1   -1   -1   -1   -1   -2
               16   -4.14787     1    1    1    2    1    0    0    0   -1
               17   -4.01631     1    1    1    2    0    1    0    0   -1
               18   -4.76780     1    1    1    2    0    0    1    0   -1
               19   -3.57029     1    1    1    2    0    0    0    1   -1
               20   -6.60328     1    1    1    2   -1   -1   -1   -1   -2
               21   -0.36584     0    0    0    0    1    0    0    0   -1
               22   -0.23427     0    0    0    0    0    1    0    0   -1
               23   -0.98577     0    0    0    0    0    0    1    0   -1
               24    0.21175     0    0    0    0    0    0    0    1   -1

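The response functions in Output 1.1.5 can be reproduced from the predicted frequencies in Output 1.1.8 as generalized logits, that is, the log of each fitted value relative to the fitted value for the last response level (w, v). The Python sketch below does this with the displayed (rounded) predicted values, so agreement is only to about four decimal places; the variable names are illustrative.

```python
import math

# Predicted frequencies from Output 1.1.8, in response-profile order (1..25).
predicted = [
    5.259562, 2.48072, 8.21586, 6.648033, 0.395767,      # active r
    19.18631, 10.32189, 34.18491, 27.66143, 1.646726,    # active s
    10.93611, 12.47391, 5.88343, 15.76689, 0.938627,     # active u
    0.219965, 0.250896, 0.118337, 0.39192, 0.018879,     # active v
    9.657617, 11.01564, 5.195624, 17.20731, 13.92365,    # active w
]

# Generalized logits: log of each fitted value relative to the last response.
reference = predicted[-1]
logits = [math.log(m / reference) for m in predicted[:-1]]

for number, value in enumerate(logits, start=1):
    print(number, round(value, 5))
```

Each row of the design matrix is likewise the corresponding row of the _Response_ matrix minus its last row, which is why entries such as 2 and -2 appear.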
The ANOVA table and the parameter estimates are by-products of
running WLS on the IPF-fitted values. Note that the likelihood
ratio chi-square (the goodness-of-fit statistic G2) in the ANOVA
table is computed by the IPF routine; however, the degrees of
freedom for G2 are calculated through WLS. If the PARM option is not
specified, only the likelihood ratio test is displayed.
Output 1.1.6: ANOVA
      Maximum Likelihood Analysis of Variance
Source              DF    Chi-Square    Pr > ChiSq
Active               4         56.57        <.0001
Passive              5         47.94        <.0001
Likelihood Ratio    15        135.17        <.0001

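The likelihood ratio line of the ANOVA table can be checked by hand. The Python sketch below recomputes G2 = 2 * sum(n * log(n / m)) from the observed and predicted frequencies listed in Output 1.1.8 and obtains the degrees of freedom as the number of response functions minus the number of parameters. Because the predicted values are rounded as displayed, the result agrees with the table only to about two decimal places; the names are illustrative.

```python
import math

# (observed, predicted) pairs from Output 1.1.8 for the 25 response levels.
cells = [
    (1, 5.259562), (5, 2.48072), (8, 8.21586), (9, 6.648033), (0, 0.395767),
    (29, 19.18631), (14, 10.32189), (46, 34.18491), (4, 27.66143), (0, 1.646726),
    (2, 10.93611), (3, 12.47391), (1, 5.88343), (38, 15.76689), (2, 0.938627),
    (0, 0.219965), (0, 0.250896), (0, 0.118337), (0, 0.39192), (1, 0.018879),
    (9, 9.657617), (25, 11.01564), (4, 5.195624), (6, 17.20731), (13, 13.92365),
]

# Cells with an observed count of 0 contribute nothing to the sum.
g2 = 2 * sum(n * math.log(n / m) for n, m in cells if n > 0)

n_functions = len(cells) - 1    # 24 generalized logits
n_parameters = 4 + 5            # Active and Passive main effects
df = n_functions - n_parameters

print(round(g2, 2), df)
```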
Output 1.1.7: Parameter Estimates
           Analysis of Maximum Likelihood Estimates
                                   Standard        Chi-
Effect     Parameter    Estimate      Error      Square    Pr > ChiSq
Active             1     0.00284     0.2660        0.00        0.9915
                   2      1.4286     0.2277       39.35        <.0001
                   3      0.8664     0.2428       12.73        0.0004
                   4     -3.0399     0.8031       14.33        0.0002
Passive            5      0.3334     0.1739        3.67        0.0552
                   6      0.4650     0.1990        5.46        0.0195
                   7     -0.2865     0.2019        2.01        0.1558
                   8      0.9110     0.1615       31.81        <.0001
                   9      0.6992     0.1530       20.88        <.0001

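Two quick consistency checks on the parameter table are sketched below in Python. First, the chi-square column appears consistent with a Wald statistic, (estimate / standard error) squared; this is inferred from the displayed numbers and is treated as an assumption here. Second, multiplying a row of the design matrix in Output 1.1.5 by the estimates should reproduce the corresponding response function, up to rounding of the displayed estimates.

```python
estimates = [0.00284, 1.4286, 0.8664, -3.0399,          # Active 1-4
             0.3334, 0.4650, -0.2865, 0.9110, 0.6992]   # Passive 5-9
std_errors = [0.2660, 0.2277, 0.2428, 0.8031,
              0.1739, 0.1990, 0.2019, 0.1615, 0.1530]

# Assumed Wald statistic for a single parameter: (estimate / std. error)^2.
wald = [(b / se) ** 2 for b, se in zip(estimates, std_errors)]

# Rows 1 and 4 of the design matrix in Output 1.1.5.
design_row_1 = [2, 1, 1, 1, 0, 1, 0, 0, -1]
design_row_4 = [2, 1, 1, 1, 0, 0, 0, 0, 0]
logit_1 = sum(x * b for x, b in zip(design_row_1, estimates))
logit_4 = sum(x * b for x, b in zip(design_row_4, estimates))

print([round(w, 2) for w in wald])
print(round(logit_1, 4), round(logit_4, 4))
```

The two printed logits can be compared with response functions 1 and 4 in Output 1.1.5 (-0.97354 and -0.73927).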
Since the PARM option is specified, the predicted response functions
are computed from the WLS fit (this table is not shown here).
For the IPF method, the "Maximum Likelihood Predicted Values
for Frequencies" table is displayed by default; however, the predicted
standard errors are not computed unless the PARM option is
specified. The predicted standard errors are computed through WLS.
Output 1.1.8: Predicted Frequencies
         Maximum Likelihood Predicted Values for Frequencies
                    --------Observed--------   -------Predicted--------
                                    Standard                   Standard
Active   Passive    Frequency          Error   Frequency          Error    Residual
r        s                  1       0.997725    5.259562       1.361573    -4.25956
r        t                  5       2.210512     2.48072       0.691065     2.51928
r        u                  8       2.776525     8.21586       1.855129    -0.21586
r        v                  9       2.937996    6.648033       1.509317    2.351967
r        w                  0              0    0.395767       0.240267    -0.39577
s        r                 29       5.017696    19.18631       3.147955    9.813693
s        t                 14       3.620648    10.32189        2.16963    3.678112
s        u                 46       6.031734    34.18491       4.428728    11.81509
s        v                  4       1.981735    27.66143       3.722828    -23.6614
s        w                  0              0    1.646726       0.952727    -1.64673
u        r                  2       1.407771    10.93611        2.12318    -8.93611
u        s                  3       1.720201    12.47391       2.554314    -9.47391
u        t                  1       0.997725     5.88343       1.380627    -4.88343
u        v                 38       5.606814    15.76689       2.684647    22.23311
u        w                  2       1.407771    0.938627       0.551631    1.061373
v        r                  0              0    0.219965        0.22182    -0.21997
v        s                  0              0    0.250896       0.253756     -0.2509
v        t                  0              0    0.118337       0.120336    -0.11834
v        u                  0              0     0.39192       0.393325    -0.39192
v        w                  1       0.997725    0.018879       0.021731    0.981121
w        r                  9       2.937996    9.657617       1.808652    -0.65762
w        s                 25       4.707344    11.01564       2.275041    13.98436
w        t                  4       1.981735    5.195624        1.18445    -1.19562
w        u                  6       2.415857    17.20731       2.772074    -11.2073
w        v                 13       3.497402    13.92365       2.241575    -0.92365

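The standard errors in the Observed columns of Output 1.1.8 appear to follow the usual multinomial formula sqrt(n * (1 - n/N)), with N = 220. This is inferred from the displayed values and is not stated in the text, so treat the Python sketch below as an assumption-labeled check and not as a description of the procedure's internals; the function name observed_se is made up for this illustration.

```python
import math

TOTAL = 220  # total frequency from Output 1.1.1

def observed_se(count, total=TOTAL):
    """Assumed multinomial standard error of an observed cell frequency."""
    return math.sqrt(count * (1 - count / total))

for count in (0, 1, 5, 29, 46):
    print(count, round(observed_se(count), 6))
```

The residual column is simply the observed frequency minus the predicted frequency.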
The quasi-independence model does not fit: the likelihood ratio
test, which measures the lack of fit due to the omitted
Active*Passive interaction, is significant (G2 = 135.17 with 15
degrees of freedom, p < .0001). In other words, the active and
passive roles of the squirrel monkeys are not independent.
Results from using the ML=NR option instead of the ML=IPF option are
very similar, since these are just two different algorithms for
maximum likelihood estimation. Due to the sampling zeros in the
table, use of the WLS method is not recommended.
Copyright © 2000 by SAS Institute Inc., Cary, NC, USA. All rights reserved.