Example 26.21 Testing Equality of Covariance and Mean Matrices between Independent Groups
To make the specification of some standard MSTRUCT models for covariance and mean patterns more efficient, PROC CALIS defines these standard models internally. You can use two options to invoke these built-in covariance and mean patterns easily. For example, with the COVPATTERN= option, you can define the compound symmetry (COMPSYM) pattern for the covariance matrix or the equality of covariance matrices between groups (EQCOVMAT). With the MEANPATTERN= option, you can define uniform means (UNIFORM) for the mean vector or the equality of mean vectors between groups (EQMEANVEC). See the COVPATTERN= and the MEANPATTERN= options for details about the supported built-in covariance and mean patterns.
In Example 26.20, you test the equality of covariance matrices between two groups. This example extends the application to testing the equality of both covariance matrices and mean vectors among three independent groups by using the COVPATTERN= and MEANPATTERN= options together. It then explores the "best" fitting model for the data. The following DATA steps define the correlations, means, standard deviations, and sample sizes for the three independent groups:
data g1(type=corr);
   input _type_ $ 1-8 _name_ $ 9-11 x1-x9;
   datalines;
corr    x1  1. . . . . . . . .
corr    x2  .721 1. . . . . . . .
corr    x3  .676 .379 1. . . . . . .
corr    x4  .149 .403 .450 1. . . . . .
corr    x5  .422 .384 .445 .411 1. . . . .
corr    x6  .343 .456 .243 .308 .531 1. . . .
corr    x7  .115 .225 .201 .481 .373 .198 1. . .
corr    x8  .213 .237 .434 .503 .267 .333 .355 1. .
corr    x9  .236 .257 .159 .246 .126 .235 .601 .512 1.
mean    .   21.3 22.3 17.2 23.4 22.1 15.6 18.7 20.1 19.7
std     .   1.2 1.4 .87 1.33 2.2 1.4 2.3 2.1 1.8
n       .   21 21 21 21 21 21 21 21 21
;

data g2(type=corr);
   input _type_ $ 1-8 _name_ $ 9-11 x1-x9;
   datalines;
corr    x1  1. . . . . . . . .
corr    x2  .733 1. . . . . . . .
corr    x3  .576 .388 1. . . . . . .
corr    x4  .209 .414 .425 1. . . . . .
corr    x5  .412 .286 .461 .398 1. . . . .
corr    x6  .323 .399 .212 .302 .522 1. . . .
corr    x7  .215 .295 .188 .467 .334 .232 1. . .
corr    x8  .204 .257 .462 .522 .298 .355 .372 1. .
corr    x9  .245 .272 .177 .301 .156 .246 .578 .422 1.
mean    .   22.1 19.8 16.9 23.3 21.9 17.3 17.9 19.1 19.8
std     .   1.3 1.3 .99 1.25 2.1 1.3 2.2 2.0 1.5
n       .   22 22 22 22 22 22 22 22 22
;

data g3(type=corr);
   input _type_ $ 1-8 _name_ $ 9-11 x1-x9;
   datalines;
corr    x1  1. . . . . . . . .
corr    x2  .699 1. . . . . . . .
corr    x3  .488 .328 1. . . . . . .
corr    x4  .235 .398 .413 1. . . . . .
corr    x5  .377 .265 .471 .376 1. . . . .
corr    x6  .335 .412 .265 .314 .503 1. . . .
corr    x7  .243 .216 .192 .423 .369 .212 1. . .
corr    x8  .217 .292 .423 .525 .219 .317 .376 1. .
corr    x9  .211 .283 .152 .285 .147 .135 .633 .579 1.
mean    .   22.2 20.9 15.4 25.1 22.6 16.3 19.3 20.2 19.5
std     .   1.5 1.0 1.04 1.5 1.9 1.6 2.4 2.2 1.6
n       .   20 20 20 20 20 20 20 20 20
;
Each of these data sets contains the information about the correlations, means, standard deviations, and sample sizes. Even though these data sets contain correlations, by default PROC CALIS analyzes the covariances and means.
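Because the data sets store correlations and standard deviations rather than covariances, each analyzed covariance has to be rebuilt as cov(xi, xj) = r_ij * s_i * s_j. The following Python sketch is an illustration outside SAS, not part of the example's program; the dictionary and function names are hypothetical. It applies that identity to a few Group 1 values copied from data set g1.

```python
# Illustration only: rebuild selected Group 1 covariances from the
# correlations and standard deviations listed in data set g1.
std = {"x1": 1.2, "x2": 1.4, "x3": 0.87}
corr = {("x2", "x1"): 0.721, ("x3", "x1"): 0.676, ("x3", "x2"): 0.379}

def covariance(a, b):
    """Return cov(a, b) = r_ab * s_a * s_b; this is a variance when a == b."""
    if a == b:
        r = 1.0
    else:
        # Only the lower triangle is stored, so accept either key order.
        r = corr[(a, b)] if (a, b) in corr else corr[(b, a)]
    return r * std[a] * std[b]

var_x1 = covariance("x1", "x1")      # variance of x1
cov_x2_x1 = covariance("x2", "x1")   # covariance of x2 and x1
cov_x1_x3 = covariance("x1", "x3")   # same element as (x3, x1)
print(round(var_x1, 5), round(cov_x2_x1, 5), round(cov_x1_x3, 5))
```

The same identity applies to every element of the three 9 x 9 matrices; only three elements are shown to keep the sketch short.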
The first hypothesis to test is the equality of covariance matrices and mean vectors:
$$H_1: \Sigma_1 = \Sigma_2 = \Sigma_3, \quad \mu_1 = \mu_2 = \mu_3$$
where $\Sigma_1$, $\Sigma_2$, and $\Sigma_3$ are the population covariance matrices for the three independent groups, respectively, and $\mu_1$, $\mu_2$, and $\mu_3$ are the population mean vectors for the three independent groups, respectively.
The following statements specify this test:
proc calis covpattern=eqcovmat meanpattern=eqmeanvec;
   var x1-x9;
   group 1 / data=g1;
   group 2 / data=g2;
   group 3 / data=g3;
   fitindex NoIndexType On(only)=[chisq df probchi rmsea aic caic sbc];
run;
In the PROC CALIS statement, the COVPATTERN=EQCOVMAT option specifies the same covariance matrix for the three groups and the MEANPATTERN=EQMEANVEC option specifies the same mean vector for the three groups. The VAR statement specifies that x1–x9 are the variables in the hypothesis test. Next, the GROUP statements specify the data sets for the three independent groups. You use the FITINDEX statement to limit the output fit statistics to the quantities specified: the chi-square test statistic (CHISQ), the degrees of freedom (DF), the significance value of the test statistic (PROBCHI), the root mean square error of approximation (RMSEA), Akaike's information criterion (AIC), the consistent Akaike's information criterion (CAIC), and Schwarz's Bayesian criterion (SBC). The first three quantities are useful for the chi-square model fit test, while the rest of the fit indices are useful for comparing competing models for the data. Because there are not many quantities in this customized fit summary table, the NOINDEXTYPE option is used to suppress the printing of the fit index types.
Output 26.21.1 shows the general modeling information, including the sample sizes, the models for the groups, the model types, and the analysis types.
Output 26.21.1
Modeling Information for Testing Equality of Covariance and Mean Matrices
Data Set | N Obs | Model | Type | Analysis
WORK.G1 | 21 | Model 1 | MSTRUCT | Means and Covariances
WORK.G2 | 22 | Model 2 | MSTRUCT | Means and Covariances
WORK.G3 | 20 | Model 3 | MSTRUCT | Means and Covariances
Output 26.21.2 shows the initial mean vector and the initial covariance matrix specifications for Model 1, which fits to Group 1. PROC CALIS generates the mean parameter names _mean_1, _mean_2, ..., and _mean_9 for the nine elements in the mean vector. It also generates the covariance parameter names _cov_1_1, _cov_2_1, ..., and _cov_9_9 for the nonredundant elements in the covariance matrix.
Output 26.21.2
Initial Mean Vector and Covariance Matrix for Model 1
Parameter | Initial Estimate
_mean_1 | .
_mean_2 | .
_mean_3 | .
_mean_4 | .
_mean_5 | .
_mean_6 | .
_mean_7 | .
_mean_8 | .
_mean_9 | .
Although not shown here, the initial mean vector and covariance matrices for Models 2 and 3 are exactly the same as those shown in Output 26.21.2, as required by the equality of covariance and mean matrices in the null hypothesis $H_1$.
Output 26.21.3 shows the customized fit summary table. The chi-square test statistic is 203.2605 with 108 degrees of freedom, and the p-value is less than .0001. Therefore, the hypothesis of equality in covariance and mean matrices is rejected for the three independent groups. The RMSEA index is 0.2100, which is much greater than the conventional 0.05 cutoff and does not indicate a good model fit. Other fit indices such as AIC, CAIC, and SBC are not interpreted for the fit of the model itself, but are useful for comparing competing models in the later discussion.
Output 26.21.3
Fit Summary for Testing $H_1$: Equality of Covariance and Mean Matrices
Fit Index | Value
Chi-Square | 203.2605
Chi-Square DF | 108
Pr > Chi-Square | <.0001
RMSEA | 0.2100
AIC | 311.2605
CAIC | 480.9897
SBC | 426.9897
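The degrees of freedom and the information criteria in Output 26.21.3 can be checked by hand. The following Python sketch is an illustration, not SAS code. It assumes that the degrees of freedom equal the number of distinct sample moments minus the number of free parameters, and that AIC = chi-square + 2q, SBC = chi-square + q ln(N), and CAIC = chi-square + q (ln(N) + 1), where q is the number of free parameters and N is the total sample size. These forms are assumptions made here; they are not quoted from this example, although they agree with the tabled values up to rounding.

```python
import math

def moment_count(n_vars, n_groups):
    """Means plus nonredundant covariance elements, summed over groups."""
    return n_groups * (n_vars + n_vars * (n_vars + 1) // 2)

def fit_indices(chisq, n_params, n_total):
    """Return (AIC, CAIC, SBC) under the forms assumed in the lead-in."""
    aic = chisq + 2 * n_params
    sbc = chisq + n_params * math.log(n_total)
    caic = chisq + n_params * (math.log(n_total) + 1)
    return aic, caic, sbc

n_total = 21 + 22 + 20            # sample sizes of g1, g2, and g3
q_h1 = 9 + 45                     # one shared mean vector and covariance matrix
df_h1 = moment_count(9, 3) - q_h1
aic_h1, caic_h1, sbc_h1 = fit_indices(203.2605, q_h1, n_total)
print(df_h1, round(aic_h1, 4), round(caic_h1, 4), round(sbc_h1, 4))
```

Because the input chi-square value is itself rounded to four decimals, the last printed digit can differ from the table by one unit.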
A less restrictive hypothesis is now considered. This hypothesis states the equality of covariance matrices only:
$$H_2: \Sigma_1 = \Sigma_2 = \Sigma_3$$
$H_2$ differs from $H_1$ in that the population means in $H_2$ are not constrained. To test this hypothesis, you need to change the MEANPATTERN= option to use the SATURATED keyword, as shown in the following statements:
proc calis covpattern=eqcovmat meanpattern=saturated;
   var x1-x9;
   group 1 / data=g1;
   group 2 / data=g2;
   group 3 / data=g3;
   fitindex NoIndexType On(only)=[chisq df probchi rmsea aic caic sbc];
run;
Output 26.21.4 shows the results of testing $H_2$.
Output 26.21.4
Fit Summary for Testing $H_2$: Equality of Covariance Matrices but Unconstrained Means
Fit Index | Value
Chi-Square | 26.7897
Chi-Square DF | 90
Pr > Chi-Square | 1.0000
RMSEA | 0.0000
AIC | 170.7897
CAIC | 397.0954
SBC | 325.0954
The chi-square test statistic is 26.7897 (df = 90, p = 1.0000). You cannot reject this null hypothesis about the equality of the population covariance matrices. The RMSEA value is virtually zero, which indicates a perfect fit. Comparing the models under $H_1$ and $H_2$, it is clear that the three groups are significantly different with regard to their mean vectors. By relaxing all the equality constraints on the means in $H_1$, $H_2$ is derived and is supported by the chi-square test. In addition, the RMSEA value for the model under $H_2$ is perfect. Because lower AIC, CAIC, and SBC values indicate better model fit (with the model complexity taken into account), these indices in Output 26.21.3 and Output 26.21.4 support the conclusion that the model under $H_2$ is better than the model under $H_1$.
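One way to make the comparison of the two models concrete is a chi-square difference test for nested models. That test is a standard technique brought in here for illustration; this example does not itself report it. The Python sketch below subtracts the two chi-square values and their degrees of freedom as they appear in Output 26.21.3 and Output 26.21.4, and evaluates upper-tail probabilities with a closed form that is exact only for even degrees of freedom. The function name is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum (x/2)^k / k! for k = 0 .. df/2 - 1, building each term recursively.
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Fit values from the two fit summary tables (equal means vs. free means).
chisq_equal_means, df_equal_means = 203.2605, 108
chisq_free_means, df_free_means = 26.7897, 90

diff_chisq = chisq_equal_means - chisq_free_means
diff_df = df_equal_means - df_free_means
p_diff = chi2_sf_even_df(diff_chisq, diff_df)
print(round(diff_chisq, 4), diff_df, p_diff < 0.0001)
```

Under these assumptions the difference of about 176.47 on 18 degrees of freedom has a p-value far below .0001, which matches the conclusion that the mean vectors differ across the groups.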
However, in getting a superior model fit, $H_2$ might have relaxed more constraints than absolutely necessary for an optimal fit. That is, it might be possible to impose equality constraints on only some (but not all, as in $H_1$) of the means to reach the same or even better model fit (by the RMSEA, AIC, CAIC, or SBC criterion) than the model under $H_2$. But how can you determine this set of constrained means?
To answer this question, you conduct an exploratory analysis of the data by using some model modification techniques. Models established from exploratory analysis should be validated by external data in the future. However, this example demonstrates the exploratory part only.
Beginning with the model under $H_1$, you can manually take away some particular constraints on the means and explore whether the revised model improves the fit. If the revised model fits better, you can repeat the process until no further improvement is possible. Ultimately, you might be able to find the "best" model between the models specified under $H_1$ and $H_2$. Such an exploratory analysis is laborious, considering the vast number of possible constraints on the nine variable means in three independent groups that you could attempt to release. Fortunately, PROC CALIS provides some model modification statistics, called the LM (Lagrange multiplier) statistics, to assist this kind of exploratory analysis.
The following statements specify the model under $H_1$, but now with the MODIFICATION option added to the PROC CALIS statement:
proc calis covpattern=eqcovmat meanpattern=eqmeanvec modification;
   var x1-x9;
   group 1 / data=g1;
   group 2 / data=g2;
   group 3 / data=g3;
   fitindex NoIndexType On(only)=[chisq df probchi rmsea aic caic sbc];
run;
The MODIFICATION option requests the so-called LM (Lagrange multiplier) statistics for releasing the parameter constraints. These constraints include the cross-group or within-group constraints and the fixed values in the model. For the model under $H_1$, the covariances and the means are all constrained across groups. These are the equality constraints that you would like to release to obtain a better model fit. Output 26.21.5 shows the results of the LM statistics for releasing these equality constraints in variances, covariances, and means.
Output 26.21.5
Lagrange Multiplier Statistics for Releasing the Equality Constraints
Variance and covariance constraints (three consecutive rows per parameter, one each for Models 1, 2, and 3; the first parameter is _cov_1_1)
Model | LM Stat | Pr > ChiSq | Changes: Original Parm | Changes: Released Parm
1 | 0.01137 | 0.9151 | 0.0178 | -0.0355
2 | 1.00150 | 0.3169 | 0.1729 | -0.3212
3 | 1.28632 | 0.2567 | -0.1818 | 0.3923
1 | 2.19353 | 0.1386 | 0.2038 | -0.4076
2 | 0.77014 | 0.3802 | -0.1253 | 0.2327
3 | 0.36128 | 0.5478 | -0.0796 | 0.1718
1 | 3.12065 | 0.0773 | -0.4344 | 0.8687
2 | 0.05704 | 0.8112 | -0.0609 | 0.1132
3 | 4.14151 | 0.0418 | 0.4817 | -1.0395
1 | 0.00672 | 0.9347 | 0.00888 | -0.0178
2 | 2.23758 | 0.1347 | -0.1681 | 0.3122
3 | 2.10455 | 0.1469 | 0.1512 | -0.3264
1 | 2.18538 | 0.1393 | -0.1940 | 0.3881
2 | 3.14532 | 0.0761 | 0.2416 | -0.4487
3 | 0.10264 | 0.7487 | -0.0405 | 0.0874
1 | 1.56813 | 0.2105 | 0.1815 | -0.3630
2 | 0.66118 | 0.4161 | 0.1223 | -0.2272
3 | 4.42160 | 0.0355 | -0.2934 | 0.6332
1 | 0.31691 | 0.5735 | -0.0667 | 0.1333
2 | 0.32615 | 0.5679 | 0.0702 | -0.1304
3 | 0.0002277 | 0.9880 | -0.00172 | 0.00371
1 | 0.73377 | 0.3917 | 0.1242 | -0.2484
2 | 0.53196 | 0.4658 | -0.1097 | 0.2038
3 | 0.01445 | 0.9043 | -0.0168 | 0.0362
1 | 0.0000258 | 0.9959 | 0.000547 | -0.00109
2 | 0.24892 | 0.6178 | -0.0558 | 0.1036
3 | 0.25646 | 0.6126 | 0.0525 | -0.1134
1 | 0.04412 | 0.8336 | 0.0361 | -0.0722
2 | 0.52198 | 0.4700 | 0.1288 | -0.2392
3 | 0.90948 | 0.3403 | -0.1577 | 0.3403
1 | 0.0008607 | 0.9766 | -0.00477 | 0.00953
2 | 0.01238 | 0.9114 | 0.0188 | -0.0348
3 | 0.00712 | 0.9328 | -0.0132 | 0.0285
1 | 0.10637 | 0.7443 | -0.0649 | 0.1297
2 | 0.00631 | 0.9367 | -0.0164 | 0.0304
3 | 0.16971 | 0.6804 | 0.0789 | -0.1702
1 | 0.06645 | 0.7966 | -0.0385 | 0.0771
2 | 0.0008275 | 0.9771 | 0.00446 | -0.00829
3 | 0.05370 | 0.8167 | 0.0334 | -0.0720
1 | 0.24212 | 0.6227 | 0.0809 | -0.1617
2 | 0.04459 | 0.8328 | -0.0360 | 0.0669
3 | 0.07959 | 0.7779 | -0.0446 | 0.0963
1 | 0.01778 | 0.8939 | -0.0431 | 0.0862
2 | 0.08223 | 0.7743 | -0.0962 | 0.1787
3 | 0.18417 | 0.6678 | 0.1336 | -0.2883
1 | 0.29558 | 0.5867 | -0.0721 | 0.1442
2 | 0.26589 | 0.6061 | -0.0710 | 0.1318
3 | 1.16570 | 0.2803 | 0.1378 | -0.2974
1 | 0.00228 | 0.9619 | -0.00780 | 0.0156
2 | 1.00319 | 0.3165 | 0.1697 | -0.3152
3 | 0.95767 | 0.3278 | -0.1538 | 0.3320
1 | 1.39116 | 0.2382 | 0.1513 | -0.3027
2 | 0.08741 | 0.7675 | -0.0394 | 0.0731
3 | 0.79586 | 0.3723 | -0.1102 | 0.2378
1 | 0.46031 | 0.4975 | -0.0947 | 0.1894
2 | 0.04254 | 0.8366 | 0.0299 | -0.0555
3 | 0.22665 | 0.6340 | 0.0640 | -0.1381
1 | 0.14991 | 0.6986 | -0.0700 | 0.1399
2 | 0.04723 | 0.8280 | 0.0408 | -0.0757
3 | 0.02874 | 0.8654 | 0.0295 | -0.0636
1 | 0.22550 | 0.6349 | 0.1079 | -0.2158
2 | 0.04390 | 0.8340 | 0.0494 | -0.0918
3 | 0.48451 | 0.4864 | -0.1523 | 0.3286
1 | 0.50774 | 0.4761 | 0.1203 | -0.2406
2 | 0.01246 | 0.9111 | -0.0196 | 0.0363
3 | 0.36926 | 0.5434 | -0.0988 | 0.2131
1 | 0.01235 | 0.9115 | 0.0228 | -0.0455
2 | 0.16400 | 0.6855 | -0.0861 | 0.1598
3 | 0.09159 | 0.7622 | 0.0597 | -0.1288
1 | 0.16844 | 0.6815 | -0.0644 | 0.1288
2 | 0.15095 | 0.6976 | 0.0633 | -0.1175
3 | 0.0003079 | 0.9860 | 0.00265 | -0.00572
1 | 0.22542 | 0.6349 | -0.0776 | 0.1551
2 | 0.00754 | 0.9308 | 0.0147 | -0.0273
3 | 0.15376 | 0.6950 | 0.0617 | -0.1331
1 | 0.07831 | 0.7796 | -0.0631 | 0.1262
2 | 0.07552 | 0.7835 | 0.0643 | -0.1195
3 | 3.293E-6 | 0.9986 | 0.000394 | -0.00085
1 | 0.13810 | 0.7102 | 0.0726 | -0.1452
2 | 0.0001086 | 0.9917 | 0.00211 | -0.00392
3 | 0.14999 | 0.6985 | -0.0729 | 0.1572
1 | 0.09334 | 0.7600 | 0.1051 | -0.2101
2 | 0.00128 | 0.9714 | 0.0128 | -0.0237
3 | 0.11994 | 0.7291 | -0.1147 | 0.2474
1 | 0.04800 | 0.8266 | 0.0353 | -0.0706
2 | 0.19725 | 0.6569 | 0.0743 | -0.1379
3 | 0.45888 | 0.4981 | -0.1051 | 0.2268
1 | 0.13689 | 0.7114 | 0.0727 | -0.1453
2 | 0.31671 | 0.5736 | -0.1147 | 0.2130
3 | 0.04084 | 0.8398 | 0.0382 | -0.0825
1 | 0.37615 | 0.5397 | -0.0904 | 0.1808
2 | 0.00452 | 0.9464 | -0.0103 | 0.0191
3 | 0.47678 | 0.4899 | 0.0980 | -0.2114
1 | 0.00989 | 0.9208 | 0.0150 | -0.0300
2 | 0.01001 | 0.9203 | 0.0157 | -0.0291
3 | 0.04138 | 0.8388 | -0.0296 | 0.0638
1 | 0.01378 | 0.9066 | -0.0267 | 0.0533
2 | 0.03154 | 0.8590 | -0.0419 | 0.0778
3 | 0.09063 | 0.7634 | 0.0659 | -0.1421
1 | 0.0007193 | 0.9786 | 0.00510 | -0.0102
2 | 0.01293 | 0.9095 | 0.0224 | -0.0417
3 | 0.02067 | 0.8857 | -0.0263 | 0.0568
1 | 0.16543 | 0.6842 | 0.0952 | -0.1904
2 | 0.29902 | 0.5845 | -0.1328 | 0.2467
3 | 0.02206 | 0.8819 | 0.0335 | -0.0722
1 | 0.00581 | 0.9392 | 0.0244 | -0.0487
2 | 0.00694 | 0.9336 | -0.0276 | 0.0513
3 | 0.0000660 | 0.9935 | 0.00250 | -0.00539
1 | 0.19272 | 0.6607 | -0.0532 | 0.1063
2 | 0.01910 | 0.8901 | -0.0174 | 0.0323
3 | 0.34408 | 0.5575 | 0.0684 | -0.1476
1 | 0.09017 | 0.7640 | -0.0446 | 0.0892
2 | 0.26496 | 0.6067 | 0.0794 | -0.1474
3 | 0.04994 | 0.8232 | -0.0320 | 0.0690
1 | 0.44236 | 0.5060 | 0.0758 | -0.1516
2 | 0.12761 | 0.7209 | -0.0422 | 0.0784
3 | 0.09470 | 0.7583 | -0.0338 | 0.0728
1 | 0.04619 | 0.8298 | 0.0260 | -0.0520
2 | 0.22996 | 0.6316 | -0.0602 | 0.1117
3 | 0.07502 | 0.7842 | 0.0319 | -0.0688
1 | 0.02807 | 0.8669 | 0.0279 | -0.0557
2 | 0.0006585 | 0.9795 | -0.00443 | 0.00823
3 | 0.02058 | 0.8859 | -0.0230 | 0.0496
1 | 0.03989 | 0.8417 | -0.0282 | 0.0563
2 | 0.15069 | 0.6979 | -0.0568 | 0.1055
3 | 0.36051 | 0.5482 | 0.0815 | -0.1759
1 | 0.03398 | 0.8537 | -0.0284 | 0.0567
2 | 0.05802 | 0.8097 | 0.0385 | -0.0714
3 | 0.00362 | 0.9520 | -0.00891 | 0.0192
1 | 0.06050 | 0.8057 | -0.0391 | 0.0781
2 | 0.56151 | 0.4537 | 0.1235 | -0.2294
3 | 0.26945 | 0.6037 | -0.0794 | 0.1713
1 | 0.13296 | 0.7154 | -0.0655 | 0.1310
2 | 0.00130 | 0.9712 | -0.00673 | 0.0125
3 | 0.16526 | 0.6844 | 0.0703 | -0.1517
Mean constraints
Parm | Model | LM Stat | Pr > ChiSq | Changes: Original Parm | Changes: Released Parm
_mean_1 | 1 | 11.09173 | 0.0009 | 0.3453 | -0.6906
_mean_1 | 2 | 1.21196 | 0.2709 | -0.1184 | 0.2200
_mean_1 | 3 | 5.04550 | 0.0247 | -0.2242 | 0.4838
_mean_2 | 1 | 21.46921 | <.0001 | -0.5837 | 1.1675
_mean_2 | 2 | 15.27776 | <.0001 | 0.5110 | -0.9490
_mean_2 | 3 | 0.47301 | 0.4916 | 0.0834 | -0.1800
_mean_3 | 1 | 4.41967 | 0.0355 | -0.2034 | 0.4067
_mean_3 | 2 | 6.37770 | 0.0116 | -0.2535 | 0.4708
_mean_3 | 3 | 22.27732 | <.0001 | 0.4395 | -0.9485
_mean_4 | 1 | 3.26860 | 0.0706 | 0.1904 | -0.3807
_mean_4 | 2 | 0.03260 | 0.8567 | 0.0197 | -0.0366
_mean_4 | 3 | 4.06935 | 0.0437 | -0.2045 | 0.4413
_mean_5 | 1 | 0.22210 | 0.6374 | -0.0681 | 0.1362
_mean_5 | 2 | 1.50172 | 0.2204 | 0.1837 | -0.3412
_mean_5 | 3 | 0.60673 | 0.4360 | -0.1083 | 0.2338
_mean_6 | 1 | 1.61486 | 0.2038 | 0.1539 | -0.3078
_mean_6 | 2 | 6.72912 | 0.0095 | -0.3260 | 0.6055
_mean_6 | 3 | 1.88248 | 0.1701 | 0.1600 | -0.3452
_mean_7 | 1 | 0.14035 | 0.7079 | -0.0558 | 0.1116
_mean_7 | 2 | 0.11034 | 0.7398 | 0.0514 | -0.0954
_mean_7 | 3 | 0.00153 | 0.9688 | 0.00560 | -0.0121
_mean_8 | 1 | 0.12603 | 0.7226 | -0.0510 | 0.1019
_mean_8 | 2 | 1.96607 | 0.1609 | 0.2089 | -0.3880
_mean_8 | 3 | 1.16200 | 0.2811 | -0.1490 | 0.3215
_mean_9 | 1 | 0.05301 | 0.8179 | 0.0248 | -0.0496
_mean_9 | 2 | 0.97965 | 0.3223 | -0.1106 | 0.2054
_mean_9 | 3 | 0.61083 | 0.4345 | 0.0810 | -0.1748
To use the results of this table, you look for parameters that have large LM statistics (in the LM Stat column). Equivalently, you can look for parameters that have small p-values (in the Pr > ChiSq column). Loosely speaking, an LM statistic estimates the reduction in the model fit chi-square statistic if you release the constraint on the corresponding parameter. The p-value indicates whether the improvement would be significant. Therefore, releasing the constraints on parameters that have a large LM statistic and a small p-value is the key to model improvement. Bear in mind that the LM statistics are linear approximations and might not be very accurate estimates of the actual model improvement, which can be assessed only when you refit the model with the particular constraint released. Nonetheless, the LM statistics can still be very useful because they show which constraints could potentially improve the model the most.
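The link between the LM Stat and Pr > ChiSq columns can be illustrated with a short calculation. The Python sketch below assumes that each LM statistic for releasing a single constraint is referred to a chi-square distribution with 1 degree of freedom, for which the upper-tail probability is erfc(sqrt(LM / 2)). That reference distribution is an assumption made here, not a statement from this example, although it reproduces the tabled p-values for the rows checked. The function name is hypothetical.

```python
import math

def lm_p_value(lm_stat):
    """Upper-tail probability of a 1-df chi-square variate at lm_stat."""
    return math.erfc(math.sqrt(lm_stat / 2.0))

# LM statistics copied from three rows of Output 26.21.5.
for lm in (0.01137, 4.14151, 11.09173):
    print(lm, round(lm_p_value(lm), 4))
```

A large LM statistic therefore always pairs with a small p-value, which is why the two columns can be scanned interchangeably.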
Output 26.21.5 shows the results from releasing the constraints on the variances and covariances first. Each constrained element of the covariance matrix has three rows, respectively, for the three models (or groups). For example, the first parameter is _cov_1_1, which is the same variance parameter for x1 in the three models. The first row shows that if you release the variance of x1 in Model 1 from the constraint (while keeping the variances of x1 constrained between Models 2 and 3), the LM statistic is 0.01137, and the corresponding p-value is 0.9151. This means that the model fit improvement would be very small, and so you do not expect a significant model fit improvement by releasing this constraint. The columns entitled "Changes" show the estimated parameter changes in the original parameters (that is, _cov_1_1 for Models 2 and 3) and in the released parameter (that is, the new parameter for the variance of x1 in Model 1) if you release the corresponding equality constraint. These two "Changes" columns are not very useful for the present purpose.
Looking through the results for the variance and covariance constraints, you can see that almost all the associated p-values are large (that is, as compared with the conventional 0.05 level for significance). Therefore, releasing these constraints on variances and covariances would not be expected to improve the model fit significantly. In contrast, the results for the constraints on the means show that several of them could be released for a sizable model fit improvement. The largest LM statistic in the table is the one for _mean_3 in Model 3. The LM statistic is 22.27732 and its corresponding p-value is less than .0001. This means that if the mean of x3 in Model 3 were not constrained to equal the means of x3 in Models 1 and 2, you would expect a reduction in the model fit chi-square statistic that is estimated at 22.27732. Other notable LM statistics are those for _mean_1 in Model 1, _mean_2 in Model 1 or 2, and _mean_6 in Model 2.
Two important points are noted about the use of the LM statistics. First, the LM statistics are not additive. You cannot expect that the total reduction in model fit chi-square for releasing a particular set of parameter constraints is the sum of the corresponding LM statistics. Second, once you release a particular constraint and refit the model, the LM statistics in the revised model might not follow the same pattern as those LM statistics in the original model. Basically, these are due to the nonlinearity of the fit function and the dependence of the parameter estimates. Therefore, in order to find the best model for the data, it would be more sensible to adopt a one-at-a-time approach to release the constraints. That is, you release one constraint at a time and refit the model to see if you can release more constraints to improve the model fit.
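The selection step in this one-at-a-time approach amounts to finding the constrained parameter with the largest LM statistic. The Python sketch below illustrates that step only; the refitting itself is done in PROC CALIS. The LM statistics are copied from the mean rows of Output 26.21.5, and the dictionary layout and names are hypothetical.

```python
# LM statistics for the mean constraints, one list per parameter with
# entries for Models 1, 2, and 3 (values copied from Output 26.21.5).
lm_means = {
    "_mean_1": [11.09173, 1.21196, 5.04550],
    "_mean_2": [21.46921, 15.27776, 0.47301],
    "_mean_3": [4.41967, 6.37770, 22.27732],
    "_mean_4": [3.26860, 0.03260, 4.06935],
    "_mean_5": [0.22210, 1.50172, 0.60673],
    "_mean_6": [1.61486, 6.72912, 1.88248],
    "_mean_7": [0.14035, 0.11034, 0.00153],
    "_mean_8": [0.12603, 1.96607, 1.16200],
    "_mean_9": [0.05301, 0.97965, 0.61083],
}

def next_release(lm_table):
    """Return ((parameter, model), LM) for the largest LM statistic."""
    candidates = {
        (parm, model): lm
        for parm, stats in lm_table.items()
        for model, lm in enumerate(stats, start=1)
    }
    best = max(candidates, key=candidates.get)
    return best, candidates[best]

best, best_lm = next_release(lm_means)
print(best, best_lm)
```

After the chosen constraint is released and the model is refit, the LM table changes, so the selection has to be repeated on the new table rather than reading off the second-largest value here.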
According to the results of LM statistics in Output 26.21.5, you first release the constraint on the _mean_3 parameter, which is for the mean of x3 in Model 3. The following statements fit such a model:
proc calis modification;
   var x1-x9;
   group 1 / data=g1;
   group 2 / data=g2;
   group 3 / data=g3;
   model 1 / group = 1;
      mstruct;
      matrix _cov_ = cov01-cov45;
      matrix _mean_ = mean1-mean9;
   model 2 / group = 2;
      refmodel 1;
   model 3 / group = 3;
      refmodel 1;
         renameparm mean3=mean3_mdl3;
   fitindex NoIndexType On(only)=[chisq df probchi rmsea aic caic sbc];
run;
Because the revised model is no longer a supported built-in MSTRUCT model, you cannot use the MEANPATTERN= or the COVPATTERN= options any more. Instead, you now use the MSTRUCT modeling language to specify the covariance and mean patterns. Model 1, which fits to Group 1, is an MSTRUCT model with variance and covariance parameters cov01–cov45 and mean parameters mean1–mean9. Model 2, which fits to Group 2, refers to the specifications of Model 1, as indicated in a REFMODEL statement. Hence, Model 1 and Model 2 are completely constrained in variances, covariances, and means. Model 3, which fits to Group 3, also refers to the specifications of Model 1, as indicated in another REFMODEL statement. However, the RENAMEPARM statement renames the parameter mean3 in the reference model (that is, Model 1) to a new name mean3_mdl3. As a result, all variance, covariance, and mean parameters except one in Model 3 are constrained to be the same as those in Model 1. The mean of x3 in Model 3 is the only parameter that is not constrained with any other parameters. This forms the first revised model from $H_1$. The MODIFICATION option is specified again to determine whether a further model fit improvement is possible.
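The effect of REFMODEL and RENAMEPARM on the constraints can be pictured as sharing parameters by name: two models are constrained on an element exactly when they use the same parameter name for it. The Python sketch below mimics that bookkeeping for the first revised model and counts the resulting degrees of freedom. It is a conceptual illustration, not a description of how PROC CALIS is implemented, and the helper name is hypothetical.

```python
def reference_model(base, renames=None):
    """Copy a model's parameter names, applying optional renames."""
    renames = renames or {}
    return [renames.get(name, name) for name in base]

# Model 1: 45 covariance parameters and 9 mean parameters.
model1 = [f"cov{i:02d}" for i in range(1, 46)] + [f"mean{i}" for i in range(1, 10)]
model2 = reference_model(model1)                             # all names shared
model3 = reference_model(model1, {"mean3": "mean3_mdl3"})    # one name freed

free_params = set(model1) | set(model2) | set(model3)
n_moments = 3 * (9 + 45)          # means and covariances in three groups
df = n_moments - len(free_params)
print(len(free_params), df)
```

Each additional rename that introduces a new name adds one free parameter and removes one degree of freedom.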
Output 26.21.6 shows the modeling information of the first revised model. It shows that Models 2 and 3 make references to Model 1. Therefore, parameters between models are constrained by referencing.
Output 26.21.6
Modeling Information for the First Revised Model
Data Set | N Obs | Model | Type | Reference Model | Analysis
WORK.G1 | 21 | Model 1 | MSTRUCT | | Means and Covariances
WORK.G2 | 22 | Model 2 | MSTRUCT | Model 1 | Means and Covariances
WORK.G3 | 20 | Model 3 | MSTRUCT | Model 1 | Means and Covariances
Output 26.21.7 shows the initial specifications of the means, variances, and covariances in Model 1.
Output 26.21.7
Initial Mean Vector and Covariance Matrix for Model 1 in the First Revised Model
Parameter | Initial Estimate
mean1 | .
mean2 | .
mean3 | .
mean4 | .
mean5 | .
mean6 | .
mean7 | .
mean8 | .
mean9 | .
Output 26.21.8 shows the initial specifications of the means in Model 2. The mean parameters in Model 2 are exactly the same as those in Model 1, as shown in Output 26.21.7. The variance and covariance parameters in Model 2 are also exactly the same as those in Model 1, but are not shown here to conserve space.
Output 26.21.8
Initial Mean Vector for Model 2 in the First Revised Model
Parameter | Initial Estimate
mean1 | .
mean2 | .
mean3 | .
mean4 | .
mean5 | .
mean6 | .
mean7 | .
mean8 | .
mean9 | .
Output 26.21.9 shows the initial specifications of the means in Model 3. All but one mean parameter in Model 3 are exactly the same as those in Models 1 and 2, as shown in Output 26.21.7 and Output 26.21.8, respectively. The mean for x3 in Model 3 is mean3_mdl3, which is now a distinct parameter, and therefore it is not constrained with any other parameters in the first or the second models for Groups 1 or 2. However, the variance and covariance parameters in Model 3 are exactly the same as those in Model 1. They are not shown here to conserve space.
Output 26.21.9
Initial Mean Vector for Model 3 in the First Revised Model
Parameter | Initial Estimate
mean1 | .
mean2 | .
mean3_mdl3 | .
mean4 | .
mean5 | .
mean6 | .
mean7 | .
mean8 | .
mean9 | .
Output 26.21.10 shows the fit summary of the first revised model. The model fit chi-square is 148.8865, which drops quite a bit from the value of 203.2605 for the original model under $H_1$. The p-value of the model fit chi-square is 0.0046, which is still statistically significant. The RMSEA value is 0.1399, which is also a sizable improvement. All the AIC, CAIC, and SBC values are reduced, indicating better model fit than the model under $H_1$.
Output 26.21.10
Fit Summary for the First Revised Model
Fit Index | Value
Chi-Square | 148.8865
Chi-Square DF | 107
Pr > Chi-Square | 0.0046
RMSEA | 0.1399
AIC | 258.8865
CAIC | 431.7589
SBC | 376.7589
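The RMSEA values reported so far can be reproduced from the chi-square statistics. The Python sketch below assumes a multiple-group form, RMSEA = sqrt(G * max((chi-square - df) / (df * (N - G)), 0)), where G is the number of groups and N is the total sample size. This formula is not stated in this example; it is an assumption that happens to match the tabled values of 0.2100, 0.0000, and 0.1399. The function name is hypothetical.

```python
import math

def rmsea_multigroup(chisq, df, n_total, n_groups):
    """RMSEA under the multiple-group form assumed in the lead-in."""
    excess = (chisq - df) / (df * (n_total - n_groups))
    return math.sqrt(n_groups * max(excess, 0.0))

n_total, n_groups = 21 + 22 + 20, 3
rmsea_equal_all = rmsea_multigroup(203.2605, 108, n_total, n_groups)
rmsea_equal_cov = rmsea_multigroup(26.7897, 90, n_total, n_groups)
rmsea_revised_1 = rmsea_multigroup(148.8865, 107, n_total, n_groups)
print(round(rmsea_equal_all, 4), round(rmsea_equal_cov, 4), round(rmsea_revised_1, 4))
```

The truncation at zero explains why any model whose chi-square falls below its degrees of freedom is reported with an RMSEA of exactly 0.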
Output 26.21.11 shows the LM statistics for releasing the equality constraints in the first revised model. Almost all of the results for the variance and covariance constraints are omitted because their LM statistics are not significant. However, Output 26.21.11 shows all the LM statistics for releasing the constraints in means. The mean of x2 in Model 1 has the largest LM statistic at 26.25110.
Output 26.21.11
LM Statistics for Releasing the Equality Constraints in the First Revised Model
Variance and covariance constraints (remaining rows omitted)
Model | LM Stat | Pr > ChiSq | Changes: Original Parm | Changes: Released Parm
1 | 0.64995 | 0.4201 | 0.1050 | -0.2100
2 | 0.41761 | 0.5181 | 0.0874 | -0.1622
3 | 2.18920 | 0.1390 | -0.1855 | 0.4004
...
Mean constraints
Parm | Model | LM Stat | Pr > ChiSq | Changes: Original Parm | Changes: Released Parm
mean1 | 1 | 9.26683 | 0.0023 | 0.2872 | -0.5745
mean1 | 2 | 3.00586 | 0.0830 | -0.1702 | 0.3160
mean1 | 3 | 2.13803 | 0.1437 | -0.1481 | 0.3196
mean2 | 1 | 26.25110 | <.0001 | -0.6568 | 1.3135
mean2 | 2 | 12.34633 | 0.0004 | 0.4674 | -0.8680
mean2 | 3 | 2.52684 | 0.1119 | 0.1962 | -0.4234
mean3 | 1 | 0.58886 | 0.4429 | -0.0787 | 0.0828
mean3 | 2 | 0.58886 | 0.4429 | 0.0828 | -0.0787
mean4 | 1 | 6.59009 | 0.0103 | 0.2746 | -0.5493
mean4 | 2 | 0.51348 | 0.4736 | 0.0796 | -0.1478
mean4 | 3 | 11.61626 | 0.0007 | -0.3586 | 0.7739
mean5 | 1 | 0.52966 | 0.4668 | -0.1042 | 0.2084
mean5 | 2 | 0.22296 | 0.6368 | 0.0702 | -0.1304
mean5 | 3 | 0.06887 | 0.7930 | 0.0374 | -0.0807
mean6 | 1 | 1.16656 | 0.2801 | 0.1270 | -0.2540
mean6 | 2 | 5.29612 | 0.0214 | -0.2810 | 0.5219
mean6 | 3 | 1.69419 | 0.1930 | 0.1518 | -0.3275
mean7 | 1 | 0.03791 | 0.8456 | -0.0291 | 0.0582
mean7 | 2 | 0.44509 | 0.5047 | 0.1036 | -0.1923
mean7 | 3 | 0.23803 | 0.6256 | -0.0704 | 0.1520
mean8 | 1 | 0.39418 | 0.5301 | -0.0883 | 0.1765
mean8 | 2 | 0.24234 | 0.6225 | 0.0719 | -0.1335
mean8 | 3 | 0.01950 | 0.8890 | 0.0200 | -0.0431
mean9 | 1 | 0.00156 | 0.9685 | 0.00423 | -0.00846
mean9 | 2 | 1.06869 | 0.3012 | -0.1150 | 0.2136
mean9 | 3 | 1.05212 | 0.3050 | 0.1065 | -0.2297
You now modify the preceding statements to specify the second revised model, as shown in the following statements:
proc calis modification;
   var x1-x9;
   group 1 / data=g1;
   group 2 / data=g2;
   group 3 / data=g3;
   model 1 / group = 1;
      mstruct;
      matrix _cov_ = cov01-cov45;
      matrix _mean_ = mean1-mean9;
   model 2 / group = 2;
      refmodel 1;
         renameparm mean2=mean2_new;   /* constraint a */
   model 3 / group = 3;
      refmodel 1;
         renameparm mean2=mean2_new,   /* constraint a */
                    mean3=mean3_mdl3;
   fitindex NoIndexType On(only)=[chisq df probchi rmsea aic caic sbc];
run;
This second revised model must not constrain the mean of x2 in Model 1 with any other parameters. A straightforward way to do this is to rename the mean2 parameter to a unique name in Model 1. However, for the current specification it is more convenient to rename the mean2 parameter in Models 2 and 3 to another name. In the specification of the second revised model, Models 2 and 3 still make references to Model 1. However, in their respective RENAMEPARM statements, both Models 2 and 3 rename the mean2 parameter that is referenced from Model 1 to the new name mean2_new. This way, the mean for x2 in Model 1 is not constrained with the means of x2 in Models 2 and 3, but the means for x2 in Models 2 and 3 are still constrained to be equal by the same parameter mean2_new. Output 26.21.12 shows the fit summary of the second revised model.
Output 26.21.12
Fit Summary for the Second Revised Model
Fit Index | Value
Chi-Square | 86.3927
Chi-Square DF | 106
Pr > Chi-Square | 0.9183
RMSEA | 0.0000
AIC | 198.3927
CAIC | 374.4083
SBC | 318.4083
Again, a sizable improvement over the first revised model is shown in the second revised model. The model fit chi-square statistic is no longer significant (p = 0.9183), and the RMSEA value is perfect at 0. Large drops in the AIC, CAIC, and SBC values are also observed.
Output 26.21.13 suggests that the mean of x6 in Model 2 (which has the largest LM statistic at 11.41259) could be released from the equality constraints to achieve the largest model improvement over the current model.
Output 26.21.13
LM Statistics for Releasing the Equality Constraints in the Second Revised Model
Variance and covariance constraints (remaining rows omitted)
Model | LM Stat | Pr > ChiSq | Changes: Original Parm | Changes: Released Parm
1 | 2.77017 | 0.0960 | 0.1384 | -0.2769
2 | 0.28728 | 0.5920 | 0.0463 | -0.0859
3 | 5.00080 | 0.0253 | -0.1791 | 0.3864
...
Mean constraints
Parm | Model | LM Stat | Pr > ChiSq | Changes: Original Parm | Changes: Released Parm
mean1 | 1 | 2.75497 | 0.0970 | 0.1646 | -0.3293
mean1 | 2 | 3.21108 | 0.0731 | -0.1511 | 0.2806
mean1 | 3 | 0.24911 | 0.6177 | 0.0424 | -0.0915
mean3 | 1 | 0.74340 | 0.3886 | -0.0877 | 0.0934
mean3 | 2 | 0.74340 | 0.3886 | 0.0934 | -0.0877
mean4 | 1 | 6.17507 | 0.0130 | 0.2672 | -0.5343
mean4 | 2 | 0.02088 | 0.8851 | -0.0146 | 0.0272
mean4 | 3 | 4.71373 | 0.0299 | -0.2072 | 0.4470
mean5 | 1 | 1.65520 | 0.1983 | -0.1853 | 0.3706
mean5 | 2 | 1.16123 | 0.2812 | 0.1606 | -0.2982
mean5 | 3 | 0.04040 | 0.8407 | 0.0287 | -0.0618
mean6 | 1 | 5.03837 | 0.0248 | 0.2712 | -0.5423
mean6 | 2 | 11.41259 | 0.0007 | -0.4217 | 0.7831
mean6 | 3 | 1.51178 | 0.2189 | 0.1460 | -0.3150
mean7 | 1 | 0.32382 | 0.5693 | -0.0853 | 0.1706
mean7 | 2 | 0.82183 | 0.3646 | 0.1410 | -0.2619
mean7 | 3 | 0.12512 | 0.7235 | -0.0512 | 0.1104
mean8 | 1 | 2.39206 | 0.1220 | -0.2210 | 0.4420
mean8 | 2 | 1.58297 | 0.2083 | 0.1867 | -0.3467
mean8 | 3 | 0.08639 | 0.7688 | 0.0427 | -0.0922
mean9 | 1 | 0.00682 | 0.9342 | 0.00886 | -0.0177
mean9 | 2 | 1.20949 | 0.2714 | -0.1225 | 0.2274
mean9 | 3 | 1.10018 | 0.2942 | 0.1089 | -0.2349
mean2_new | 2 | 4.47808 | 0.0343 | 0.2983 | -0.2661
mean2_new | 3 | 4.47808 | 0.0343 | -0.2661 | 0.2983
The process of model refitting should now become familiar. You modify the previous model to release the constraint on the mean of x6 in Model 2. As a result, the third revised model is specified by the following statements:
proc calis modification;
   var x1-x9;
   group 1 / data=g1;
   group 2 / data=g2;
   group 3 / data=g3;
   model 1 / group = 1;
      mstruct;
      matrix _cov_ = cov01-cov45;
      matrix _mean_ = mean1-mean9;
   model 2 / group = 2;
      refmodel 1;
         renameparm mean2=mean2_new,   /* constraint a */
                    mean6=mean6_mdl2;
   model 3 / group = 3;
      refmodel 1;
         renameparm mean2=mean2_new,   /* constraint a */
                    mean3=mean3_mdl3;
   fitindex NoIndexType On(only)=[chisq df probchi rmsea aic caic sbc];
run;
The only modification from the previous specification is to rename mean6 to mean6_mdl2 in the RENAMEPARM statement of Model 2. Output 26.21.14 shows the model fit summary of the third revised model.
Output 26.21.14
Fit Summary for the Third Revised Model
Fit Index | Value
Chi-Square | 68.7869
Chi-Square DF | 105
Pr > Chi-Square | 0.9976
RMSEA | 0.0000
AIC | 182.7869
CAIC | 361.9456
SBC | 304.9456
The model improvement over the second revised model is still notable in the third revised model. The chi-square value drops by about 18 points (from 86.3927 to 68.7869) in the third revised model. The AIC, CAIC, and SBC values are reduced notably, though not as impressively as with the previous improvements.
Output 26.21.15 suggests that the mean of x4 in Model 1 (which has the largest LM statistic at 7.01943) could be released from the equality constraint to improve model fit further.
Output 26.21.15
LM Statistics for Releasing the Equality Constraints in the Third Revised Model
Variance and covariance constraints (remaining rows omitted)
Model | LM Stat | Pr > ChiSq | Changes: Original Parm | Changes: Released Parm
1 | 2.43365 | 0.1188 | 0.1342 | -0.2684
2 | 0.19040 | 0.6626 | 0.0389 | -0.0724
3 | 4.11405 | 0.0425 | -0.1680 | 0.3624
...
Mean constraints
Parm | Model | LM Stat | Pr > ChiSq | Changes: Original Parm | Changes: Released Parm
mean1 | 1 | 6.15753 | 0.0131 | 0.2550 | -0.5101
mean1 | 2 | 6.05749 | 0.0138 | -0.2109 | 0.3917
mean1 | 3 | 0.29286 | 0.5884 | 0.0463 | -0.0999
mean3 | 1 | 2.89778 | 0.0887 | -0.1796 | 0.1889
mean3 | 2 | 2.89778 | 0.0887 | 0.1889 | -0.1796
mean4 | 1 | 7.01943 | 0.0081 | 0.2850 | -0.5701
mean4 | 2 | 0.04915 | 0.8245 | -0.0226 | 0.0419
mean4 | 3 | 5.05137 | 0.0246 | -0.2148 | 0.4635
mean5 | 1 | 0.21229 | 0.6450 | -0.0672 | 0.1345
mean5 | 2 | 0.07499 | 0.7842 | -0.0443 | 0.0822
mean5 | 3 | 0.55019 | 0.4582 | 0.1059 | -0.2285
mean6 | 1 | 0.07011 | 0.7912 | 0.0503 | -0.0486
mean6 | 3 | 0.07011 | 0.7912 | -0.0486 | 0.0503
mean7 | 1 | 0.98902 | 0.3200 | -0.1513 | 0.3025
mean7 | 2 | 2.42349 | 0.1195 | 0.2463 | -0.4575
mean7 | 3 | 0.34228 | 0.5585 | -0.0857 | 0.1850
mean8 | 1 | 1.58469 | 0.2081 | -0.1786 | 0.3572
mean8 | 2 | 0.81644 | 0.3662 | 0.1347 | -0.2502
mean8 | 3 | 0.14494 | 0.7034 | 0.0549 | -0.1184
mean9 | 1 | 0.13501 | 0.7133 | 0.0398 | -0.0797
mean9 | 2 | 2.54369 | 0.1107 | -0.1796 | 0.3335
mean9 | 3 | 1.61691 | 0.2035 | 0.1337 | -0.2886
mean2_new | 2 | 3.21187 | 0.0731 | 0.2484 | -0.2280
mean2_new | 3 | 3.21187 | 0.0731 | -0.2280 | 0.2484
To make the mean parameter for x4 in Model 1 unique, the mean parameters for x4 in Models 2 and 3 are renamed from mean4 to mean4_new, as shown in the following statements:
proc calis modification;
   var x1-x9;
   group 1 / data=g1;
   group 2 / data=g2;
   group 3 / data=g3;
   model 1 / group = 1;
      mstruct;
      matrix _cov_ = cov01-cov45;
      matrix _mean_ = mean1-mean9;
   model 2 / group = 2;
      refmodel 1;
      renameparm mean2=mean2_new,   /* constraint a */
                 mean4=mean4_new,   /* constraint b */
                 mean6=mean6_mdl2;
   model 3 / group = 3;
      refmodel 1;
      renameparm mean2=mean2_new,   /* constraint a */
                 mean3=mean3_mdl3,
                 mean4=mean4_new;   /* constraint b */
   fitindex NoIndexType On(only)=[chisq df probchi rmsea aic caic sbc];
run;
This forms the fourth revised model. Output 26.21.16 shows the fit summary of this revised model. Again, the chi-square, AIC, CAIC, and SBC values all improve on those of the third revised model. However, the improvements are slowing down. For example, the CAIC value drops from 361.9456 to 358.4283, a reduction of only about 3.5 points, and the SBC value drops from 304.9456 to 300.4283, a reduction of only about 4.5 points. These small reductions indicate that you might soon reach the point where releasing more parameter constraints no longer improves the model fit.
Output 26.21.16
Fit Summary for the Fourth Revised Model
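Fit statistic | Value
Chi-square | 60.1265
Chi-square DF | 104
Pr > chi-square | 0.9998
RMSEA estimate | 0.0000
Akaike information criterion | 176.1265
Bozdogan CAIC | 358.4283
Schwarz Bayesian criterion | 300.4283
Output 26.21.17 suggests that the mean of x1 in Model 1 (which has the largest LM statistic, 6.45759) could be released from the equality constraint to achieve the largest model improvement over the current model.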
Output 26.21.17
LM Statistics for Releasing the Equality Constraints in the Fourth Revised Model
Chi-Square | Pr > ChiSq | Change in Original Parm | Change in Released Parm
2.60531 | 0.1065 | 0.1375 | -0.2751
0.28124 | 0.5959 | 0.0469 | -0.0871
4.75010 | 0.0293 | -0.1788 | 0.3859
. | | |
. | | |
. | | |
6.45759 | 0.0110 | 0.2616 | -0.5232
5.00966 | 0.0252 | -0.1921 | 0.3568
0.05927 | 0.8076 | 0.0209 | -0.0451
1.53288 | 0.2157 | -0.1298 | 0.1406
1.53288 | 0.2157 | 0.1406 | -0.1298
0.09750 | 0.7549 | -0.0457 | 0.0913
0.19683 | 0.6573 | -0.0716 | 0.1330
0.56575 | 0.4520 | 0.1070 | -0.2310
0.35797 | 0.5496 | 0.1141 | -0.1113
0.35797 | 0.5496 | -0.1113 | 0.1141
4.55702E-6 | 0.9983 | 0.000351 | -0.00070
0.96359 | 0.3263 | 0.1572 | -0.2920
1.00887 | 0.3152 | -0.1486 | 0.3208
0.20289 | 0.6524 | -0.0676 | 0.1352
0.12448 | 0.7242 | 0.0525 | -0.0974
0.00590 | 0.9388 | 0.0110 | -0.0237
0.05894 | 0.8082 | -0.0271 | 0.0542
1.63722 | 0.2007 | -0.1448 | 0.2689
2.44244 | 0.1181 | 0.1652 | -0.3565
3.05066 | 0.0807 | 0.2396 | -0.2246
3.05066 | 0.0807 | -0.2246 | 0.2396
1.81990 | 0.1773 | 0.2306 | -0.2003
1.81990 | 0.1773 | -0.2003 | 0.2306
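The p-values in the LM output can be cross-checked against the chi-square distribution. The following DATA step is a minimal sketch that assumes each LM statistic has 1 degree of freedom (one equality constraint released per row). The three statistics and their reported p-values are copied from Output 26.21.17; the data set name lm_check and the variable names are hypothetical.

```sas
/* Sketch: recompute LM p-values under the assumption of 1 df per statistic */
data lm_check;
   input lm_chisq reported_p;
   /* upper-tail probability of a chi-square variate with 1 df (assumed) */
   computed_p = 1 - probchi(lm_chisq, 1);
   format computed_p 6.4;
   datalines;
6.45759 0.0110
5.00966 0.0252
0.05927 0.8076
;

proc print data=lm_check noobs;
run;
```

If the 1-df assumption holds, computed_p should agree with reported_p up to rounding.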
To make the mean parameter for x1 in Model 1 unique, the mean parameters for x1 in Models 2 and 3 are renamed from mean1 to mean1_new, as shown in the following statements:
proc calis modification;
   var x1-x9;
   group 1 / data=g1;
   group 2 / data=g2;
   group 3 / data=g3;
   model 1 / group = 1;
      mstruct;
      matrix _cov_ = cov01-cov45;
      matrix _mean_ = mean1-mean9;
   model 2 / group = 2;
      refmodel 1;
      renameparm mean1=mean1_new,   /* constraint c */
                 mean2=mean2_new,   /* constraint a */
                 mean4=mean4_new,   /* constraint b */
                 mean6=mean6_mdl2;
   model 3 / group = 3;
      refmodel 1;
      renameparm mean1=mean1_new,   /* constraint c */
                 mean2=mean2_new,   /* constraint a */
                 mean3=mean3_mdl3,
                 mean4=mean4_new;   /* constraint b */
   fitindex NoIndexType On(only)=[chisq df probchi rmsea aic caic sbc];
run;
This forms the fifth revised model. Output 26.21.18 shows the fit summary of the fifth revised model. Again, the chi-square, AIC, CAIC, and SBC values all improve on those of the fourth revised model. However, the improvements slow down even more. For example, the CAIC value drops from 358.4283 to 356.3270, and the SBC value drops from 300.4283 to 297.3270, reductions of only about 2 and 3 points, respectively. Because the model fit does not improve much, this is the point where you would stop releasing equality constraints to improve the model fit.
Output 26.21.18
Fit Summary for the Fifth Revised Model
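Fit statistic | Value
Chi-square | 52.8821
Chi-square DF | 103
Pr > chi-square | 1.0000
RMSEA estimate | 0.0000
Akaike information criterion | 170.8821
Bozdogan CAIC | 356.3270
Schwarz Bayesian criterion | 297.3270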
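To see how quickly the improvements level off, you can tabulate the step-to-step drops in the fit statistics. The following DATA step is a minimal sketch that uses the values reported in Output 26.21.14, Output 26.21.16, and Output 26.21.18. The chi-square difference p-value is a supplementary nested-model comparison that is not part of the original analysis, and the data set name fit_progress and all variable names are hypothetical.

```sas
/* Sketch: drops in chi-square, CAIC, and SBC between successive revised models */
data fit_progress;
   length model $ 8;
   input model $ chisq df caic sbc;
   /* LAG must execute on every observation, so call it unconditionally */
   prev_chisq = lag(chisq);
   prev_df    = lag(df);
   prev_caic  = lag(caic);
   prev_sbc   = lag(sbc);
   if _n_ > 1 then do;
      chisq_drop = prev_chisq - chisq;   /* chi-square difference */
      df_used    = prev_df - df;         /* constraints released in this step */
      /* supplementary check (assumption): difference test for nested models */
      p_value    = 1 - probchi(chisq_drop, df_used);
      caic_drop  = prev_caic - caic;
      sbc_drop   = prev_sbc - sbc;
   end;
   drop prev_:;
   datalines;
third   68.7869 105 361.9456 304.9456
fourth  60.1265 104 358.4283 300.4283
fifth   52.8821 103 356.3270 297.3270
;

proc print data=fit_progress noobs;
run;
```

The caic_drop and sbc_drop columns put the shrinking reductions side by side, which is the pattern the text uses to decide when to stop releasing constraints.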
Output 26.21.19 does not suggest the release of any equality constraints on the means, because none of the p-values for the LM statistics on the means is significant (that is, all are greater than 0.05). The LM results therefore agree with the diminishing fit improvements of the fifth revised model: this is the point at which the "best" model for the data is found.
Output 26.21.19
LM Statistics for Releasing the Equality Constraints in the Fifth Revised Model
Chi-Square | Pr > ChiSq | Change in Original Parm | Change in Released Parm
4.06279 | 0.0438 | 0.1590 | -0.3180
0.48735 | 0.4851 | 0.0571 | -0.1061
7.60892 | 0.0058 | -0.2095 | 0.4520
. | | |
. | | |
. | | |
0.08362 | 0.7725 | -0.0312 | 0.0382
0.08362 | 0.7725 | 0.0382 | -0.0312
0.02394 | 0.8771 | 0.0229 | -0.0458
0.47064 | 0.4927 | -0.1113 | 0.2067
0.26010 | 0.6101 | 0.0728 | -0.1570
0.97515 | 0.3234 | 0.1893 | -0.1893
0.97515 | 0.3234 | -0.1893 | 0.1893
0.03746 | 0.8465 | -0.0319 | 0.0638
1.10425 | 0.2933 | 0.1683 | -0.3126
0.79472 | 0.3727 | -0.1321 | 0.2851
0.86794 | 0.3515 | -0.1426 | 0.2852
0.47498 | 0.4907 | 0.1038 | -0.1928
0.03721 | 0.8470 | 0.0276 | -0.0595
0.12190 | 0.7270 | 0.0401 | -0.0801
2.66768 | 0.1024 | -0.1869 | 0.3472
1.78113 | 0.1820 | 0.1417 | -0.3058
1.28032 | 0.2578 | -0.1794 | 0.1359
1.28032 | 0.2578 | 0.1359 | -0.1794
2.53142 | 0.1116 | 0.2117 | -0.2112
2.53142 | 0.1116 | -0.2112 | 0.2117
2.25834 | 0.1329 | 0.2558 | -0.2253
2.25834 | 0.1329 | -0.2253 | 0.2558
To see where the fifth revised model (equality in the covariance matrix and partial equality in the means) stands between the model with equality in both the covariance and mean matrices and the model with equality in the covariance matrix only, the following table shows the fit statistics of these three models:
Fit statistic | Equal covariances and means | "Fifth" | Equal covariances only
Chi-square | 203.2605 | 52.8821 | 26.7897
Chi-square DF | 108 | 103 | 90
Pr > chi-square | <0.0001 | 1.0000 | 1.0000
RMSEA estimate | 0.2100 | 0.0000 | 0.0000
Akaike information criterion | 311.2605 | 170.8821 | 170.7897
Bozdogan CAIC | 480.9898 | 356.3270 | 397.0954
Schwarz Bayesian criterion | 426.9898 | 297.3270 | 325.0954
The fifth revised model is labeled "Fifth" in the table. Compared with the model that constrains both the covariance and mean matrices to be equal, the fifth revised model is clearly superior. It uses only five more parameters (that is, five fewer degrees of freedom), but the improvements in the model fit chi-square and the RMSEA value are large: the chi-square drops from 203.2605 to 52.8821, and the RMSEA drops from 0.2100 to 0.0000. The AIC, CAIC, and SBC values are also much better.
Compared with the model that constrains only the covariance matrices to be equal, the fifth revised model appears to be inferior in only the chi-square model fit statistic, although both models already have the highest possible p-value (1.0000) and the smallest possible RMSEA value (0). However, the model with equal covariance matrices only uses 13 more parameters (that is, it has 13 fewer degrees of freedom), and hence it is more complex. Because the model fit chi-square value does not take model complexity into account, it is often criticized as a basis for choosing among competing models. In contrast, the AIC, CAIC, and SBC measures take model complexity into account, and they are more reasonable as a basis for choosing among competing models. Although the AIC values for the two models are very close (170.8821 versus 170.7897), the CAIC and SBC values clearly favor the fifth revised model. Therefore, according to the CAIC and SBC criteria, the fifth revised model, which has partial equality constraints on the means, is actually better than the model with all the means unconstrained for the current data with three independent groups.
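The way the information criteria penalize complexity can be made concrete by recomputing them from the chi-square values and degrees of freedom in the table. The following DATA step is a sketch under stated assumptions: the number of distinct moments is 3 x (45 covariances + 9 means) = 162, the total sample size is 63 (21 per group, as in data set g1), and the formulas AIC = chi-square + 2q, SBC = chi-square + q ln(N), and CAIC = chi-square + q (ln(N) + 1) are used, where q is the number of free parameters. These formulas and the sample size are inferred because they appear to match the reported values; they are not taken from the text. The data set name ic_check and the model labels are hypothetical.

```sas
/* Sketch: recompute AIC, SBC, and CAIC from chi-square and df (assumed formulas) */
data ic_check;
   length model $ 12;
   input model $ chisq df;
   n_moments = 3 * (45 + 9);   /* 162 distinct covariances and means (assumed) */
   n_total   = 63;             /* assumption: 21 observations in each of 3 groups */
   q    = n_moments - df;      /* number of free parameters */
   aic  = chisq + 2 * q;
   sbc  = chisq + q * log(n_total);        /* LOG is the natural logarithm */
   caic = chisq + q * (log(n_total) + 1);
   format aic sbc caic 9.4;
   datalines;
EqCovMean 203.2605 108
Fifth      52.8821 103
EqCovOnly  26.7897  90
;

proc print data=ic_check noobs;
   var model chisq df q aic caic sbc;
run;
```

Under these assumptions, the 13 extra parameters of the model with equal covariances only add 26 points to its AIC, but a much larger penalty to its SBC and CAIC, which is why those two criteria separate the models more sharply than the AIC does.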