When choosing a covariance structure in PROC MIXED, consider the covariance structures that are meaningful for your data and area of application. For example, when the time points at which measurements are taken are unequally spaced, and/or subjects are measured at different time points, the autoregressive (TYPE=AR(1)) structure is generally not appropriate. Guerin and Stroup (2000) provide more information on the effects of various covariance modeling decisions.
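As a hedged illustration of the point about unequal spacing: one structure that PROC MIXED offers for such data is the spatial power structure, TYPE=SP(POW), which bases the correlation on the actual distance between measurement times. The sketch below is not part of the original example. The data set name `visits` and the variables `y`, `trt`, `id`, and `t` are hypothetical, and `t` is assumed to be a numeric measurement time that is not listed in the CLASS statement.

```sas
/* Hypothetical data: one row per subject per visit, grouped by id.   */
/* t holds the actual (possibly unequally spaced) measurement time.   */
proc mixed data=visits;
   class trt id;
   model y = trt t trt*t / ddfm=kr;
   /* Correlation between two measurements on the same subject is     */
   /* rho**|t1 - t2|, so it depends on the elapsed time between them. */
   repeated / type=sp(pow)(t) subject=id;
run;
```

When the time points are equally spaced and shared by all subjects, this structure describes the same correlation pattern as AR(1).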
When there are several plausible covariance structures, it's desirable to choose the one that best fits your data. Three approaches for evaluating covariance structures are described below: examining the fit statistics tables, constructing a likelihood ratio test, and using the COVTEST statement in PROC GLIMMIX. Compare and select a covariance structure before examining the tests of fixed effects.
When examining the fit statistics, such as AICC or BIC, in the Fit Statistics table in the PROC MIXED output, smaller values indicate a better fit to your data. Milliken and Johnson (1992) present a clinical trial in which patients were randomly assigned to one of three treatments (coded ax23, bww9, and ctrl in the data below). Heart rates were measured at four time points following administration of the treatment. The following statements read the data and then restructure the data set to have one observation per patient for each time point.
data heart;
   input drug $ person hr1 hr2 hr3 hr4 @@;
   cards;
ax23  1 72 86 81 77   bww9  2 85 86 83 80   ctrl  3 69 73 72 74
ax23  4 78 83 88 81   bww9  5 82 86 80 84   ctrl  6 66 62 67 73
ax23  7 71 82 81 75   bww9  8 71 78 70 75   ctrl  9 84 90 88 87
ax23 10 72 83 83 69   bww9 11 83 88 79 81   ctrl 12 80 81 77 72
ax23 13 66 79 77 66   bww9 14 86 85 76 76   ctrl 15 72 72 69 70
ax23 16 74 83 84 77   bww9 17 85 82 83 80   ctrl 18 65 62 65 61
ax23 19 62 73 78 70   bww9 20 79 83 80 81   ctrl 21 75 69 69 68
ax23 22 69 75 76 70   bww9 23 83 84 78 81   ctrl 24 71 70 65 63
;

/* Restructure to one observation per patient per time point */
data heart2;
   set heart;
   time=1; hr=hr1; output;
   time=2; hr=hr2; output;
   time=3; hr=hr3; output;
   time=4; hr=hr4; output;
run;
Several covariance structures are reasonable for these data, including unstructured (TYPE=UN), autoregressive (TYPE=AR(1)), compound symmetric (TYPE=CS), and Toeplitz (TYPE=TOEP). The unstructured covariance matrix is the most flexible because it imposes no pattern on the variances and covariances. By fitting this structure first and then examining the estimated covariance matrix for patterns characteristic of other structures, you may be able to select a simpler structure.
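For reference, the patterns to look for can be written out for four time points. The block below is a sketch of the within-subject covariance matrix under each structure, using the parameterizations generally documented for PROC MIXED; the symbols are chosen here for illustration and do not appear in the original note.

```latex
\[
\mathrm{UN}:\;
\begin{bmatrix}
\sigma_{1}^{2} & \sigma_{21} & \sigma_{31} & \sigma_{41}\\
\sigma_{21} & \sigma_{2}^{2} & \sigma_{32} & \sigma_{42}\\
\sigma_{31} & \sigma_{32} & \sigma_{3}^{2} & \sigma_{43}\\
\sigma_{41} & \sigma_{42} & \sigma_{43} & \sigma_{4}^{2}
\end{bmatrix}
\qquad
\mathrm{CS}:\;
\begin{bmatrix}
\sigma^{2}+\sigma_{1} & \sigma_{1} & \sigma_{1} & \sigma_{1}\\
\sigma_{1} & \sigma^{2}+\sigma_{1} & \sigma_{1} & \sigma_{1}\\
\sigma_{1} & \sigma_{1} & \sigma^{2}+\sigma_{1} & \sigma_{1}\\
\sigma_{1} & \sigma_{1} & \sigma_{1} & \sigma^{2}+\sigma_{1}
\end{bmatrix}
\]
\[
\mathrm{AR(1)}:\;
\sigma^{2}
\begin{bmatrix}
1 & \rho & \rho^{2} & \rho^{3}\\
\rho & 1 & \rho & \rho^{2}\\
\rho^{2} & \rho & 1 & \rho\\
\rho^{3} & \rho^{2} & \rho & 1
\end{bmatrix}
\qquad
\mathrm{TOEP}:\;
\begin{bmatrix}
\sigma^{2} & \sigma_{1} & \sigma_{2} & \sigma_{3}\\
\sigma_{1} & \sigma^{2} & \sigma_{1} & \sigma_{2}\\
\sigma_{2} & \sigma_{1} & \sigma^{2} & \sigma_{1}\\
\sigma_{3} & \sigma_{2} & \sigma_{1} & \sigma^{2}
\end{bmatrix}
\]
```

With four time points these structures have 10 (UN), 2 (CS), 2 (AR(1)), and 4 (TOEP) covariance parameters, respectively.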
proc mixed data=heart2;
   class drug person time;
   model hr=drug time time*drug / ddfm=kr;
   repeated time / type=un subject=person r;
run;
Following are the results of the R option and the Fit Statistics Table.
Based on the pattern of the covariances in the R matrix, the compound symmetric, autoregressive, and Toeplitz structures could be considered. Each of these structures is used in the following analyses. The ODS OUTPUT statement in each analysis saves the Fit Statistics and Dimensions tables to data sets.
/* Compound symmetric */
proc mixed data=heart2;
   class drug person time;
   model hr=drug time time*drug / ddfm=kr;
   repeated time / type=cs subject=person r;
   ods output FitStatistics=FitCS(rename=(value=CS))
              Dimensions=ParmCS(rename=(value=NumCS));
run;

/* First-order autoregressive */
proc mixed data=heart2;
   class drug person time;
   model hr=drug time time*drug / ddfm=kr;
   repeated time / type=ar(1) subject=person r;
   ods output FitStatistics=FitAR1(rename=(value=AR1))
              Dimensions=ParmAR1(rename=(value=NumAR1));
run;

/* Toeplitz */
proc mixed data=heart2;
   class drug person time;
   model hr=drug time time*drug / ddfm=kr;
   repeated time / type=toep subject=person r;
   ods output FitStatistics=FitToep(rename=(value=Toep))
              Dimensions=ParmToep(rename=(value=NumToep));
run;

/* Combine the fit statistics from the three models side by side */
data all;
   merge FitCS FitAR1 FitToep;
run;

proc print data=all label noobs;
run;
Data set ALL combines the fit statistics from the three models. The statistics are presented below. The AR(1) model has the smallest AICC (and BIC), which suggests that it is the preferable model. The smaller-is-better rule applies even when the values are negative. For example, -200 indicates a better model than -100.
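The comparison can also be made programmatically. The following is a minimal sketch, assuming the data set ALL created above contains the numeric variables CS, AR1, and Toep. The data set name `all_best` and the variables `MinValue` and `BestFit` are introduced here for illustration only.

```sas
/* For each fit statistic in ALL, flag the structure with the smallest value */
data all_best;
   set all;
   length BestFit $ 4;
   MinValue = min(CS, AR1, Toep);
   if      MinValue = CS  then BestFit = 'CS';
   else if MinValue = AR1 then BestFit = 'AR1';
   else                        BestFit = 'Toep';
run;

proc print data=all_best noobs;
run;
```

The flag is meaningful for the information criteria (AIC, AICC, BIC). The -2 log likelihood row is not penalized for the number of covariance parameters, so it should not be used on its own to rank the structures.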
This approach can be used whether the covariance structures are nested or not. As long as the model specification in the MODEL statement remains the same, different covariance structures for the model can be compared by this method. Comparison by the likelihood ratio test (presented next) requires that one structure be a special case of the other structure.
Suppose two models have the same MODEL statement, but different covariance structures in the REPEATED statement. If the covariance structure in one model is a special case of the covariance structure in the other model, you can construct a likelihood ratio test to compare the two models.
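In symbols, the test can be sketched as follows. The notation is introduced here for illustration: the "reduced" model is the special case, the "full" model is the more general structure, \(\ell\) denotes the (restricted) log likelihood, and \(q\) denotes the number of covariance parameters.

```latex
\[
\chi^{2}_{LR} \;=\; \bigl(-2\,\ell_{\text{reduced}}\bigr) \;-\; \bigl(-2\,\ell_{\text{full}}\bigr),
\qquad
df \;=\; q_{\text{full}} - q_{\text{reduced}},
\qquad
p \;=\; \Pr\!\bigl(\chi^{2}_{df} > \chi^{2}_{LR}\bigr).
\]
```

Because the reduced model is a special case of the full model, its -2 log likelihood cannot be smaller than that of the full model at convergence, so the statistic is nonnegative. The chi-square reference distribution is a large-sample approximation.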
The following example compares the AR(1) and Toeplitz models above. Both structures assume a constant variance, but under the AR(1) structure the correlation between two measurements must decay as a power of the number of time points separating them: the correlation at lag k is the lag-1 correlation raised to the power k. The Toeplitz structure has no such requirement, because each lag has its own freely estimated covariance. Consequently, the AR(1) structure is a special case of the Toeplitz structure.
The DATA step below merges the data sets containing the -2 (restricted) log likelihood values and the number of covariance parameters from the two models. The difference between the two -2 log likelihood values is the likelihood ratio statistic, which is approximately chi-square distributed in large samples. The difference in the number of covariance parameters between the two models is the degrees of freedom for the statistic. The p-value for the likelihood ratio test is computed using the PROBCHI function.
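The nesting can be made explicit. In the sketch below, \(\sigma_{k}\) denotes the covariance between two measurements on the same subject that are \(k\) time points apart; this notation is introduced here and is not from the original note.

```latex
\[
\text{TOEP:}\quad \operatorname{Cov}(y_{i}, y_{j}) = \sigma_{|i-j|},
\qquad \sigma_{0} = \sigma^{2},\ \ \sigma_{1},\ \sigma_{2},\ \sigma_{3}\ \text{unrestricted}
\]
\[
\text{AR(1):}\quad \sigma_{k} = \sigma^{2}\rho^{k}
\quad\Longrightarrow\quad
\sigma_{1} = \sigma^{2}\rho,\quad
\sigma_{2} = \sigma^{2}\rho^{2},\quad
\sigma_{3} = \sigma^{2}\rho^{3}
\]
```

With four time points, the Toeplitz structure has four covariance parameters and the AR(1) structure has two, so the likelihood ratio test comparing them has 4 - 2 = 2 degrees of freedom.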
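/* The first row of the merged data holds the -2 log likelihood values  */
/* (AR1, Toep) and the covariance parameter counts (NumAR1, NumToep),   */
/* so only that row is used.                                            */
data result;
   merge FitAR1 FitToep ParmAR1 ParmToep;
   if _n_ = 1 then do;
      ChiAR1Toep = AR1 - Toep;
      dfAR1Toep  = NumToep - NumAR1;
      pAR1Toep   = 1 - probchi(ChiAR1Toep, dfAR1Toep);
      output;
      stop;
   end;
run;

title 'Likelihood Ratio Test: AR1 vs Toeplitz';
proc print data=result label noobs;
   var ChiAR1Toep dfAR1Toep pAR1Toep;
   label ChiAR1Toep="Chi-Square" dfAR1Toep="DF" pAR1Toep="Pr > ChiSq";
run;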
The nonsignificant likelihood ratio test indicates that there is no evidence to prefer the more general Toeplitz structure over the simpler AR(1) structure.
PROC GLIMMIX is a procedure for generalized linear mixed models, which include the linear mixed model as a special case. Most models that can be fit by PROC MIXED can also be fit using PROC GLIMMIX. In PROC GLIMMIX, the COVTEST statement enables you to compare covariance structures that are linearly nested. Two covariance matrices are linearly nested if you can specify coefficients in the GENERAL option of the COVTEST statement that reduce the more general matrix to the simpler matrix.
For example, the COVTEST statement can be used to compare unstructured and compound symmetric covariance matrices, because the equal-variance and equal-covariance constraints needed to reduce the unstructured covariance matrix to the compound symmetric matrix are linear. However, the COVTEST statement cannot be used to compare unstructured and AR(1) matrices, or to compare Toeplitz and AR(1) matrices, because the constraints needed to reduce the unstructured and Toeplitz structures to the AR(1) structure are not linear (the power function of the correlation in AR(1) is not a linear constraint).
The COVTEST statement computes a likelihood ratio test to compare the more complex covariance structure specified in the RANDOM statement with the constrained structure specified in the COVTEST statement. This note provides more information about using the COVTEST statement.
The following example compares the unstructured and compound symmetric structures in the above model. PROC GLIMMIX does not have a REPEATED statement as in PROC MIXED. Instead, it provides the RANDOM _RESIDUAL_ statement that can be used in place of the REPEATED statement in PROC MIXED. The first three rows of coefficients in the GENERAL option of the COVTEST statement constrain the four variances in the unstructured matrix to be equal. Note that the position of a coefficient in a row corresponds to the position of the parameter in the "Covariance Parameter Estimates" table. The remaining rows constrain all of the covariances to be equal. With these linear constraints, an unstructured covariance matrix becomes a compound symmetric matrix. See the GLIMMIX documentation for details of the COVTEST syntax.
proc glimmix data=heart2;
   class drug person time;
   model hr=drug time time*drug / ddfm=kr;
   random _residual_ / type=un subject=person;
   covtest 'CS' general
      /* rows 1-3: the four variances are equal */
      1 0 -1,
      1 0 0 0 0 -1,
      1 0 0 0 0 0 0 0 0 -1,
      /* rows 4-8: the six covariances are equal */
      0 1 0 -1,
      0 1 0 0 -1,
      0 1 0 0 0 0 -1,
      0 1 0 0 0 0 0 -1,
      0 1 0 0 0 0 0 0 -1;
run;
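The coefficient rows in the COVTEST statement can be read as linear constraints on the covariance parameters. The sketch below assumes that the ten unstructured parameters are listed in the usual row-wise lower-triangular order; the symbols \(\theta_{1},\dots,\theta_{10}\) are introduced here for illustration, and the actual order should be confirmed against the "Covariance Parameter Estimates" table.

```latex
\[
(\theta_{1},\dots,\theta_{10}) =
\bigl(\mathrm{UN}(1,1),\ \mathrm{UN}(2,1),\ \mathrm{UN}(2,2),\ \mathrm{UN}(3,1),\ \mathrm{UN}(3,2),\
\mathrm{UN}(3,3),\ \mathrm{UN}(4,1),\ \mathrm{UN}(4,2),\ \mathrm{UN}(4,3),\ \mathrm{UN}(4,4)\bigr)
\]
\[
\text{Variances:}\quad
\theta_{1}-\theta_{3}=0,\qquad \theta_{1}-\theta_{6}=0,\qquad \theta_{1}-\theta_{10}=0
\]
\[
\text{Covariances:}\quad
\theta_{2}-\theta_{4}=0,\quad \theta_{2}-\theta_{5}=0,\quad \theta_{2}-\theta_{7}=0,\quad
\theta_{2}-\theta_{8}=0,\quad \theta_{2}-\theta_{9}=0
\]
```

These eight constraints reduce the ten unstructured parameters to the two parameters of the compound symmetric structure, which matches the difference in the number of covariance parameters between the two structures (10 - 2 = 8).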
The results indicate that the compound symmetric structure fits your data adequately compared with the unstructured covariance matrix (p=0.1788).
The p-value is based on a chi-square distribution with the degrees of freedom shown in the DF column of the test output.
Guerin, L. and Stroup, W. W. (2000), "A Simulation Study to Evaluate PROC MIXED Analysis of Repeated Measures Data," Proceedings of the 12th Annual Conference on Applied Statistics in Agriculture, Kansas State University, Manhattan, KS.
Milliken, G. A. and Johnson, D. E. (1992), Analysis of Messy Data, Volume 1: Designed Experiments, New York: Chapman and Hall.
|Date Modified:||2010-11-10 12:46:59|
|Date Created:||2009-09-07 10:10:51|