The CALIS Procedure

Relationships among Estimation Criteria

Subsections:

Assumption of Multivariate Normality
Contribution of the Off-Diagonal Elements to the Estimation of Covariance or Correlation Structures
ML and FIML Methods

Any classification of estimation methods according to mathematical or numerical properties involves some arbitrariness. The discussion in this section is not meant to be a thorough classification of the estimation methods available in PROC CALIS. Rather, the classification here is intended to clarify the uses of the different estimation methods and the theoretical relationships among the estimation criteria.

Assumption of Multivariate Normality

GLS, ML, and FIML assume multivariate normality of the data, while ULS, WLS, and DWLS do not. Although the ML method with covariance structure analysis alone can also be based on the Wishart distribution of the sample covariance matrix, for convenience GLS, ML, and FIML are usually classified as normal-theory based methods, while ULS, WLS, and DWLS are usually classified as distribution-free methods.

An intuitive, or even naive, notion is that methods without distributional assumptions, such as WLS and DWLS, are preferable to normal-theory methods, such as ML and GLS, in practical situations where multivariate normality is in doubt. This notion needs some qualification, because more factors must be considered when you judge the quality of estimation methods in practice. For example, the WLS method might need a very large sample size to attain its purported asymptotic properties, while the ML method might be robust against violations of the multivariate normality assumption under certain circumstances. No recommendation about which estimation criterion to use is attempted here, but you should base your choice on more than the assumption of multivariate normality.

Contribution of the Off-Diagonal Elements to the Estimation of Covariance or Correlation Structures

If only the covariance or correlation structures are considered, the six estimation functions, $F_{\mathit{ULS}}$ , $F_{\mathit{GLS}}$ , $F_{\mathit{ML}}$ , $F_{\mathit{FIML}}$ , $F_{\mathit{WLS}}$ , and $F_{\mathit{DWLS}}$ , belong to the following two groups:

The functions $F_{\mathit{ULS}}$ , $F_{\mathit{GLS}}$ , $F_{\mathit{ML}}$ , and $F_{\mathit{FIML}}$ take into account all $n^2$ elements of the symmetric residual matrix $\mb {S}-\bSigma$ . This means that the off-diagonal residuals contribute twice to the discrepancy function F, as lower and as upper triangle elements.
The functions $F_{\mathit{WLS}}$ and $F_{\mathit{DWLS}}$ take into account only the $n(n+1)/2$ lower triangular elements of the symmetric residual matrix $\mb {S}-\bSigma$ . This means that the off-diagonal residuals contribute to the discrepancy function F only once.
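The difference between the two groups can be illustrated with a small numerical sketch. The residual matrix below is hypothetical, and a plain unweighted sum of squared residuals stands in for the discrepancy functions (the actual functions involve weights and scale factors that are omitted here). The sketch only shows how the off-diagonal residuals are counted:

```python
# Hypothetical 3 x 3 symmetric residual matrix R = S - Sigma (illustrative values only).
R = [
    [0.5, 0.2, -0.1],
    [0.2, -0.3, 0.4],
    [-0.1, 0.4, 0.1],
]
n = len(R)

# First group: all n^2 elements enter the sum.
full_sum = sum(R[i][k] ** 2 for i in range(n) for k in range(n))

# Second group: only the n(n+1)/2 lower triangular elements (k <= i) enter the sum.
lower_sum = sum(R[i][k] ** 2 for i in range(n) for k in range(i + 1))

# Split the lower triangle into diagonal and off-diagonal parts.
diag_sum = sum(R[i][i] ** 2 for i in range(n))
offdiag_lower_sum = lower_sum - diag_sum

n_full = n * n               # number of elements used by the first group
n_lower = n * (n + 1) // 2   # number of elements used by the second group

print(n_full, n_lower)
print(full_sum, diag_sum + 2 * offdiag_lower_sum)  # equal up to rounding error
```

In the full-matrix sum, each off-diagonal residual is counted twice; in the lower-triangle sum, it is counted once.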

The $F_{\mathit{DWLS}}$ function used in PROC CALIS differs from that used by the LISREL 7 program. Formula (1.25) of the LISREL 7 manual (Jöreskog and Sörbom, 1985, p. 23) shows that LISREL groups the $F_{\mathit{DWLS}}$ function in the first group by taking into account all $n^2$ elements of the symmetric residual matrix $\mb {S}-\bSigma$ .

Relationship between DWLS and WLS: In PROC CALIS, the $F_{\mathit{DWLS}}$ and $F_{\mathit{WLS}}$ discrepancy functions deliver the same results in the special case where the weight matrix $\mb {W}=\mb {W}_{ss}$ used by WLS estimation is a diagonal matrix. In LISREL 7, this is not the case.
Relationship between DWLS and ULS: In LISREL 7, the $F_{\mathit{DWLS}}$ and $F_{\mathit{ULS}}$ estimation functions deliver the same results in the special case where the diagonal weight matrix $\mb {W}=\mb {W}_{ss}$ used by DWLS estimation is an identity matrix. In PROC CALIS, to obtain the same results with $F_{\mathit{DWLS}}$ and $F_{\mathit{ULS}}$ estimation, set the diagonal weight matrix $\mb {W}=\mb {W}_{ss}$ used in DWLS estimation to:

$[\mb {W}_{ss}]_{ik,ik} = \left\{ \begin{array}{ll} 1 & \mbox{if $i = k$} \\ 0.5 & \mbox{otherwise ($k \le i$)} \end{array} \right.$

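Because the reciprocals of the weight matrix elements are used in the discrepancy function, the off-diagonal residuals are weighted by a factor of 2. This weighting matches the double contribution of the off-diagonal residuals in $F_{\mathit{ULS}}$ .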
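The effect of this weight choice can be checked with a minimal sketch. The residual values and the helper names (`weighted_lower_sum`, `calis_uls_weight`, `identity_weight`) are hypothetical. Any constant scale factor in the actual discrepancy functions is omitted, so the comparison is meaningful only up to a constant multiple, which does not change the minimizing parameter values:

```python
# Hypothetical 3 x 3 symmetric residual matrix R = S - Sigma (illustrative values only).
R = [
    [0.5, 0.2, -0.1],
    [0.2, -0.3, 0.4],
    [-0.1, 0.4, 0.1],
]
n = len(R)

def weighted_lower_sum(R, weight):
    """Sum squared lower-triangle residuals, each divided by its diagonal weight."""
    return sum(R[i][k] ** 2 / weight(i, k)
               for i in range(len(R)) for k in range(i + 1))

def calis_uls_weight(i, k):
    """Weights from the text: 1 on the diagonal, 0.5 for off-diagonal elements."""
    return 1.0 if i == k else 0.5

def identity_weight(i, k):
    """Identity weights: every lower-triangle element gets weight 1."""
    return 1.0

# Full-matrix sum of squares over all n^2 elements (the ULS-style count).
full_sum = sum(R[i][k] ** 2 for i in range(n) for k in range(n))

matched = weighted_lower_sum(R, calis_uls_weight)
unmatched = weighted_lower_sum(R, identity_weight)

print(full_sum, matched, unmatched)
```

With the weights 1 and 0.5, the lower-triangle sum reproduces the full-matrix sum; with identity weights it does not, because each off-diagonal residual is then counted only once.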

ML and FIML Methods

Both the ML and FIML methods can be derived from the log-likelihood function for multivariate normal data. The preceding section Estimation Criteria mentions that $F_{\mathit{FIML}}$ is essentially the same as $\frac{-2L}{N}$ , where L is the log-likelihood function for multivariate normal data. For ML estimation, you can also consider $\frac{-2L}{N}$ as the part of the $F_{\mathit{ML}}$ discrepancy function that contains the information about the model parameters (while the rest of the $F_{\mathit{ML}}$ function contains terms that are constant given the data). That is, with some algebraic manipulation and assuming that there are no missing values in the analysis (so that all $\bmu _ j$ and $\bSigma _ j$ are the same as $\bmu$ and $\bSigma$ , respectively), it can be shown that

$\begin{eqnarray*} F_{\mathit{FIML}} & = & \frac{-2L}{N} \\ & = & \frac{1}{N}\sum _{j=1}^ N (\ln (|\bSigma |) + (\mb {x}_ j - \bmu )^{\prime }\bSigma ^{-1}(\mb {x}_ j - \bmu ) + K)\\ & = & \ln (|\bSigma |) + \mr {Tr}(\mb {S_ N} \bSigma ^{-1}) + (\mb {\bar{x}} - \bmu )^{\prime }\bSigma ^{-1}(\mb {\bar{x}} - \bmu ) + K \end{eqnarray*}$

where $\mb {\bar{x}}$ is the sample mean and $\mb {S_ N}$ is the biased sample covariance matrix. Compare this FIML function with the following ML function; the two are very similar:

$F_{\mathit{ML}} = \ln (|\bSigma |) + \mr {Tr}(\mb {S} \bSigma ^{-1}) + (\mb {\bar{x}} - \bmu )^{\prime }\bSigma ^{-1}(\mb {\bar{x}} - \bmu ) - p - \ln (|\mb {S}|)$

The two expressions differ only in the constant terms, which are independent of the model parameters, and in the formulas for computing the sample covariance matrix. While the FIML method assumes the biased formula (with N as the divisor, by default) for the sample covariance matrix, the ML method (as implemented in PROC CALIS) uses the unbiased formula (with N – 1 as the divisor, by default).
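The following sketch checks these statements numerically for a small complete data set. Everything in it is illustrative: the data, the two parameter settings `theta_a` and `theta_b`, and the helper names are hypothetical, and the constant is assumed to be $K = p \ln (2\pi )$. It is not how PROC CALIS evaluates the functions.

```python
import math

# Hypothetical complete data: N = 5 observations on p = 2 variables.
X = [[1.0, 2.0], [2.0, 1.5], [3.0, 3.5], [4.0, 3.0], [5.0, 5.5]]
N, p = len(X), 2
K = p * math.log(2 * math.pi)  # assumed form of the constant K

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def inv2(A):
    d = det2(A)
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

def quad(d, A):
    """Quadratic form d' A d."""
    return sum(d[i] * A[i][k] * d[k] for i in range(p) for k in range(p))

def trace_prod(A, B):
    """Tr(A B)."""
    return sum(A[i][k] * B[k][i] for i in range(p) for k in range(p))

xbar = [sum(x[i] for x in X) / N for i in range(p)]

def sample_cov(divisor):
    return [[sum((x[i] - xbar[i]) * (x[k] - xbar[k]) for x in X) / divisor
             for k in range(p)] for i in range(p)]

S_N = sample_cov(N)    # biased formula (divisor N)
S = sample_cov(N - 1)  # unbiased formula (divisor N - 1)

def fiml_by_observation(mu, Sigma):
    """Average of the observation-level terms."""
    Sinv, logdet = inv2(Sigma), math.log(det2(Sigma))
    return sum(logdet + quad([x[i] - mu[i] for i in range(p)], Sinv) + K
               for x in X) / N

def fiml_by_summary(mu, Sigma):
    """Same function written with the sample mean and S_N."""
    Sinv = inv2(Sigma)
    d = [xbar[i] - mu[i] for i in range(p)]
    return math.log(det2(Sigma)) + trace_prod(S_N, Sinv) + quad(d, Sinv) + K

def f_ml(mu, Sigma, S_used):
    """ML discrepancy function for a given sample covariance matrix."""
    Sinv = inv2(Sigma)
    d = [xbar[i] - mu[i] for i in range(p)]
    return (math.log(det2(Sigma)) + trace_prod(S_used, Sinv) + quad(d, Sinv)
            - p - math.log(det2(S_used)))

# Two hypothetical parameter settings (mu, Sigma).
theta_a = ([2.5, 3.0], [[2.0, 1.2], [1.2, 2.5]])
theta_b = ([3.0, 3.2], [[3.0, 0.5], [0.5, 1.5]])

# With divisor N in the ML function, the gap to FIML does not depend on the parameters.
gap_a = fiml_by_summary(*theta_a) - f_ml(*theta_a, S_N)
gap_b = fiml_by_summary(*theta_b) - f_ml(*theta_b, S_N)

# With divisor N - 1, the gap changes with the parameters.
gap_unbiased_a = fiml_by_summary(*theta_a) - f_ml(*theta_a, S)
gap_unbiased_b = fiml_by_summary(*theta_b) - f_ml(*theta_b, S)

print(fiml_by_observation(*theta_a), fiml_by_summary(*theta_a))
print(gap_a, gap_b, gap_unbiased_a, gap_unbiased_b)
```

In this sketch, the observation-level and summary forms of the FIML function agree, and the gap between the FIML and ML functions is the same at both parameter settings only when both use the divisor N.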

The similarity (or dissimilarity) of the ML and FIML discrepancy functions leads to some useful conclusions here:

Because the constant terms in the discrepancy functions play no part in parameter estimation (except for shifting the function values), overriding the default ML method with VARDEF=N (that is, using N as the divisor in the covariance matrix formula) leads to the same estimation results as the FIML method, provided that there are no missing values in the analysis.
Because the FIML function is evaluated at the level of individual observations, it is much more expensive to compute than the ML function. As compared with ML estimation, FIML estimation takes longer and uses more computing resources. Hence, for data without missing values, the ML method should always be chosen over the FIML method.
The advantage of the FIML method lies solely in its ability to handle data with random missing values. While the FIML method uses the information maximally from each observation, the ML method (as implemented in PROC CALIS) simply throws away any observations with at least one missing value. If it is important to use the information from observations with random missing values, the FIML method should be given consideration over the ML method.
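The contrast in the last point can be sketched for two variables. The data, parameter values, and the helper `contribution` are hypothetical. The sketch assumes that an incomplete observation contributes the normal-theory term for its observed variables only (using the corresponding elements of $\bmu$ and $\bSigma$ ) and that the constant for an observation with $p_ j$ observed variables is $p_ j \ln (2\pi )$. It illustrates the idea only and is not the PROC CALIS implementation.

```python
import math

# Hypothetical data on p = 2 variables; None marks a missing value.
X = [[1.0, 2.0], [2.0, None], [3.0, 3.5], [None, 3.0], [5.0, 5.5]]
mu = [2.5, 3.0]                    # hypothetical model-implied means
Sigma = [[2.0, 1.2], [1.2, 2.5]]   # hypothetical model-implied covariance matrix
LOG_2PI = math.log(2 * math.pi)

def contribution(x):
    """Observation-level term based only on the observed variables of x."""
    obs = [i for i, v in enumerate(x) if v is not None]
    if len(obs) == 0:
        return None  # nothing observed, so nothing to contribute
    if len(obs) == 1:
        # One observed variable: use its mean and variance only.
        i = obs[0]
        var = Sigma[i][i]
        return math.log(var) + (x[i] - mu[i]) ** 2 / var + LOG_2PI
    # Both variables observed: use the full 2 x 2 covariance matrix.
    det = Sigma[0][0] * Sigma[1][1] - Sigma[0][1] ** 2
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    q = (Sigma[1][1] * d0 * d0 - 2 * Sigma[0][1] * d0 * d1
         + Sigma[0][0] * d1 * d1) / det
    return math.log(det) + q + 2 * LOG_2PI

# Listwise deletion keeps only the observations without missing values.
complete_cases = [x for x in X if None not in x]

# The observation-level approach keeps every observation with at least one value.
contributions = [contribution(x) for x in X]
used_by_fiml = [c for c in contributions if c is not None]

print(len(complete_cases), len(used_by_fiml))
```

Here listwise deletion retains three of the five observations, while the observation-level approach uses all five.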

See Example 29.15 for an application of the FIML method and Example 29.16 for an empirical comparison of the ML and FIML methods.