The CALIS Procedure

Measures of Multivariate Kurtosis

In many applications, the manifest variables are not even approximately multivariate normal. If this is the case with your data set, the default generalized least squares and maximum likelihood estimation methods are not appropriate, and you should compute the parameter estimates and their standard errors by an asymptotically distribution-free method, such as the WLS estimation method. If your manifest variables are multivariate normal, then Mardia’s multivariate kurtosis is zero (equivalently, the relative multivariate kurtosis equals 1), and all marginal distributions have zero kurtosis (Browne, 1982). If your DATA= data set contains raw data, PROC CALIS computes univariate skewness and kurtosis and a set of multivariate kurtosis values. By default, the values of univariate skewness and kurtosis are corrected for bias (as in PROC UNIVARIATE), but the BIASKUR option enables you to compute the uncorrected values as well. The values are displayed when you specify the KURTOSIS option in the PROC CALIS statement.

In the following formulas, N denotes the sample size and p denotes the number of variables.

  • corrected variance for variable $z_ j$

    \[  \sigma _ j^2 = \frac{1}{N-1} {\sum _ i^ N(z_{ij} - \bar{z}_ j)^2}  \]

  • uncorrected univariate skewness for variable $z_ j$

    \[  \gamma _{1(j)} = \frac{ {N \sum _ i^ N (z_{ij} - \bar{z}_ j)^3}}{\sqrt {N [\sum _ i^ N (z_{ij} - \bar{z}_ j)^2]^3 } }  \]
  • corrected univariate skewness for variable $z_ j$

    \[  \gamma _{1(j)} = \frac{N}{(N - 1)(N - 2)} \frac{{\sum _ i^ N (z_{ij} - \bar{z}_ j)^3}}{\sigma _ j^3}  \]
  • uncorrected univariate kurtosis for variable $z_ j$

    \[  \gamma _{2(j)} = \frac{ N {\sum _ i^ N (z_{ij} - \bar{z}_ j)^4}}{[\sum _ i^ N (z_{ij} - \bar{z}_ j)^2]^2 } - 3  \]
  • corrected univariate kurtosis for variable $z_ j$

    \[  \gamma _{2(j)} = \frac{N(N + 1)}{(N - 1)(N - 2)(N - 3)} \frac{ {\sum _ i^ N (z_{ij} - \bar{z}_ j)^4}}{\sigma _ j^4} - \frac{3(N - 1)^2}{(N - 2)(N - 3)}  \]
  • Mardia’s multivariate kurtosis

    \[  \gamma _2 = {\frac{1}{N} {\sum _ i^ N[(z_ i - \bar{z})^{\prime } \mb{S}^{-1} (z_ i - \bar{z})]^2} - p(p + 2) }  \]

    where $\mb{S}$ is the biased sample covariance matrix with N as the divisor.

  • relative multivariate kurtosis

    \[  \eta _2 = \frac{ {\gamma _2 + p(p + 2)}}{ {p(p + 2)} }  \]
  • normalized multivariate kurtosis

    \[  \kappa _0 = \frac{ \gamma _2 }{\sqrt {8p(p + 2) / N} }  \]

  • Mardia-based kappa

    \[  \kappa _1 = \frac{\gamma _2}{p(p + 2) }  \]
  • mean scaled univariate kurtosis

    \[  \kappa _2 = { \frac{1}{3p} {\sum _ j^ p \gamma _{2(j)}} }  \]
  • adjusted mean scaled univariate kurtosis

    \[  \kappa _3 = { \frac{1}{3p} {\sum _ j^ p \gamma ^*_{2(j)}} }  \]

    with

    \[  \gamma ^*_{2(j)} = \left\{  \begin{matrix}  \gamma _{2(j)} \quad ,   &  \quad \mbox{if} \quad \gamma _{2(j)} > \frac{-6}{p+2}   \\ \frac{-6}{p+2} \quad ,   &  \mbox{otherwise}   \\ \end{matrix} \right.  \]
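
The formulas in this list can be checked numerically outside SAS. The following Python sketch is an illustration only, not the PROC CALIS implementation: the function names (`univariate_measures`, `mardia_kurtosis`, `relative_and_normalized`) and the small linear solver are hypothetical helpers introduced here, and the code assumes complete raw data with at least four observations and a nonsingular covariance matrix.

```python
import math

def univariate_measures(z):
    """Uncorrected and corrected skewness and kurtosis of one variable."""
    n = len(z)  # the corrected kurtosis requires n >= 4
    mean = sum(z) / n
    s2 = sum((x - mean) ** 2 for x in z)
    s3 = sum((x - mean) ** 3 for x in z)
    s4 = sum((x - mean) ** 4 for x in z)
    var = s2 / (n - 1)  # corrected variance sigma_j^2
    return {
        "skew_uncorrected": n * s3 / math.sqrt(n * s2 ** 3),
        "skew_corrected": n / ((n - 1) * (n - 2)) * s3 / var ** 1.5,
        "kurt_uncorrected": n * s4 / s2 ** 2 - 3,
        "kurt_corrected": (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * s4 / var ** 2
                           - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))),
    }

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def mardia_kurtosis(rows):
    """Mardia's multivariate kurtosis gamma_2 for a list of observation rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    dev = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Biased sample covariance matrix S (divisor N)
    s = [[sum(d[j] * d[k] for d in dev) / n for k in range(p)] for j in range(p)]
    # Squared Mahalanobis-type distance (z_i - zbar)' S^{-1} (z_i - zbar) per row
    d2 = [sum(x * y for x, y in zip(d, solve(s, d))) for d in dev]
    return sum(v * v for v in d2) / n - p * (p + 2)

def relative_and_normalized(gamma2, n, p):
    """Relative (eta_2) and normalized (kappa_0) multivariate kurtosis."""
    eta2 = (gamma2 + p * (p + 2)) / (p * (p + 2))
    kappa0 = gamma2 / math.sqrt(8 * p * (p + 2) / n)
    return eta2, kappa0
```

With a single variable (p = 1), `mardia_kurtosis` reduces algebraically to the uncorrected univariate kurtosis, which gives a quick consistency check between the two functions.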

If variable $Z_ j$ is normally distributed, the uncorrected univariate kurtosis $\gamma _{2(j)}$ is equal to 0. If Z has a p-variate normal distribution, Mardia’s multivariate kurtosis $\gamma _2$ is equal to 0. A variable $Z_ j$ is called leptokurtic if it has a positive value of $\gamma _{2(j)}$ and platykurtic if it has a negative value of $\gamma _{2(j)}$. The values of $\kappa _1$, $\kappa _2$, and $\kappa _3$ should not be smaller than the following lower bound (Bentler, 1985):

\[  \hat{\kappa } \geq \frac{-2}{p + 2}  \]

PROC CALIS displays a message if $\kappa _1$, $\kappa _2$, or $\kappa _3$ falls below the lower bound.
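
The bound check can be sketched in a few lines of Python. This is a hypothetical illustration, not the procedure's own code: `kappa_measures` and `below_bound` are names invented here, and the list `uni_kurt` is assumed to hold the univariate kurtosis values $\gamma _{2(j)}$ that you want to average (this section does not state whether the corrected or uncorrected values are used).

```python
def kappa_measures(gamma2, uni_kurt):
    """kappa_1, kappa_2, kappa_3 from Mardia's gamma_2 and univariate kurtoses."""
    p = len(uni_kurt)
    floor = -6.0 / (p + 2)  # lower limit applied to each gamma_2(j) in kappa_3
    return {
        "kappa1": gamma2 / (p * (p + 2)),
        "kappa2": sum(uni_kurt) / (3 * p),
        "kappa3": sum(max(g, floor) for g in uni_kurt) / (3 * p),
    }

def below_bound(kappas, p):
    """Names of the kappa values that fall below the bound -2 / (p + 2)."""
    bound = -2.0 / (p + 2)
    return [name for name, value in kappas.items() if value < bound]
```

Because each adjusted value $\gamma ^*_{2(j)}$ is at least $-6/(p+2)$, the mean scaled by $1/3$ cannot fall below $-2/(p+2)$, so in this sketch only `kappa1` and `kappa2` can ever be flagged.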

If you specify weighted least squares estimation (METHOD=WLS or METHOD=ADF) and the weight matrix is computed from an input raw data set, PROC CALIS computes two more measures of multivariate kurtosis.

  • multivariate mean kappa

    \[  \kappa _4 = \frac{1}{m} {\sum _ i^ p \sum _ j^ i \sum _ k^ j \sum _ l^ k \hat{\kappa }_{ij,kl}} - 1 \quad  \]

    where

    \[  \hat{\kappa }_{ij,kl}=\frac{s_{ij,kl}}{ s_{ij}s_{kl} + s_{ik}s_{jl} + s_{il}s_{jk} }  \]

    and $m=p(p+1)(p+2)(p+3)/24$ is the number of elements in the vector $s_{ij,kl}$ (Bentler, 1985).

  • multivariate least squares kappa

    \[  \kappa _5 = \frac{s_4^{\prime } s_2}{s_2^{\prime } s_2} - 1  \]

    where $s_2$ is the vector of the elements in the denominator of $\hat{\kappa }_{ij,kl}$ (Bentler, 1985) and $s_4$ is the vector of the $s_{ij,kl}$, which is defined as

    \[  s_{ij,kl} = \frac{1}{N} \sum _{r=1}^ N{(z_{ri} - \bar{z}_ i)(z_{rj} - \bar{z}_ j) (z_{rk} - \bar{z}_ k)(z_{rl} - \bar{z}_ l)}  \]
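
As a rough illustration of how $\kappa _4$ and $\kappa _5$ fit together, the following Python sketch enumerates the index quadruples with $l \leq k \leq j \leq i$ and builds the vectors $s_4$ and $s_2$. It is not the PROC CALIS implementation. The function name `bentler_kappas` is hypothetical, and the sketch assumes that $s_{ij}$ is the covariance with divisor N (matching the divisor in $s_{ij,kl}$), which this section does not state explicitly. It also does not guard against a zero denominator in $\hat{\kappa }_{ij,kl}$.

```python
from itertools import combinations_with_replacement

def bentler_kappas(rows):
    """Multivariate mean kappa (kappa_4) and least squares kappa (kappa_5)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    dev = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Assumption: s_ij uses divisor N, like the fourth-order moments s_ij,kl
    cov = [[sum(d[a] * d[b] for d in dev) / n for b in range(p)] for a in range(p)]
    s4, s2 = [], []
    # Each tuple is nondecreasing, so unpacking gives l <= k <= j <= i
    for l, k, j, i in combinations_with_replacement(range(p), 4):
        s4.append(sum(d[i] * d[j] * d[k] * d[l] for d in dev) / n)
        s2.append(cov[i][j] * cov[k][l] + cov[i][k] * cov[j][l] + cov[i][l] * cov[j][k])
    m = len(s4)  # equals p(p+1)(p+2)(p+3)/24
    kappa4 = sum(a / b for a, b in zip(s4, s2)) / m - 1
    kappa5 = sum(a * b for a, b in zip(s4, s2)) / sum(b * b for b in s2) - 1
    return {"kappa4": kappa4, "kappa5": kappa5, "m": m}
```

For a single variable, both measures reduce to the uncorrected univariate kurtosis divided by 3, which is a convenient sanity check on the indexing.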

Significant nonzero values of Mardia’s multivariate kurtosis $\gamma _2$ and significant values of some of the univariate kurtosis measures $\gamma _{2(j)}$ indicate that your variables are not multivariate normally distributed. Violating the multivariate normality assumption in (default) generalized least squares and maximum likelihood estimation usually leads to incorrect approximate standard errors and incorrect fit statistics based on the $\chi ^2$ value. In general, the parameter estimates themselves are more robust to violations of the normality assumption. For more details, see Browne (1974, 1982, 1984).