The MI Procedure

Descriptive Statistics

Suppose $\mb{Y} = ( \mb{y}_{1}, \mb{y}_{2}, \ldots , \mb{y}_{n} )^{'}$ is the $(n{\times }p)$ matrix of complete data, some elements of which might not be observed. Let $n_{0}$ be the number of fully observed observations (the complete cases), and let $n_{j}$ be the number of observations with an observed value for variable $Y_{j}$.

Using only the complete cases, the sample mean vector is

\[ \overline{\mb{y}} = \frac{1}{n_{0}} \sum {\mb{y}_{i}} \]

and the corrected sums of squares and crossproducts (CSSCP) matrix is

\[ \sum { ( \mb{y}_{i} - \overline{\mb{y}} ) ( \mb{y}_{i} - \overline{\mb{y}} )^{'} } \]

where each summation is over the fully observed observations.

The sample covariance matrix is

\[ \mb{S} = \frac{1}{\, n_{0}-1 \, } \sum { ( \mb{y}_{i} - \overline{\mb{y}} ) ( \mb{y}_{i} - \overline{\mb{y}} )^{'} } \]

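and is an unbiased estimate of the covariance matrix, provided that the complete cases are a random subsample of all observations (that is, the data are missing completely at random).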

The correlation matrix $\mb{R}$, which contains the Pearson product-moment correlations of the variables, is derived by scaling the corresponding covariance matrix:

\[ \mb{R} = \mb{D}^{-1} \mb{S} \, \mb{D}^{-1} \]

where $\mb{D}$ is a diagonal matrix whose diagonal elements are the square roots of the diagonal elements of $\mb{S}$.
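
The complete-case formulas above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the procedure's implementation: the function name `complete_case_stats`, the use of `None` to mark a missing value, and the small data set are all hypothetical.

```python
import math

def complete_case_stats(rows):
    """Return n0, mean vector, CSSCP, covariance S, and correlation R.

    `rows` is a list of equal-length lists; None marks a missing value
    (an assumption of this sketch). Only fully observed rows are used.
    """
    complete = [r for r in rows if all(v is not None for v in r)]
    n0 = len(complete)
    if n0 < 2:
        raise ValueError("need at least two fully observed rows")
    p = len(complete[0])

    # Sample mean vector over the n0 complete cases
    mean = [sum(r[j] for r in complete) / n0 for j in range(p)]

    # Corrected sums of squares and crossproducts (CSSCP)
    csscp = [[sum((r[j] - mean[j]) * (r[k] - mean[k]) for r in complete)
              for k in range(p)] for j in range(p)]

    # S = CSSCP / (n0 - 1)
    cov = [[csscp[j][k] / (n0 - 1) for k in range(p)] for j in range(p)]

    # R = D^{-1} S D^{-1}, where D holds the square roots of diag(S).
    # A variable with zero variance would raise ZeroDivisionError here.
    d = [math.sqrt(cov[j][j]) for j in range(p)]
    corr = [[cov[j][k] / (d[j] * d[k]) for k in range(p)] for j in range(p)]
    return n0, mean, csscp, cov, corr

# Hypothetical data: two variables, five observations, two of them incomplete
data = [
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 7.0],
    [4.0, None],
    [None, 5.0],
]
n0, mean, csscp, cov, corr = complete_case_stats(data)
```

In this example only the first three rows contribute, so $n_{0} = 3$; the last two rows are dropped entirely even though each contains one observed value.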

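With available cases, the statistics for variable $Y_{j}$ use every observation in which $Y_{j}$ is observed, whether or not the other variables are missing. The corrected sum of squares for variable $Y_{j}$ is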

\[ \sum { ( y_{ji} - {\overline y_{j}} )^2 } \]

where ${\overline y_{j}} = \frac{1}{n_{j}} \sum {y_{ji}}$ is the sample mean and each summation is over observations with observed values for variable $Y_{j}$.

The variance is

\[ s_{jj}^2 = \frac{1}{\, n_{j}-1 \, } \sum { ( y_{ji} - {\overline y_{j}} )^2 } \]

The available-case correlation matrix contains the pairwise correlation for each pair of variables. Each correlation is computed from all observations that have nonmissing values for both variables in the pair.
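
The available-case statistics can be sketched the same way. The function names `available_case_variance` and `pairwise_correlation`, the `None` missing-value marker, and the data are hypothetical. The sketch also assumes that each pairwise correlation uses means and sums of squares recomputed over the same pairwise subset of observations; the text above does not spell out that detail.

```python
import math

def available_case_variance(rows, j):
    """Return n_j, mean, corrected sum of squares, and variance for column j."""
    vals = [r[j] for r in rows if r[j] is not None]
    nj = len(vals)
    if nj < 2:
        raise ValueError("need at least two observed values")
    mean = sum(vals) / nj
    css = sum((v - mean) ** 2 for v in vals)
    return nj, mean, css, css / (nj - 1)

def pairwise_correlation(rows, j, k):
    """Pearson correlation from rows where both column j and column k are observed.

    Assumption of this sketch: means and sums of squares are taken over
    the pairwise-complete rows only.
    """
    pairs = [(r[j], r[k]) for r in rows
             if r[j] is not None and r[k] is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two pairwise-complete rows")
    mj = sum(a for a, _ in pairs) / n
    mk = sum(b for _, b in pairs) / n
    sjk = sum((a - mj) * (b - mk) for a, b in pairs)
    sjj = sum((a - mj) ** 2 for a, _ in pairs)
    skk = sum((b - mk) ** 2 for _, b in pairs)
    return sjk / math.sqrt(sjj * skk)

# Hypothetical data: two variables, five observations, two of them incomplete
data = [
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 7.0],
    [4.0, None],
    [None, 5.0],
]
nj, mean_j, css_j, var_j = available_case_variance(data, 0)
r01 = pairwise_correlation(data, 0, 1)
```

Here the first variable has four observed values, so its mean and variance use $n_{j} = 4$ observations, while the correlation between the two variables uses only the three rows in which both are observed.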