The MI Procedure

Descriptive Statistics

Suppose $\mb {Y} = ( \mb {y}_{1}, \mb {y}_{2}, \ldots , \mb {y}_{n} )^{\prime }$ is the $(n{\times }p)$ matrix of complete data, which might not be fully observed, $n_{0}$ is the number of fully observed observations, and $n_{j}$ is the number of observations with observed values for variable $Y_{j}$.

With complete cases, the sample mean vector is

\[  \overline{\mb {y}} = \frac{1}{n_{0}} \sum {\mb {y}_{i}}  \]

and the CSSCP matrix is

\[  \sum { ( \mb {y}_{i} - \overline{\mb {y}} ) ( \mb {y}_{i} - \overline{\mb {y}} )^{\prime } }  \]

where each summation is over the fully observed observations.

The sample covariance matrix is

\[  \mb {S} = \frac{1}{\,  n_{0}-1 \, } \sum { ( \mb {y}_{i} - \overline{\mb {y}} ) ( \mb {y}_{i} - \overline{\mb {y}} )^{\prime } }  \]

and is an unbiased estimate of the covariance matrix.

The correlation matrix $\mb {R}$, which contains the Pearson product-moment correlations of the variables, is derived by scaling the corresponding covariance matrix:

\[  \mb {R} = \mb {D}^{-1} \mb {S} \,  \mb {D}^{-1}  \]

where $\mb {D}$ is a diagonal matrix whose diagonal elements are the square roots of the diagonal elements of $\mb {S}$.
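The complete-case formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure's implementation: the function name `complete_case_stats`, the use of `None` to mark a missing value, and the small data set are all hypothetical.

```python
import math

def complete_case_stats(rows):
    """Mean vector, CSSCP, covariance S, and correlation R from complete cases.

    Assumption of this sketch: a missing value is represented by None.
    """
    # Keep only the fully observed observations (the n_0 complete cases).
    complete = [r for r in rows if all(v is not None for v in r)]
    n0 = len(complete)
    if n0 < 2:
        raise ValueError("need at least two complete cases")
    p = len(complete[0])

    # Sample mean vector over the complete cases.
    mean = [sum(r[j] for r in complete) / n0 for j in range(p)]

    # CSSCP matrix: sum of outer products of the centered observations.
    csscp = [[sum((r[j] - mean[j]) * (r[k] - mean[k]) for r in complete)
              for k in range(p)] for j in range(p)]

    # Covariance matrix: S = CSSCP / (n_0 - 1).
    cov = [[csscp[j][k] / (n0 - 1) for k in range(p)] for j in range(p)]

    # Correlation matrix: R = D^{-1} S D^{-1}, with D = diag(sqrt(S_jj)).
    # A variable with zero variance would cause a division by zero here.
    d = [math.sqrt(cov[j][j]) for j in range(p)]
    corr = [[cov[j][k] / (d[j] * d[k]) for k in range(p)] for j in range(p)]
    return n0, mean, csscp, cov, corr

# Hypothetical data: five observations on two variables, two of them incomplete.
data = [
    [1.0, 2.0],
    [2.0, None],
    [3.0, 6.0],
    [None, 1.0],
    [5.0, 10.0],
]
n0, mean, csscp, cov, corr = complete_case_stats(data)
```

In this example only three observations are fully observed, so every statistic is computed from those three rows and the partially observed rows contribute nothing.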

With available cases, the corrected sum of squares for variable $Y_{j}$ is

\[  \sum { ( y_{ji} - {\overline y_{j}} )^2 }  \]

where ${\overline y_{j}} = \frac{1}{n_{j}} \sum {y_{ji}}$ is the sample mean and each summation is over observations with observed values for variable $Y_{j}$.

The sample variance of $Y_{j}$ is

\[  s_{jj} = \frac{1}{\,  n_{j}-1 \, } \sum { ( y_{ji} - {\overline y_{j}} )^2 }  \]

With available cases, the correlation matrix contains pairwise correlations. The correlation for each pair of variables is computed from all observations that have nonmissing values for both variables in the pair.
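A minimal Python sketch of the available-case statistics follows. The function names, the `None` marker for missing values, and the data are hypothetical. The sketch also assumes that each pairwise correlation uses means and sums of squares recomputed from the pairwise-complete observations; the text does not spell out that detail.

```python
import math

def available_case_moments(rows):
    """Per-variable counts n_j, means, and variances from available cases."""
    p = len(rows[0])
    counts, means, variances = [], [], []
    for j in range(p):
        # Use every observation with an observed value for variable j.
        obs = [r[j] for r in rows if r[j] is not None]
        nj = len(obs)
        if nj < 2:
            raise ValueError("need at least two observed values per variable")
        mean_j = sum(obs) / nj
        css_j = sum((y - mean_j) ** 2 for y in obs)  # corrected sum of squares
        counts.append(nj)
        means.append(mean_j)
        variances.append(css_j / (nj - 1))
    return counts, means, variances

def pairwise_correlation(rows, j, k):
    """Pearson correlation from observations with both variables observed."""
    pairs = [(r[j], r[k]) for r in rows
             if r[j] is not None and r[k] is not None]
    m = len(pairs)
    if m < 2:
        raise ValueError("need at least two pairwise-complete observations")
    # Assumption: means are recomputed on the pairwise-complete subset.
    mean_j = sum(a for a, _ in pairs) / m
    mean_k = sum(b for _, b in pairs) / m
    s_jk = sum((a - mean_j) * (b - mean_k) for a, b in pairs)
    s_jj = sum((a - mean_j) ** 2 for a, _ in pairs)
    s_kk = sum((b - mean_k) ** 2 for _, b in pairs)
    return s_jk / math.sqrt(s_jj * s_kk)

# Hypothetical data: five observations on two variables, None marks a missing value.
data = [
    [1.0, 2.0],
    [2.0, None],
    [3.0, 6.0],
    [None, 1.0],
    [5.0, 10.0],
]
counts, means, variances = available_case_moments(data)
r01 = pairwise_correlation(data, 0, 1)
```

Here each variable has four observed values, so its mean and variance use four rows, while the correlation uses only the three rows in which both variables are observed. Because different entries can rest on different subsets of the data, a matrix assembled from pairwise correlations is not guaranteed to be positive semidefinite.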