The MI Procedure

EM Algorithm for Data with Missing Values

The EM algorithm (Dempster, Laird, and Rubin 1977) is a technique that finds maximum likelihood estimates in parametric models for incomplete data. The books by Little and Rubin (1987), Schafer (1997), and McLachlan and Krishnan (1997) provide detailed descriptions and applications of the EM algorithm.

The EM algorithm is an iterative procedure that finds the MLE of the parameter vector by repeating the following steps:

1. The expectation E-step: Given a set of parameter estimates, such as a mean vector and covariance matrix for a multivariate normal distribution, the E-step calculates the conditional expectation of the complete-data log likelihood given the observed data and the parameter estimates.

2. The maximization M-step: Given the expected complete-data log likelihood from the E-step, the M-step finds the parameter estimates that maximize it.

The two steps are iterated until the iterations converge.
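
For multivariate normal data, one iteration of the two steps can be sketched as follows. This is the standard textbook form (see Little and Rubin 1987), offered as an illustration and not necessarily the exact computation that PROC MI performs. The notation is introduced here for the sketch: the superscript (t) marks the current estimates, y_{i,obs} and y_{i,mis} are the observed and missing parts of the ith observation, and the subscripts obs and mis on {\mu} and {\Sigma} select the matching subvectors and blocks, which vary with i. The E-step fills in each missing part with its conditional mean and records the conditional covariance:

\hat{y}_{i,mis} = {\mu}_{mis}^{(t)} + {\Sigma}_{mis,obs}^{(t)} \, ({\Sigma}_{obs,obs}^{(t)})^{-1} \, (y_{i,obs} - {\mu}_{obs}^{(t)})

C_{i} = {\Sigma}_{mis,mis}^{(t)} - {\Sigma}_{mis,obs}^{(t)} \, ({\Sigma}_{obs,obs}^{(t)})^{-1} \, {\Sigma}_{obs,mis}^{(t)}

The M-step then updates the estimates from the completed observations:

{\mu}^{(t+1)} = {\frac 1n} \, \sum_{i=1}^n \hat{y}_{i}

{\Sigma}^{(t+1)} = {\frac 1n} \, \sum_{i=1}^n { [ (\hat{y}_{i} - {\mu}^{(t+1)}) (\hat{y}_{i} - {\mu}^{(t+1)})' + \tilde{C}_{i} ] }

where \hat{y}_{i} is y_{i} with its missing entries replaced by \hat{y}_{i,mis}, and \tilde{C}_{i} is the matrix that has C_{i} in the rows and columns of the missing variables and zeros elsewhere. The \tilde{C}_{i} term is what distinguishes EM from simply imputing conditional means and recomputing the covariance matrix.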

In the EM process, the observed-data log likelihood is non-decreasing at each iteration. For multivariate normal data, suppose there are G groups with distinct missing patterns. Then the observed-data log likelihood being maximized can be expressed as

{\rm ln} \, L({\theta}| Y_{obs})=\sum_{g=1}^G {{\rm ln} \, L_{g}({\theta}| Y_{obs})}

where {\rm ln} \, L_{g}({\theta}| Y_{obs}) is the observed-data log likelihood from the gth group, and

{\rm ln} \, L_{g}({\theta}| Y_{obs})=- {\frac {n_{g}}2} \, {\rm ln} \, | {\Sigma}_{g} | - {\frac 12} \, \sum_{ig}{ (y_{ig} - {\mu}_{g})' {{\Sigma}_{g}}^{-1} (y_{ig} - {\mu}_{g}) }

where n_{g} is the number of observations in the gth group, the summation is over observations in the gth group, y_{ig} is a vector of observed values corresponding to observed variables, {\mu}_{g} is the corresponding mean vector, and {\Sigma}_{g} is the associated covariance matrix.

Refer to Schafer (1997, pp. 163-181) for a detailed description of the EM algorithm for multivariate normal data.

PROC MI uses the means and standard deviations from available cases as the initial estimates for the EM algorithm. The correlations are set to zero. For a discussion of suggested starting values for the algorithm, see Schafer (1997, p. 169).
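
Written out, and assuming p analysis variables, these starting values amount to the following, where \bar{y}_{j} and s_{j} (symbols introduced here for illustration) are the mean and standard deviation of the jth variable computed from its available cases:

{\mu}^{(0)} = (\bar{y}_{1}, \ldots, \bar{y}_{p})'

{\Sigma}^{(0)} = {\rm diag}(s_{1}^2, \ldots, s_{p}^2)

Setting the correlations to zero is what makes the initial covariance matrix diagonal.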

You can specify the convergence criterion with the CONVERGE= option in the EM statement. The iterations are considered to have converged when the maximum change in the parameter estimates between iteration steps is less than the value specified. You can also specify the maximum number of iterations used in the EM algorithm with the MAXITER= option.

The MI procedure displays tables of the initial parameter estimates used to begin the EM process and the MLE parameter estimates derived from EM. You can also display the EM iteration history with the ITPRINT option. PROC MI lists the iteration number, the value of -2 Log L, and the parameter values {\mu} at each iteration. You can also save the MLE derived from the EM algorithm in a SAS data set specified with the OUTEM= option.
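
The following step is a minimal sketch of how the EM statement options described above might be combined. The data set name Fitness1, the variable names, the output data set name EMEst, and the CONVERGE= and MAXITER= values are all assumptions chosen for illustration. NIMPUTE=0 is used on the assumption that only the EM estimates are wanted and no imputations should be generated.

```sas
/* Hypothetical input: WORK.FITNESS1 with numeric variables
   Oxygen, RunTime, and RunPulse, some values missing.      */
proc mi data=Fitness1 nimpute=0;   /* nimpute=0: request no imputations */
   em converge=1e-4   /* stop when the maximum parameter change is below 1E-4 */
      maxiter=200     /* allow at most 200 EM iterations */
      itprint         /* display the EM iteration history */
      outem=EMEst;    /* save the EM estimates in WORK.EMEST */
   var Oxygen RunTime RunPulse;
run;

/* Inspect the saved estimates */
proc print data=EMEst;
run;
```

If the iteration history shows the parameter change still above the CONVERGE= value at the last iteration, raising MAXITER= is the first thing to try.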


Copyright © 2001 by SAS Institute Inc., Cary, NC, USA. All rights reserved.