The MI Procedure

Combining Inferences from Multiply Imputed Data Sets

With m imputations, m different sets of the point and variance estimates for a parameter Q can be computed. Suppose \hat{Q_i} and \hat{U_i} are the point and variance estimates from the ith imputed data set, i=1, 2, ..., m. Then the combined point estimate for Q from multiple imputation is the average of the m complete-data estimates:

{\overline Q}=\frac{1}m \sum_{i=1}^m \hat{Q_i}

Suppose {\overline U} is the within-imputation variance, which is the average of the m complete-data variance estimates:

{\overline U}=\frac{1}m \sum_{i=1}^m \hat{U_i}

and B is the between-imputation variance

B=\frac{1}{m-1} \sum_{i=1}^m (\hat{Q_i}-{\overline Q})^2

Then the variance estimate associated with {\overline Q} is the total variance (Rubin 1987)

T={\overline U} + (1+\frac{1}m)B
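The combining rules above can be sketched in a few lines of Python. This is only an illustration, not how the MI procedure is implemented: the function name `combine_estimates` and the input values are made up for the example, and the inputs are assumed to be scalar point estimates with their variance estimates.

```python
from statistics import mean, variance

def combine_estimates(q_hats, u_hats):
    """Combine m complete-data point and variance estimates.

    q_hats: point estimates, one per imputed data set
    u_hats: the matching variance estimates
    Returns (q_bar, u_bar, b, t).
    """
    m = len(q_hats)
    if m < 2 or len(u_hats) != m:
        raise ValueError("need m >= 2 paired point and variance estimates")
    q_bar = mean(q_hats)          # combined point estimate
    u_bar = mean(u_hats)          # within-imputation variance
    b = variance(q_hats)          # between-imputation variance (divisor m - 1)
    t = u_bar + (1 + 1 / m) * b   # total variance
    return q_bar, u_bar, b, t

# Hypothetical estimates from m = 5 imputed data sets (illustration only).
q_hats = [10.0, 11.0, 9.0, 10.5, 9.5]
u_hats = [1.0, 1.2, 0.8, 1.1, 0.9]
q_bar, u_bar, b, t = combine_estimates(q_hats, u_hats)
print(q_bar, u_bar, b, t)
```

`statistics.variance` is the sample variance with divisor m-1, which matches the definition of B.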

The statistic (Q-{\overline Q}) T^{-(1/2)} is approximately distributed as t with v_{m} degrees of freedom (Rubin 1987), where

v_{m}=(m-1) [1 + \frac{{\overline U}}{(1+m^{-1})B} ]^2

When the complete-data degrees of freedom v_{0} is small and there is only a modest proportion of missing data, the computed degrees of freedom, v_{m}, can be much larger than v_{0}, which is inappropriate. Barnard and Rubin (1999) recommend the use of an adjusted degrees of freedom

v_{m}^{*}=\, [ \frac{1}{v_{m}} + \frac{1}{\hat{v}_{obs}} ] ^{-1}

where \hat{v}_{obs}=(1 - \gamma) \, v_{0} (v_{0}+1) / (v_{0}+3) and \gamma=(1+m^{-1}) B / T.

Note that the MI procedure uses the adjusted degrees of freedom, v_{m}^{*}, for inference.
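As a rough sketch of the two degrees-of-freedom formulas, the following Python computes v_{m} and the adjusted v_{m}^{*} from m, {\overline U}, B, and v_{0}. The function names and the input values (including v_{0} = 20) are assumptions chosen for illustration. Treating B = 0 as giving an infinite v_{m} is also a convention of this sketch and is not stated in the text above.

```python
import math

def rubin_df(m, u_bar, b):
    """v_m = (m - 1) * [1 + u_bar / ((1 + 1/m) * b)]^2."""
    if b == 0:
        return math.inf  # sketch convention: no between-imputation variance
    return (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2

def adjusted_df(m, u_bar, b, v0):
    """Adjusted degrees of freedom v_m^* (Barnard and Rubin 1999)."""
    t = u_bar + (1 + 1 / m) * b        # total variance
    if t <= 0:
        raise ValueError("total variance must be positive")
    gamma = (1 + 1 / m) * b / t
    v_obs = (1 - gamma) * v0 * (v0 + 1) / (v0 + 3)
    v_m = rubin_df(m, u_bar, b)
    return 1 / (1 / v_m + 1 / v_obs)

# Hypothetical inputs: m = 5, U-bar = 1.0, B = 0.625, v0 = 20.
v_m = rubin_df(5, 1.0, 0.625)
v_star = adjusted_df(5, 1.0, 0.625, 20)
print(v_m, v_star)
```

Because v_{m}^{*} is the reciprocal of a sum of reciprocals, it is always smaller than both v_{m} and \hat{v}_{obs}, and therefore smaller than v_{0}.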

The degrees of freedom v_{m} depends on m and the ratio

r=\frac{(1+m^{-1})B}{\overline U}

The ratio r is called the relative increase in variance due to nonresponse (Rubin 1987). When there is no missing information about Q, the values of r and B are both zero. With a large value of m or a small value of r, the degrees of freedom v_{m} will be large and the distribution of (Q-{\overline Q}) T^{-(1/2)} will be approximately normal.

Another useful statistic is the fraction of missing information about Q:

\hat{\lambda}=\frac{r+2/(v_{m}+3)}{r+1}

Both statistics r and \hat{\lambda} are helpful diagnostics for assessing how the missing data contribute to the uncertainty about Q.
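The two diagnostics can be computed directly from m, {\overline U}, and B. The Python below is a minimal sketch under the assumption that the v in the fraction of missing information is the unadjusted v_{m}. The function name and input values are hypothetical, and returning zero for both statistics when B = 0 follows the remark that r and B are zero when there is no missing information.

```python
def missing_info_diagnostics(m, u_bar, b):
    """Return (r, lam): relative increase in variance and
    fraction of missing information about Q."""
    if u_bar <= 0:
        raise ValueError("within-imputation variance must be positive")
    r = (1 + 1 / m) * b / u_bar
    if r == 0:
        return 0.0, 0.0                      # no missing information about Q
    v_m = (m - 1) * (1 + 1 / r) ** 2         # same v_m as above, written via r
    lam = (r + 2 / (v_m + 3)) / (r + 1)
    return r, lam

# Hypothetical inputs: m = 5, U-bar = 1.0, B = 0.625.
r, lam = missing_info_diagnostics(5, 1.0, 0.625)
print(r, lam)
```

Writing v_{m} as (m-1)(1+1/r)^2 is the same formula as before, since {\overline U} / ((1+m^{-1})B) = 1/r.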


Copyright © 2001 by SAS Institute Inc., Cary, NC, USA. All rights reserved.