The MI Procedure

Combining Inferences from Multiply Imputed Data Sets

With m imputations, m different sets of point and variance estimates for a parameter Q can be computed. Suppose $\hat{Q_ i}$ and $\hat{W_ i}$ are the point and variance estimates from the ith imputed data set, i = 1, 2, …, m. Then the combined point estimate for Q from multiple imputation is the average of the m complete-data point estimates:

$\overline{Q} = \frac{1}{m} \sum _{i=1}^{m} \hat{Q_ i}$

Suppose $\overline{W}$ is the within-imputation variance, which is the average of the m complete-data variance estimates,

$\overline{W} = \frac{1}{m} \sum _{i=1}^{m} \hat{W_ i}$

and B is the between-imputation variance

$B = \frac{1}{m-1} \sum _{i=1}^{m} (\hat{Q_ i}-\overline{Q})^2$

Then the variance estimate associated with ${\overline Q}$ is the total variance (Rubin 1987)

$T = \overline{W} + (1+\frac{1}{m}) B$
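The combining rules above can be sketched in a few lines of Python. This is only an illustration of the formulas, not the MI procedure's implementation; the function name `pool_estimates` and the input values are hypothetical.

```python
from statistics import mean, variance


def pool_estimates(q_hats, w_hats):
    """Combine m complete-data point estimates and their variance estimates."""
    m = len(q_hats)
    if m < 2 or len(w_hats) != m:
        raise ValueError("need m >= 2 point estimates and one variance per estimate")
    q_bar = mean(q_hats)          # combined point estimate
    w_bar = mean(w_hats)          # within-imputation variance
    b = variance(q_hats)          # between-imputation variance (divisor m - 1)
    t = w_bar + (1 + 1 / m) * b   # total variance
    return q_bar, w_bar, b, t


# Hypothetical estimates from m = 5 imputed data sets
q_bar, w_bar, b, t = pool_estimates(
    [10.0, 11.0, 9.0, 10.5, 9.5],
    [4.0, 4.2, 3.8, 4.1, 3.9],
)
print(q_bar, w_bar, b, t)
```

The between-imputation variance uses the sample-variance divisor m - 1, which is what `statistics.variance` computes.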

The statistic $(Q-\overline{Q}) T^{-(1/2)}$ is approximately distributed as t with $v_{m}$ degrees of freedom (Rubin 1987), where

$v_{m} = (m-1) {\left[ 1 + \frac{\overline{W}}{(1+m^{-1})B} \right]}^2$

The degrees of freedom $v_{m}$ depend on m and the ratio

$r = \frac{(1+m^{-1})B}{\overline{W}}$

The ratio r is called the relative increase in variance due to nonresponse (Rubin 1987). When there is no missing information about Q, the values of r and B are both zero. With a large value of m or a small value of r, the degrees of freedom $v_{m}$ will be large and the distribution of $(Q-\overline{Q}) T^{-(1/2)}$ will be approximately normal.
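As a rough sketch, the ratio r and the degrees of freedom $v_{m}$ can be computed directly from $\overline{W}$, B, and m. The function name is hypothetical, and returning an infinite $v_{m}$ when B is zero is an assumption made here to reflect the normal limit; it is not taken from the procedure's documented behavior.

```python
import math


def relative_increase_and_df(w_bar, b, m):
    """Return (r, v_m) from the within variance, between variance, and m."""
    if m < 2 or w_bar <= 0 or b < 0:
        raise ValueError("need m >= 2, w_bar > 0, and b >= 0")
    r = (1 + 1 / m) * b / w_bar
    # Assumption: when b == 0 the formula would divide by zero, so v_m is
    # reported as infinite (the reference distribution is then normal).
    v_m = math.inf if r == 0 else (m - 1) * (1 + 1 / r) ** 2
    return r, v_m


# Hypothetical inputs: within variance 4.0, between variance 0.625, m = 5
r, v_m = relative_increase_and_df(4.0, 0.625, 5)
print(r, v_m)
```

Because $v_{m} = (m-1)(1+r^{-1})^2$, a smaller r or a larger m both push $v_{m}$ upward.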

Another useful statistic is the fraction of missing information about Q:

$\hat{\lambda } = \frac{r+2/(v_{m}+3)}{r+1}$

Both statistics r and $\hat{\lambda }$ are helpful diagnostics for assessing how the missing data contribute to the uncertainty about Q.
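The fraction of missing information follows directly from r and $v_{m}$. A minimal sketch, with a hypothetical function name and illustrative inputs:

```python
def fraction_missing_information(r, v_m):
    """Estimated fraction of missing information about Q."""
    if r < 0 or v_m <= 0:
        raise ValueError("need r >= 0 and v_m > 0")
    return (r + 2 / (v_m + 3)) / (r + 1)


# Illustrative values: r = 0.1875 with m = 5 gives v_m = 4 * (1 + 1/r)**2
r = 0.1875
v_m = 4 * (1 + 1 / r) ** 2
lam = fraction_missing_information(r, v_m)
print(lam)
```

For large $v_{m}$ the term $2/(v_{m}+3)$ becomes negligible and the result approaches r/(r+1).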

When the complete-data degrees of freedom $v_{0}$ are small and there is only a modest proportion of missing data, the computed degrees of freedom $v_{m}$ can be much larger than $v_{0}$, which is inappropriate. For example, with m = 5 and r = 10%, the computed degrees of freedom are $v_{m} = (m-1)(1+r^{-1})^2 = 4 \times 11^2 = 484$, which is too large for any data set whose complete-data degrees of freedom are less than 484.

Barnard and Rubin (1999) recommend the use of adjusted degrees of freedom

$v_{m}^{*} = \, \left[ \frac{1}{v_{m}} + \frac{1}{\hat{v}_{\mathit{obs}}} \right] ^{-1}$

where $\hat{v}_{\mathit{obs}} = (1 - \gamma ) \, v_{0} (v_{0}+1) / (v_{0}+3)$ and $\gamma = (1+m^{-1}) B / T$.

Note that the MI procedure uses the adjusted degrees of freedom, $v_{m}^{*}$, for inference.
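The adjustment can be illustrated with a short Python sketch that evaluates the formulas above. It is not the procedure's own code; the function name `adjusted_df` and the inputs are hypothetical, and the sketch assumes B > 0 so that $v_{m}$ is finite.

```python
def adjusted_df(w_bar, b, m, v0):
    """Barnard-Rubin adjusted degrees of freedom, assuming b > 0."""
    if m < 2 or w_bar <= 0 or b <= 0 or v0 <= 0:
        raise ValueError("need m >= 2 and positive w_bar, b, and v0")
    lam_b = (1 + 1 / m) * b                  # (1 + 1/m) * B, used twice below
    t = w_bar + lam_b                        # total variance
    r = lam_b / w_bar                        # relative increase in variance
    v_m = (m - 1) * (1 + 1 / r) ** 2         # unadjusted degrees of freedom
    gamma = lam_b / t
    v_obs = (1 - gamma) * v0 * (v0 + 1) / (v0 + 3)
    return 1 / (1 / v_m + 1 / v_obs)


# Hypothetical inputs chosen so that m = 5 and r = 10%, with v0 = 20
v_star = adjusted_df(w_bar=1.2, b=0.1, m=5, v0=20)
print(v_star)
```

With these inputs the unadjusted $v_{m}$ is about 484, while the adjusted value is roughly 16 and therefore stays below the complete-data degrees of freedom of 20.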