The MI Procedure

Combining Inferences from Multiply Imputed Data Sets

With m imputations, m different sets of the point and variance estimates for a parameter Q can be computed. Suppose $\hat{Q_ i}$ and $\hat{W_ i}$ are the point and variance estimates from the ith imputed data set, i = 1, 2, …, m. Then the combined point estimate for Q from multiple imputation is the average of the m complete-data estimates:

\[ \overline{Q} = \frac{1}{m} \sum _{i=1}^{m} \hat{Q_ i} \]

Suppose $\overline{W}$ is the within-imputation variance, which is the average of the m complete-data variance estimates,

\[ \overline{W} = \frac{1}{m} \sum _{i=1}^{m} \hat{W_ i} \]

and B is the between-imputation variance

\[ B = \frac{1}{m-1} \sum _{i=1}^{m} (\hat{Q_ i}-\overline{Q})^2 \]

Then the variance estimate associated with ${\overline Q}$ is the total variance (Rubin 1987)

\[ T = \overline{W} + \left(1+\frac{1}{m}\right) B \]
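The combining rules above can be sketched in a few lines of Python. This is only an illustration of the formulas, not the MI procedure's own code; the function name `pool_estimates` and the input values are hypothetical.

```python
from statistics import mean, variance


def pool_estimates(q_hats, w_hats):
    """Combine m complete-data point and variance estimates (hypothetical helper).

    q_hats: point estimates, one per imputed data set
    w_hats: variance estimates, one per imputed data set
    Returns (q_bar, w_bar, b, t).
    """
    m = len(q_hats)
    if m < 2 or len(w_hats) != m:
        raise ValueError("need m >= 2 estimates and one variance per estimate")
    q_bar = mean(q_hats)          # combined point estimate
    w_bar = mean(w_hats)          # within-imputation variance
    b = variance(q_hats)          # between-imputation variance (divisor m - 1)
    t = w_bar + (1 + 1 / m) * b   # total variance
    return q_bar, w_bar, b, t


# Made-up estimates from m = 5 imputed data sets
q_bar, w_bar, b, t = pool_estimates(
    [10.0, 11.0, 9.0, 10.5, 9.5],
    [1.0, 1.2, 0.8, 1.1, 0.9],
)
print(q_bar, w_bar, b, t)
```

For these made-up inputs the formulas give $\overline{Q} = 10$, $\overline{W} = 1$, $B = 0.625$, and $T = 1.75$, up to floating-point rounding.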

The statistic $(Q-\overline{Q}) T^{-(1/2)}$ is approximately distributed as t with $v_{m}$ degrees of freedom (Rubin 1987), where

\[ v_{m} = (m-1) {\left[ 1 + \frac{\overline{W}}{(1+m^{-1})B} \right]}^2 \]

The degrees of freedom $v_{m}$ depend on m and the ratio

\[ r = \frac{(1+m^{-1})B}{\overline{W}} \]

The ratio r is called the relative increase in variance due to nonresponse (Rubin 1987). When there is no missing information about Q, the values of r and B are both zero. With a large value of m or a small value of r, the degrees of freedom $v_{m}$ will be large and the distribution of $(Q-\overline{Q}) T^{-(1/2)}$ will be approximately normal.
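As a rough illustration of how r and $v_{m}$ behave, the following Python sketch evaluates both formulas. The function names are hypothetical, and returning infinity when B is zero is an assumption of this sketch (it mirrors the limiting normal case), not documented behavior of the MI procedure.

```python
import math


def relative_increase(w_bar, b, m):
    """Relative increase in variance due to nonresponse, r (hypothetical helper)."""
    return (1 + 1 / m) * b / w_bar


def rubin_df(w_bar, b, m):
    """Degrees of freedom v_m = (m - 1) * (1 + 1/r)^2 (hypothetical helper)."""
    r = relative_increase(w_bar, b, m)
    if r == 0:
        # Assumption of this sketch: no between-imputation variance
        # corresponds to the limiting normal case.
        return math.inf
    return (m - 1) * (1 + 1 / r) ** 2


# Made-up inputs: W-bar = 1.0, B = 0.625, m = 5
r = relative_increase(1.0, 0.625, 5)
v_m = rubin_df(1.0, 0.625, 5)
print(r, v_m)
```

With these inputs r is 0.75 and $v_{m}$ is about 21.8. Shrinking B toward zero while holding m fixed drives $v_{m}$ upward, which matches the normal approximation described above.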

Another useful statistic is the fraction of missing information about Q:

\[ \hat{\lambda } = \frac{r+2/(v_{m}+3)}{r+1} \]

Both statistics r and $\hat{\lambda }$ are helpful diagnostics for assessing how the missing data contribute to the uncertainty about Q.
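A minimal Python sketch of the fraction of missing information, taking r and $v_{m}$ as already computed inputs. The function name is hypothetical and the input values are made up for illustration.

```python
def fraction_missing_info(r, v_m):
    """Estimated fraction of missing information about Q (hypothetical helper).

    r:   relative increase in variance due to nonresponse
    v_m: degrees of freedom of the combined t statistic
    """
    return (r + 2 / (v_m + 3)) / (r + 1)


# Made-up inputs: r = 0.75 and v_m = 196/9 (about 21.8)
lam = fraction_missing_info(0.75, 196 / 9)
print(round(lam, 4))
```

For large $v_{m}$ the correction term $2/(v_{m}+3)$ becomes negligible and the value approaches $r/(r+1)$.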

When the complete-data degrees of freedom $v_{0}$ are small and only a modest proportion of the data is missing, the computed degrees of freedom $v_{m}$ can be much larger than $v_{0}$, which is inappropriate. For example, with m = 5 and r = 10%, the computed degrees of freedom are $v_{m}=484$, which exceeds the complete-data degrees of freedom of any data set with $v_{0} < 484$.
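The value 484 follows directly from the definitions of $v_{m}$ and r. Because the bracketed term in $v_{m}$ equals $1 + 1/r$, substituting m = 5 and r = 0.1 gives

\[ v_{m} = (m-1) {\left[ 1 + \frac{1}{r} \right]}^2 = 4 \, {\left[ 1 + 10 \right]}^2 = 4 \times 121 = 484 \]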

Barnard and Rubin (1999) recommend the use of adjusted degrees of freedom

\[ v_{m}^{*} = \, \left[ \frac{1}{v_{m}} + \frac{1}{\hat{v}_{\mathit{obs}}} \right] ^{-1} \]

where   $\hat{v}_{\mathit{obs}} = (1 - \gamma ) \,  v_{0} (v_{0}+1) / (v_{0}+3)$   and   $\gamma = (1+m^{-1}) B / T$.

Note that the MI procedure uses the adjusted degrees of freedom, $v_{m}^{*}$, for inference.
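The adjustment can be sketched in Python as follows. This is an illustration of the Barnard and Rubin (1999) formula as stated above, not the MI procedure's implementation; the function name and the choice $v_{0} = 20$ are hypothetical, and the handling of B = 0 is an assumption of the sketch.

```python
import math


def barnard_rubin_df(w_bar, b, m, v0):
    """Adjusted degrees of freedom v_m* (hypothetical helper).

    w_bar: within-imputation variance
    b:     between-imputation variance
    m:     number of imputations
    v0:    complete-data degrees of freedom
    """
    between = (1 + 1 / m) * b
    t = w_bar + between                 # total variance
    gamma = between / t
    r = between / w_bar
    # Assumption of this sketch: treat v_m as infinite when B is zero.
    v_m = math.inf if r == 0 else (m - 1) * (1 + 1 / r) ** 2
    v_obs = (1 - gamma) * v0 * (v0 + 1) / (v0 + 3)
    return 1 / (1 / v_m + 1 / v_obs)


# m = 5 and r = 10% (so v_m is about 484), with a made-up v0 = 20
v_star = barnard_rubin_df(w_bar=1.0, b=0.1 / 1.2, m=5, v0=20)
print(v_star)
```

In this example the adjusted value is roughly 16, below both $v_{0} = 20$ and the unadjusted $v_{m} = 484$, which is the behavior the adjustment is meant to guarantee.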