The MI Procedure

 

Example 56.14 Multistage Imputation

This example completes the imputation in two stages, each with its own PROC MI step. In the first stage, the MI procedure uses the MCMC method to impute just enough missing values in a data set with an arbitrary missing pattern so that each imputed data set has a monotone missing pattern. In the second stage, the MI procedure uses a MONOTONE statement to impute the remaining missing values in those data sets with monotone missing patterns.

The following statements are identical to those in Example 56.10. The statements invoke the MI procedure and specify the IMPUTE=MONOTONE option in the MCMC statement to create imputed data sets with a monotone missing pattern. The imputed data sets are saved in the data set outex14.

proc mi data=Fitness1 seed=17655417 out=outex14;
   mcmc impute=monotone;
   var Oxygen RunTime RunPulse;
run;

The "Missing Data Patterns" table in Output 56.14.1 lists distinct missing data patterns with corresponding statistics. Here, an "X" means that the variable is observed in the corresponding group, a "." means that the variable is missing and will be imputed to achieve the monotone missingness for the imputed data set, and an "O" means that the variable is missing and will not be imputed. The table also displays group-specific variable means.

Output 56.14.1 Missing Data Patterns
The MI Procedure

Missing Data Patterns
                                                 Group Means
Group  Oxygen  RunTime  RunPulse  Freq  Percent  Oxygen     RunTime    RunPulse
1      X       X        X         21    67.74    46.353810  10.809524  171.666667
2      X       X        O         4     12.90    47.109500  10.137500  .
3      X       O        O         3     9.68     52.461667  .          .
4      .       X        X         1     3.23     .          11.950000  176.000000
5      .       X        O         2     6.45     .          9.885000   .

As shown in the table, the MI procedure needs to impute only three missing values to achieve a monotone missing pattern for the imputed data set: the Oxygen values for the one observation in group 4 and the two observations in group 5. When the MCMC method is used to produce an imputed data set with a monotone missing pattern, tables of variance information and parameter estimates are not created.
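If you want to review the missing data patterns before running any imputation, one possible approach is to specify NIMPUTE=0, which displays the "Missing Data Patterns" table without creating imputed data sets. The following statements are a minimal sketch and are not part of the original example. Because no MCMC statement with IMPUTE=MONOTONE is specified, the table is expected to mark every missing value with "." rather than distinguishing "." from "O".

```sas
/* Illustrative sketch: display missing data patterns only, no imputation */
proc mi data=Fitness1 nimpute=0;
   var Oxygen RunTime RunPulse;
run;
```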

The following statements impute one value for each remaining missing value in the monotone missingness data set outex14. The BY statement processes each imputed data set from the first stage separately:

proc mi data=outex14
        nimpute=1 seed=51343672
        out=outex14a;
   monotone reg;
   var Oxygen RunTime RunPulse;
   by _Imputation_;
run;

Note that the VAR statement is required with a MONOTONE statement because it provides the variable order for the monotone missing pattern. You can then analyze the completed data sets in outex14a by using other SAS procedures and combine the results by using the MIANALYZE procedure.
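As a minimal sketch of that analysis step, the following statements fit a regression model to each completed data set and then pool the estimates. The model (Oxygen regressed on RunTime and RunPulse) and the data set name outreg14 are assumptions chosen for illustration; they are not part of this example.

```sas
/* Illustrative only: the model and the data set name outreg14 are assumptions */
/* Fit the same regression within each imputed data set;                       */
/* OUTEST= with COVOUT saves the estimates and their covariance matrices       */
proc reg data=outex14a outest=outreg14 covout noprint;
   model Oxygen = RunTime RunPulse;
   by _Imputation_;
run;

/* Combine the per-imputation estimates into a single set of inferences */
proc mianalyze data=outreg14;
   modeleffects Intercept RunTime RunPulse;
run;
```

The COVOUT option matters here because the MIANALYZE procedure needs the covariance matrices, not just the parameter estimates, to compute the within-imputation variance.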

The "Model Information" table in Output 56.14.2 shows that a monotone method is used to generate imputed values in the first BY group.

Output 56.14.2 Model Information
The MI Procedure

Model Information
Data Set                          WORK.OUTEX14
Method                            Monotone
Number of Imputations             1
Seed for random number generator  51343672

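The "Monotone Model Specification" table in Output 56.14.3 describes methods and imputed variables in the imputation model. The MI procedure uses the regression method to impute the variables RunTime and RunPulse in the model. Oxygen, the first variable in the VAR statement, has no missing values after the first stage and therefore is not imputed.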

Output 56.14.3 Monotone Model Specification

Monotone Model Specification
Method      Imputed Variables
Regression  RunTime RunPulse

The "Missing Data Patterns" table in Output 56.14.4 lists distinct missing data patterns with corresponding statistics. It shows a monotone missing pattern for the imputed data set.

Output 56.14.4 Missing Data Patterns

Missing Data Patterns
                                                 Group Means
Group  Oxygen  RunTime  RunPulse  Freq  Percent  Oxygen     RunTime    RunPulse
1      X       X        X         22    70.97    46.057479  10.861364  171.863636
2      X       X        .         6     19.35    46.745227  10.053333  .
3      X       .        .         3     9.68     52.461667  .          .
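The monotone pattern can also be checked directly in the data. For the variable order Oxygen, RunTime, RunPulse, a pattern is monotone when a missing value in one variable implies missing values in every later variable. The following DATA step is a minimal sketch of that check; the data set name nonmono is an assumption made for illustration.

```sas
/* Illustrative check; the data set name nonmono is an assumption */
data nonmono;
   set outex14;
   /* Keep only observations that violate the monotone order:      */
   /* a variable is missing but some later variable is observed    */
   if (missing(Oxygen) and not (missing(RunTime) and missing(RunPulse)))
      or (missing(RunTime) and not missing(RunPulse));
run;

/* Lists the violating observations, if there are any */
proc print data=nonmono;
run;
```

If nonmono contains no observations, every imputed data set in outex14 has a monotone missing pattern in the specified variable order.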

The following statements list the first 10 observations of the data set outex14a in Output 56.14.5:

proc print data=outex14a(obs=10);
   title 'First 10 Observations of the Imputed Data Set';
run;

Output 56.14.5 Imputed Data Set
First 10 Observations of the Imputed Data Set

Obs  _Imputation_  Oxygen   RunTime  RunPulse
1    1             44.6090  11.3700  178.000
2    1             45.3130  10.0700  185.000
3    1             54.2970  8.6500   156.000
4    1             59.5710  7.1569   169.914
5    1             49.8740  9.2200   159.315
6    1             44.8110  11.6300  176.000
7    1             39.8345  11.9500  176.000
8    1             45.3196  10.8500  151.252
9    1             39.4420  13.0800  174.000
10   1             60.0550  8.6300   170.000

This example presents an alternative to full-data MCMC imputation for situations in which only a few missing values must be imputed to achieve a monotone missing pattern for the imputed data set. The first stage uses a monotone MCMC method, which imputes fewer missing values in each iteration and achieves approximate stationarity in fewer iterations than full-data MCMC (Schafer 1997, p. 227). The second stage then completes the imputation with a method for monotone missing data, which does not rely on iterations of steps.
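For comparison, the single-stage, full-data MCMC approach might look like the following sketch. The output data set name outfull is an assumption, and the seed value is arbitrary; neither comes from this example.

```sas
/* Illustrative only: outfull and the seed value are assumptions */
proc mi data=Fitness1 seed=17655417 out=outfull;
   mcmc impute=full;   /* IMPUTE=FULL is the default for the MCMC statement */
   var Oxygen RunTime RunPulse;
run;
```

Here the MCMC method imputes every missing value in a single step, so no second PROC MI step with a MONOTONE statement is needed.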