The MI Procedure

Example 75.13 Transforming to Normality

This example applies the MCMC method to the Fitness1 data set, in which the variable Oxygen is transformed. Assume that Oxygen is skewed and can be transformed to normality with a logarithmic transformation. The following statements invoke the MI procedure; the TRANSFORM statement specifies the log transformation for Oxygen. Note that the values displayed for Oxygen in the result tables are transformed values, whereas the OUT= data set stores Oxygen in its original scale.

proc mi data=Fitness1 seed=32937921
        nimpute=pctmissing mu0=50 10 180
        out=outex13;
   transform log(Oxygen);
   mcmc chain=multiple displayinit;
   var Oxygen RunTime RunPulse;
run;

The NIMPUTE=PCTMISSING option uses the percentage of the incomplete cases as the number of imputations.

The "Model Information" table in Output 75.13.1 describes the method and options used in the multiple imputation process.

Output 75.13.1: Model Information

The MI Procedure

Model Information
Data Set                           WORK.FITNESS1
Method                             MCMC
Multiple Imputation Chain          Multiple Chains
Initial Estimates for MCMC         EM Posterior Mode
Start                              Starting Value
Prior                              Jeffreys
Number of Imputations              33
Number of Burn-in Iterations       200
Seed for random number generator   32937921



The "Missing Data Patterns" table in Output 75.13.2 lists distinct missing data patterns with corresponding statistics for the Fitness1 data. Note that the values of Oxygen shown in the tables are transformed values.

Output 75.13.2: Missing Data Patterns

Missing Data Patterns
                                                 -----------Group Means------------
Group  Oxygen  RunTime  RunPulse  Freq  Percent      Oxygen     RunTime    RunPulse
    1  X       X        X           21    67.74    3.829760   10.809524  171.666667
    2  X       X        .            4    12.90    3.851813   10.137500           .
    3  X       .        .            3     9.68    3.955298           .           .
    4  .       X        X            1     3.23           .   11.950000  176.000000
    5  .       X        .            2     6.45           .    9.885000           .
Transformed Variables: Oxygen



With the NIMPUTE=PCTMISSING option, the number of imputations is the percentage of incomplete cases, rounded up to the next integer. Here 10 of the 31 cases are incomplete (10/31 = 32.3%), so 33 imputations are generated.
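
As a worked check of this count, the frequencies of the incomplete patterns (groups 2 through 5 in Output 75.13.2) can be summed and converted to a percentage. The symbol $m$ for the number of imputations is notation introduced here for illustration; it does not appear in the procedure output.

$$
m = \left\lceil 100 \times \frac{4 + 3 + 1 + 2}{31} \right\rceil = \lceil 32.26 \rceil = 33
$$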

The "Variable Transformations" table in Output 75.13.3 lists the variables that have been transformed.

Output 75.13.3: Variable Transformations

Variable Transformations
Variable _Transform_
Oxygen LOG



The "Initial Parameter Estimates for MCMC" table in Output 75.13.4 displays the starting mean and covariance estimates used in the MCMC method.

Output 75.13.4: Initial Parameter Estimates

Initial Parameter Estimates for MCMC
_TYPE_  _NAME_        Oxygen     RunTime    RunPulse
MEAN                3.846122   10.557605  171.382949
COV     Oxygen      0.010827   -0.120891   -0.328772
COV     RunTime    -0.120891    1.744580    3.011180
COV     RunPulse   -0.328772    3.011180   82.747609
Transformed Variables: Oxygen



Output 75.13.5 displays variance information from the multiple imputation.

Output 75.13.5: Variance Information

Variance Information (33 Imputations)
                                                        Relative     Fraction
            -----------Variance------------             Increase      Missing    Relative
  Variable      Between    Within     Total      DF  in Variance  Information  Efficiency
* Oxygen    0.000008582  0.000407  0.000416  27.572     0.021732     0.021297    0.999355
  RunTime      0.002396  0.068237  0.070706   27.17     0.036179     0.034989    0.998941
  RunPulse     0.862937  3.230086  4.119173   21.41     0.275252     0.218114    0.993434
* Transformed Variables
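
The columns of this table are linked by the standard combining rules for multiple imputation. As a hedged illustration, the RunPulse row can be reproduced from its between-imputation variance $B$, within-imputation variance $W$, fraction of missing information $\lambda$, and the number of imputations $m = 33$. The symbols $B$, $W$, $T$, $r$, $\lambda$, and $m$ are notation introduced here, and the inputs are the rounded values shown in the table, so agreement is only up to rounding.

$$
\begin{aligned}
T &= W + \left(1 + \tfrac{1}{m}\right) B = 3.230086 + \tfrac{34}{33}\,(0.862937) \approx 4.119173 \\
r &= \frac{\left(1 + \tfrac{1}{m}\right) B}{W} = \frac{\tfrac{34}{33}\,(0.862937)}{3.230086} \approx 0.275252 \\
\mathrm{RE} &= \left(1 + \frac{\lambda}{m}\right)^{-1} = \left(1 + \frac{0.218114}{33}\right)^{-1} \approx 0.993434
\end{aligned}
$$

Here $T$ is the total variance, $r$ is the relative increase in variance, and $\mathrm{RE}$ is the relative efficiency. With 33 imputations, the relative efficiency stays above 0.99 even for RunPulse, the variable with the largest fraction of missing information.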



Output 75.13.6 displays parameter estimates from the multiple imputation. Note that the value of $\mu_0$ specified for Oxygen, 50, has also been transformed using the logarithmic transformation, to $\log(50) = 3.912023$.

Output 75.13.6: Parameter Estimates

Parameter Estimates (33 Imputations)
                                                                                                       t for H0:
  Variable        Mean  Std Error   95% Confidence Limits      DF     Minimum     Maximum         Mu0   Mean=Mu0  Pr > |t|
* Oxygen      3.846347   0.020388      3.8046      3.8881  27.572    3.838599    3.851483    3.912023      -3.22    0.0033
  RunTime    10.543129   0.265905      9.9977     11.0886   27.17   10.452018   10.633302   10.000000       2.04    0.0509
  RunPulse  171.648705   2.029575    167.4329    175.8645   21.41  169.858210  173.316307  180.000000      -4.11    0.0005
* Transformed Variables
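
The Oxygen row of this table can be checked by hand. The following DATA step is a minimal sketch, not part of the original example: it copies the rounded mean, standard error, and degrees of freedom from Output 75.13.6 and recomputes the transformed Mu0, the t statistic, the two-sided p-value, and the 95% confidence limits. The variable names (mean, se, df, tcrit, and so on) are chosen here for illustration, and the results should agree with the table only up to rounding.

```sas
/* Sketch: recompute the Oxygen row of the "Parameter Estimates" table. */
/* The inputs are copied by hand from Output 75.13.6 (rounded values).  */
data _null_;
   mean = 3.846347;          /* combined mean of log(Oxygen)            */
   se   = 0.020388;          /* combined standard error                 */
   df   = 27.572;            /* degrees of freedom from the table       */
   mu0  = log(50);           /* MU0= value for Oxygen on the log scale  */

   t     = (mean - mu0) / se;             /* t for H0: Mean=Mu0         */
   p     = 2 * (1 - probt(abs(t), df));   /* two-sided p-value          */
   tcrit = tinv(0.975, df);               /* 97.5th percentile of t     */
   lower = mean - tcrit * se;             /* lower 95% confidence limit */
   upper = mean + tcrit * se;             /* upper 95% confidence limit */

   put mu0= 8.6 t= 6.2 p= 6.4 lower= 7.4 upper= 7.4;
run;
```

The PROBT and TINV functions accept noninteger degrees of freedom, so the fractional DF value from the table can be used directly.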



The following statements list the first 10 observations of the data set Outex13 in Output 75.13.7. Note that the values for Oxygen are in the original scale.

proc print data=outex13(obs=10);
   title 'First 10 Observations of the Imputed Data Set';
run;

Output 75.13.7: Imputed Data Set in Original Scale

First 10 Observations of the Imputed Data Set

Obs _Imputation_ Oxygen RunTime RunPulse
1 1 44.6090 11.3700 178.000
2 1 45.3130 10.0700 185.000
3 1 54.2970 8.6500 156.000
4 1 59.5710 7.1440 167.012
5 1 49.8740 9.2200 170.092
6 1 44.8110 11.6300 176.000
7 1 38.5834 11.9500 176.000
8 1 43.7376 10.8500 158.851
9 1 39.4420 13.0800 174.000
10 1 60.0550 8.6300 170.000
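
To relate the original-scale values in the output data set to the transformed-scale tables shown earlier, the imputed Oxygen values can be put back on the log scale. The following statements are a minimal sketch under the assumption that Outex13 was created by the PROC MI step above; the data set name Check13 and the variable name LogOxygen are hypothetical names used only for this check.

```sas
/* Sketch: return the imputed Oxygen values to the log scale and      */
/* average over all imputations. Check13 and LogOxygen are            */
/* hypothetical names introduced for this check.                      */
data check13;
   set outex13;
   LogOxygen = log(Oxygen);
run;

proc means data=check13 n mean;
   var LogOxygen RunTime RunPulse;
run;
```

Because every imputed data set contains the same 31 observations, the overall mean of each variable across all imputations equals the average of the per-imputation means. The means reported by PROC MEANS should therefore agree, up to rounding, with the Mean column in Output 75.13.6.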



Note that the results in Output 75.13.7 can also be produced from the following statements without using a TRANSFORM statement. A transformed value of log(50)=3.91202 is used in the MU0= option.

data temp;
   set Fitness1;
   LogOxygen= log(Oxygen);
run;
proc mi data=temp seed=32937921 nimpute=pctmissing
        mu0=3.91202 10 180 out=outtemp;
   mcmc chain=multiple displayinit;
   var LogOxygen RunTime RunPulse;
run;
data outex13;
   set outtemp;
   Oxygen= exp(LogOxygen);
run;