The MIANALYZE Procedure

Getting Started

The Fitness data set has been altered to contain an arbitrary pattern of missing values:
  
*----------------- Data on Physical Fitness -----------------*
| These measurements were made on men involved in a physical |
| fitness course at N.C. State University.                   |
| Only selected variables of                                 |
| Oxygen (oxygen intake, ml per kg body weight per minute),  |
| Runtime (time to run 1.5 miles in minutes), and            |
| RunPulse (heart rate while running) are used.              |
| Certain values were changed to missing for the analysis.   |
*------------------------------------------------------------*;

  
   data FitMiss;
      input Oxygen RunTime RunPulse @@;
      datalines;
   44.609  11.37  178     45.313  10.07  185  
   54.297   8.65  156     59.571    .      .  
   49.874   9.22    .     44.811  11.63  176  
     .     11.95  176     49.091  10.85    .  
   39.442  13.08  174     60.055   8.63  170  
   50.541    .      .     37.388  14.03  186  
   44.754  11.12  176     47.273    .      .  
   51.855  10.33  166     49.156   8.95  180  
   40.836  10.95  168     46.672  10.00    .  
     .     10.25    .     50.388  10.08  168  
   39.407  12.63  174     46.080  11.17  156  
   45.441   9.63  164       .      8.92  146  
   45.118  11.08    .     39.203  12.88  168  
   45.790  10.47  186     50.545   9.93  148  
   48.673   9.40  186     47.920  11.50  170  
   47.467  10.50  170  
   ;
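
Before imputing, it can help to confirm how many values are missing in each variable. The following is a minimal sketch, not part of the original example, and it assumes the FitMiss data set created above. The N and NMISS statistic keywords of PROC MEANS report the nonmissing and missing counts:

```sas
   /* Count nonmissing (N) and missing (NMISS) values per variable */
   proc means data=FitMiss n nmiss;
      var Oxygen RunTime RunPulse;
   run;
```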

Assume that the data are multivariate normally distributed and that the missing data are missing at random (see the "Statistical Assumptions for Multiple Imputation" section in "The MI Procedure" chapter for a description of these assumptions). The following statements use the MI procedure to impute missing values for the FitMiss data set.

   proc mi data=FitMiss noprint out=outmi seed=37851;   
      var Oxygen RunTime RunPulse; 
   run;

The MI procedure creates the imputed data sets and stores them together in the outmi data set. The variable _Imputation_ identifies the imputation number of each observation. From m imputations, m different sets of point and variance estimates can be computed for a parameter. This example uses the default, m=5.
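
One way to check the structure of outmi is to summarize it by imputation. The following is a sketch rather than part of the original example. It assumes that outmi was created by the PROC MI step above and is still ordered by _Imputation_, which BY-group processing requires:

```sas
   /* Summarize each imputed data set separately */
   proc means data=outmi n nmiss mean;
      by _Imputation_;
      var Oxygen RunTime RunPulse;
   run;
```

Each BY group should report no missing values for the three variables, because every missing value has been replaced by an imputed one.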

The following statements generate regression coefficients for each of the five imputed data sets:

   proc reg data=outmi outest=outreg covout noprint;
      model Oxygen= RunTime RunPulse;
      by _Imputation_;
   run; 

   proc print data=outreg(obs=8);
      var _Imputation_ _Type_ _Name_
         Intercept RunTime RunPulse;
      title 'Parameter Estimates from Imputed Data Sets';
   run;

 
Parameter Estimates from Imputed Data Sets

Obs  _Imputation_  _TYPE_  _NAME_     Intercept   RunTime  RunPulse
  1             1  PARMS                97.2874  -2.98892  -0.10684
  2             1  COV     Intercept    55.7516  -0.73348  -0.27870
  3             1  COV     RunTime      -0.7335   0.15167  -0.00509
  4             1  COV     RunPulse     -0.2787  -0.00509   0.00194
  5             2  PARMS                90.9324  -2.93338  -0.07391
  6             2  COV     Intercept    37.5576  -0.25970  -0.20442
  7             2  COV     RunTime      -0.2597   0.13978  -0.00722
  8             2  COV     RunPulse     -0.2044  -0.00722   0.00166
Figure 10.1: Parameter Estimates
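
The combining step can be previewed by hand. As a sketch (not part of the original example), the following statements summarize the PARMS rows of outreg, one row per imputation. Under the standard combining rules, the mean of the per-imputation estimates is the combined point estimate, and their sample variance (divisor m-1, the PROC MEANS default) is the between-imputation variance:

```sas
   /* Summarize the m=5 coefficient vectors; the COV rows are excluded */
   proc means data=outreg mean var min max;
      where _Type_ = 'PARMS';
      var Intercept RunTime RunPulse;
   run;
```

The results can be compared with the Estimate, Between, Minimum, and Maximum values that PROC MIANALYZE reports for the same parameters.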

The following statements combine the five sets of regression coefficients:

   proc mianalyze data=outreg;
      var Intercept RunTime RunPulse;
   run;

 
The MIANALYZE Procedure

Model Information

Data Set                 WORK.OUTREG
Number of Imputations    5

Multiple Imputation Variance Information

           -------------Variance-------------             Relative     Fraction
                                                          Increase      Missing
Parameter     Between      Within       Total      DF  in Variance  Information
Intercept   74.179857   57.287519  146.303348  10.805     1.553843     0.665161
RunTime      0.034202    0.142151    0.183193  79.694     0.288719     0.242803
RunPulse     0.001533    0.002304    0.004144  20.292     0.798522     0.491731
Figure 10.2: Model Information and Variance Information Tables

The "Model Information" table lists the input data set(s) and the number of imputations. The "Multiple Imputation Variance Information" table displays the between-imputation, within-imputation, and total variances for combining complete-data inferences. It also displays the degrees of freedom for the total variance, the relative increase in variance due to missing values, and the fraction of missing information for each parameter estimate.
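
The columns of the variance table are linked by the standard multiple imputation combining rules. The DATA step below is a sketch that recomputes the Intercept row from its between-imputation (B) and within-imputation (W) variances as printed in Figure 10.2. The variable names are illustrative, and the formulas assume no small-sample adjustment to the degrees of freedom:

```sas
   data _null_;
      m = 5;             /* number of imputations                */
      B = 74.179857;     /* between-imputation variance          */
      W = 57.287519;     /* within-imputation variance           */

      T      = W + (1 + 1/m)*B;             /* total variance                  */
      r      = (1 + 1/m)*B / W;             /* relative increase in variance   */
      df     = (m - 1)*(1 + 1/r)**2;        /* degrees of freedom for T        */
      lambda = (r + 2/(df + 3)) / (r + 1);  /* fraction of missing information */

      put T= r= df= lambda=;                /* write the results to the log    */
   run;
```

Up to rounding of the printed inputs, the values written to the log should agree with the Total, DF, Relative Increase in Variance, and Fraction Missing Information entries for Intercept.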

 
The MIANALYZE Procedure

Multiple Imputation Parameter Estimates

Parameter    Estimate   Std Error   95% Confidence Limits      DF
Intercept   92.156840   12.095592    65.47596    118.8377  10.805
RunTime     -2.955317    0.428011    -3.80714     -2.1035  79.694
RunPulse    -0.079851    0.064376    -0.21401      0.0543  20.292

                                                  t for H0:
Parameter     Minimum     Maximum  Theta0  Parameter=Theta0  Pr > |t|
Intercept   77.939497   99.920480       0              7.62    <.0001
RunTime     -3.159663   -2.660085       0             -6.90    <.0001
RunPulse    -0.112277   -0.015111       0             -1.24    0.2290
Figure 10.3: Multiple Imputation Parameter Estimates

The "Multiple Imputation Parameter Estimates" table displays a combined estimate and standard error for each regression coefficient (parameter). Inferences are based on t distributions. The table displays a 95% confidence interval and a t-test with the associated p-value for the hypothesis that the parameter is equal to the value specified with the THETA0= option (in this case, zero by default). The minimum and maximum parameter estimates from the imputed data sets are also displayed.
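
The t-based inference can be reproduced from the combined estimate, the total variance, and the degrees of freedom. The DATA step below is a sketch for the Intercept row, using the values printed in Figures 10.2 and 10.3. The variable names are illustrative; TINV and PROBT are the SAS quantile and distribution functions for the t distribution:

```sas
   data _null_;
      est    = 92.156840;    /* combined estimate                    */
      totvar = 146.303348;   /* total variance                       */
      df     = 10.805;       /* degrees of freedom                   */
      theta0 = 0;            /* null value (the THETA0= default)     */
      alpha  = 0.05;         /* for a 95% confidence interval        */

      se     = sqrt(totvar);              /* standard error           */
      tcrit  = tinv(1 - alpha/2, df);     /* two-sided critical value */
      lower  = est - tcrit*se;            /* lower confidence limit   */
      upper  = est + tcrit*se;            /* upper confidence limit   */
      tstat  = (est - theta0)/se;         /* t statistic for H0       */
      pvalue = 2*(1 - probt(abs(tstat), df));

      put se= lower= upper= tstat= pvalue=;
   run;
```

Up to rounding of the printed inputs, the values written to the log should agree closely with the Intercept row of the "Multiple Imputation Parameter Estimates" table.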


Copyright © 2001 by SAS Institute Inc., Cary, NC, USA. All rights reserved.