The MI Procedure

Example 75.6 FCS Methods for Continuous Variables

This example uses FCS regression methods to impute values for all continuous variables in a data set that has an arbitrary missing pattern.

The following statements invoke the MI procedure and impute missing values for the Fitness1 data set:

proc mi data=Fitness1 seed=1213 nimpute=pctmissing(min=5 max=20)
        mu0=50 10 180 out=outex6;
   fcs nbiter=20 reg(Oxygen/details);
   var Oxygen RunTime RunPulse;
run;

The NIMPUTE=PCTMISSING option uses the percentage of the incomplete cases as the number of imputations. The MIN=5 (which is the default) and MAX=20 options restrict the number of imputations to be in the range of 5 to 20. That is, 5 imputations are generated if the percentage of the incomplete cases is less than 5, and 20 imputations are generated if this percentage is greater than 20.

The FCS statement requests multivariate imputations by FCS methods, and the NBITER=20 option (which is the default) specifies the number of burn-in iterations before each imputation.

The "Model Information" table in Output 75.6.1 describes the method and options used in the multiple imputation process.

Output 75.6.1: Model Information

The MI Procedure

Model Information
Data Set                            WORK.FITNESS1
Method                              FCS
Number of Imputations               20
Number of Burn-in Iterations        20
Seed for random number generator    1213



The "FCS Model Specification" table in Output 75.6.2 describes methods and imputed variables in the imputation model. With the REG(OXYGEN) option in the FCS statement, the procedure uses the regression method to impute variable Oxygen. By default, the regression method is also used to impute variables RunTime and RunPulse.

Output 75.6.2: FCS Model Specification

FCS Model Specification
Method        Imputed Variables
Regression    Oxygen RunTime RunPulse



The "Missing Data Patterns" table in Output 75.6.3 lists distinct missing data patterns with corresponding frequencies and percentages.

Output 75.6.3: Missing Data Patterns

Missing Data Patterns
                                                 -----------Group Means------------
Group  Oxygen  RunTime  RunPulse  Freq  Percent      Oxygen     RunTime    RunPulse
    1  X       X        X           21    67.74   46.353810   10.809524  171.666667
    2  X       X        .            4    12.90   47.109500   10.137500           .
    3  X       .        .            3     9.68   52.461667           .           .
    4  .       X        X            1     3.23           .   11.950000  176.000000
    5  .       X        .            2     6.45           .    9.885000           .



With the NIMPUTE=PCTMISSING option, the number of imputations is based on the percentage of incomplete cases, 10/31 = 32.3%. Rounded up, this gives 33, which is greater than 20 (as specified in the MAX= option), so only 20 imputations are generated.
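The arithmetic behind this choice can be sketched in a short DATA step. This is an illustration only, not part of the original example: the variable names are hypothetical, and the counts are read by hand from Output 75.6.3 (groups 2 through 5 are the incomplete patterns).

```sas
/* Sketch only: mimic the NIMPUTE=PCTMISSING(MIN=5 MAX=20) rule by hand. */
/* All variable names here are hypothetical.                             */
data _null_;
   n_incomplete = 4 + 3 + 1 + 2;        /* Freq of groups 2-5 in Output 75.6.3 */
   n_total      = 31;                   /* all observations (21 complete + 10) */
   pct     = 100 * n_incomplete / n_total;   /* percentage of incomplete cases */
   nimpute = min(max(ceil(pct), 5), 20);     /* round up, then apply MIN and MAX */
   put pct=6.2 nimpute=;
run;
```

Because the rounded-up percentage (33) exceeds the MAX=20 limit, the rule yields 20 imputations, which matches the "Number of Imputations" entry in Output 75.6.1.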

When you specify the DETAILS option in REG(OXYGEN/DETAILS), the parameters that are used in each imputation for Oxygen are displayed in Output 75.6.4.

Output 75.6.4: FCS Regression Model for Oxygen

Regression Models for FCS Method
Imputed Variable: Oxygen

            -------------Effect-------------
Imputation  Intercept    RunTime   RunPulse
         1  -0.132359  -0.908663  -0.134745
         2   0.093555  -0.753423   0.052640
         3   0.078587  -1.125549  -0.135864
         4   0.063256  -0.634844  -0.158692
         5  -0.073869  -0.569809  -0.319878
         6  -0.070292  -0.797221  -0.277367
         7  -0.242377  -0.498457  -0.510742
         8  -0.176468  -0.922488  -0.035716
         9  -0.105706  -0.790878  -0.169551
        10  -0.100698  -0.748476  -0.086702
        11  -0.046309  -0.833819  -0.158535
        12  -0.267186  -0.745716  -0.006667
        13   0.074579  -0.612349  -0.175998
        14  -0.121640  -0.747333   0.030089
        15  -0.041454  -0.744806  -0.135503
        16  -0.085027  -0.864134  -0.120457
        17  -0.025642  -0.720153  -0.213634
        18  -0.016753  -0.615441  -0.065866
        19  -0.148376  -0.658032  -0.227149
        20  -0.165002  -0.774503  -0.041462



The following statements list the first 10 observations of the data set Outex6 in Output 75.6.5. Note that all missing values of all variables are imputed.

proc print data=outex6(obs=10);
   title 'First 10 Observations of the Imputed Data Set';
run;

Output 75.6.5: Imputed Data Set

First 10 Observations of the Imputed Data Set

Obs _Imputation_ Oxygen RunTime RunPulse
1 1 44.6090 11.3700 178.000
2 1 45.3130 10.0700 185.000
3 1 54.2970 8.6500 156.000
4 1 59.5710 10.1985 185.842
5 1 49.8740 9.2200 173.379
6 1 44.8110 11.6300 176.000
7 1 44.6299 11.9500 176.000
8 1 47.4258 10.8500 183.926
9 1 39.4420 13.0800 174.000
10 1 60.0550 8.6300 170.000
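One way to confirm that no missing values remain in any imputation is to count them per imputed data set. The following is a minimal sketch rather than part of the original example; it assumes that the data set Outex6 was created by the PROC MI step shown earlier, and the title text is arbitrary.

```sas
/* Sketch: count nonmissing (N) and missing (NMISS) values in each imputation. */
proc means data=outex6 n nmiss;
   class _Imputation_;               /* one summary row group per imputation */
   var Oxygen RunTime RunPulse;
   title 'Missing Value Counts by Imputation';
run;
```

If every missing value was imputed, the NMISS column should be 0 for all three variables in every imputation.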



After the completion of the 20 imputations, the "Variance Information" table in Output 75.6.6 displays the between-imputation variance, within-imputation variance, and total variance for combining complete-data inferences. The relative increase in variance due to missingness, the fraction of missing information, and the relative efficiency for each variable are also displayed. These statistics are described in the section Combining Inferences from Multiply Imputed Data Sets.

Output 75.6.6: Variance Information

Variance Information (20 Imputations)
          -----------Variance-----------             Relative     Fraction
                                                     Increase      Missing    Relative
Variable   Between     Within      Total      DF  in Variance  Information  Efficiency
Oxygen    0.026152   0.927386   0.954846  27.339     0.029610     0.028843    0.998560
RunTime   0.002608   0.067218   0.069957   27.02     0.040736     0.039296    0.998039
RunPulse  2.430764   3.542847   6.095149   14.23     0.720410     0.429183    0.978992
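The derived columns in Output 75.6.6 can be recomputed from the between-imputation variance B and the within-imputation variance W by using the standard combining rules for m imputations: total variance T = W + (1 + 1/m)B, relative increase r = (1 + 1/m)B/W, fraction of missing information (r + 2/(v + 3))/(r + 1), and relative efficiency 1/(1 + fraction/m). The DATA step below is a sketch under stated assumptions: the data set name VarCheck and all variable names are hypothetical, the input values are copied from the table, and v is taken to be the unadjusted degrees of freedom (m - 1)(1 + 1/r)**2 rather than the adjusted DF that the table prints.

```sas
/* Sketch: recompute Total, Relative Increase, Fraction Missing Information, */
/* and Relative Efficiency from Between and Within (values from Output       */
/* 75.6.6). Data set and variable names are hypothetical.                    */
data VarCheck;
   input Variable :$8. Between Within;
   m        = 20;                                    /* number of imputations */
   Total    = Within + (1 + 1/m) * Between;          /* total variance        */
   RelInc   = (1 + 1/m) * Between / Within;          /* relative increase     */
   df_m     = (m - 1) * (1 + 1/RelInc)**2;           /* unadjusted DF (assumed) */
   FracMiss = (RelInc + 2/(df_m + 3)) / (RelInc + 1);
   RelEff   = 1 / (1 + FracMiss/m);
   datalines;
Oxygen   0.026152 0.927386
RunTime  0.002608 0.067218
RunPulse 2.430764 3.542847
;
run;

proc print data=VarCheck noobs;
   var Variable Total RelInc FracMiss RelEff;
   format Total RelInc FracMiss RelEff 10.6;
run;
```

For Oxygen, for example, the total variance works out to 0.927386 + 1.05 × 0.026152 ≈ 0.954846, which agrees with the Total column.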



The "Parameter Estimates" table in Output 75.6.7 displays a 95% confidence interval for the mean and a t statistic with its associated p-value for each of the hypotheses requested with the MU0= option.

Output 75.6.7: Parameter Estimates

Parameter Estimates (20 Imputations)
Variable        Mean  Std Error  95% Confidence Limits      DF
Oxygen     47.080908   0.977162    45.0771    49.0847   27.339
RunTime    10.564752   0.264493    10.0221    11.1074    27.02
RunPulse  171.412215   2.468836   166.1251   176.6993    14.23

                                                t for H0:
Variable     Minimum     Maximum         Mu0     Mean=Mu0  Pr > |t|
Oxygen     46.758087   47.512585   50.000000        -2.99    0.0059
RunTime    10.511826   10.706529   10.000000         2.14    0.0420
RunPulse  168.633931  174.275234  180.000000        -3.48    0.0036
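The t statistics, p-values, and confidence limits in Output 75.6.7 follow from the combined mean, its standard error, and the degrees of freedom. The following DATA step is a hedged sketch of that calculation, not part of the original example: the data set name TCheck and the variable names are hypothetical, and the input values are copied from the table. It uses the Base SAS functions PROBT and TINV, both of which accept noninteger degrees of freedom.

```sas
/* Sketch: recompute t, the two-sided p-value, and 95% confidence limits */
/* from the Mean, Std Error, DF, and Mu0 values in Output 75.6.7.        */
/* Data set and variable names are hypothetical.                         */
data TCheck;
   input Variable :$8. Mean StdErr DF Mu0;
   t     = (Mean - Mu0) / StdErr;             /* t for H0: Mean = Mu0       */
   p     = 2 * (1 - probt(abs(t), DF));       /* two-sided p-value          */
   tcrit = tinv(0.975, DF);                   /* critical value for 95% CI  */
   Lower = Mean - tcrit * StdErr;
   Upper = Mean + tcrit * StdErr;
   datalines;
Oxygen    47.080908 0.977162 27.339  50
RunTime   10.564752 0.264493 27.02   10
RunPulse 171.412215 2.468836 14.23  180
;
run;

proc print data=TCheck noobs;
   var Variable t p Lower Upper;
   format t 8.2 p 8.4 Lower Upper 10.4;
run;
```

For Oxygen, the statistic is (47.080908 - 50)/0.977162 ≈ -2.99, in agreement with the table.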