The MI Procedure

Example 56.7 FCS Method for CLASS Variables

This example uses fully conditional specification (FCS) methods to impute missing values in both continuous and CLASS variables in a data set with an arbitrary missing pattern. The following statements invoke the MI procedure and impute missing values for the Fish3 data set:

proc mi data=Fish3 seed=1305417 out=outex7;
   class Species;
   fcs nbiter=5 discrim(Species/details) reg(Height/details);
   var Species Length Height Width;
run;

The DISCRIM option uses the discriminant function method to impute the classification variable Species, and the REG option uses the regression method to impute the continuous variable Height. The other continuous variables, Length and Width, have no method specified, so they are imputed by the regression method by default.
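The default methods can also be written out explicitly. The following statements are a sketch only: they assume that giving Length and Width their own REG options requests the same regression method that the procedure applies to them by default, and the output data set name outex7b is hypothetical.

```sas
proc mi data=Fish3 seed=1305417 out=outex7b;  /* outex7b: hypothetical name */
   class Species;
   /* Same methods as in the original step; the regression method for
      Length and Width is spelled out instead of left to the default */
   fcs nbiter=5 discrim(Species/details)
                reg(Height/details)
                reg(Length)
                reg(Width);
   var Species Length Height Width;
run;
```

Writing the methods out makes the imputation model easier to audit, at the cost of a longer FCS statement.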

The "Model Information" table in Output 56.7.1 describes the method and options used in the multiple imputation process.

Output 56.7.1 Model Information
The MI Procedure

Model Information
Data Set                           WORK.FISH3
Method                             FCS
Number of Imputations              5
Number of Burn-in Iterations       5
Seed for random number generator   1305417

The "FCS Model Specification" table in Output 56.7.2 lists the imputation method used for each variable in the imputation model. The procedure uses the discriminant function method to impute the variable Species and the regression method to impute the continuous variables Length, Height, and Width.

Output 56.7.2 FCS Model Specification
FCS Model Specification
Method                  Imputed Variables
Regression              Length Height Width
Discriminant Function   Species

The "Missing Data Patterns" table in Output 56.7.3 lists the distinct missing data patterns with their frequencies and percentages. An "X" means that the variable is observed in the group, and a "." means that it is missing. With the default ORDER=FREQ option, the procedure orders the variables by descending frequency counts when it imputes missing values in the filled-in and imputation phases.

Output 56.7.3 Missing Data Patterns
Missing Data Patterns
                                                      --------- Group Means ---------
Group  Length  Height  Width  Species  Freq  Percent     Length     Height      Width
    1  X       X       X      X          38    73.08  41.515789  12.531526   5.266474
    2  X       X       X      .           3     5.77  38.433333  11.797667   4.587667
    3  X       X       .      .           3     5.77  45.033333  13.647667          .
    4  X       .       X      .           2     3.85  36.100000          .   5.135000
    5  X       .       .      .           2     3.85  40.150000          .          .
    6  .       X       X      X           2     3.85          .  14.448000   6.886000
    7  .       X       .      X           1     1.92          .  18.037000          .
    8  .       X       .      .           1     1.92          .  12.444000          .

Because the DETAILS option is specified for the variables Species and Height, the parameters used in each imputation for these two variables are displayed: for Species in the "Group Means for FCS Discriminant Method" table in Output 56.7.4, and for Height in the "Regression Models for FCS Method" table in Output 56.7.5.

Output 56.7.4 FCS Discrim Model for Species
Group Means for FCS Discriminant Method
                   -------------------- Imputation ---------------------
Species  Variable          1          2          3          4          5
Bream    Length    -0.020460  -0.375046  -0.455147  -0.227513  -0.149084
Bream    Height     0.693833   0.623187   0.744749   0.580846   0.714942
Bream    Width      0.397506   0.173774   0.421867   0.167947   0.300103
Pike     Length     0.845745   1.304043   0.708257   1.063104   0.382590
Pike     Height    -1.357333  -1.140244  -1.367343  -1.269584  -1.342550
Pike     Width     -0.341246   0.193092  -0.517978  -0.366050  -0.438790

Output 56.7.5 FCS Regression Model for Height
Regression Models for FCS Method
Imputed                       -------------------- Imputation ---------------------
Variable  Effect     Species          1          2          3          4          5
Height    Intercept           -0.341941  -0.366473  -0.315587  -0.361090  -0.324455
Height    Length               0.119780   0.126889   0.011333   0.137968   0.117460
Height    Width                0.350410   0.310695   0.441925   0.345254   0.317621
Height    Species    Bream     0.987346   1.008808   0.851794   0.999192   0.999200

The following statements list the first 10 observations of the data set outex7 in Output 56.7.6:

proc print data=outex7(obs=10);
   title 'First 10 Observations of the Imputed Data Set';
run;

Output 56.7.6 Imputed Data Set
First 10 Observations of the Imputed Data Set

Obs  _Imputation_  Species   Length   Height    Width
  1             1  Bream    30.0000  11.5200  4.02000
  2             1  Bream    31.2000  12.4800  4.30600
  3             1  Bream    31.1000  12.3780  4.69600
  4             1  Bream    33.5000  12.7300  4.45600
  5             1  Bream    31.2895  12.4440  4.05416
  6             1  Bream    34.7000  13.6020  4.92700
  7             1  Bream    34.5000  14.1800  5.27900
  8             1  Bream    35.0000  13.2992  4.69000
  9             1  Bream    35.1000  14.0050  4.84400
 10             1  Bream    36.2000  14.2270  4.95900
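
The data set outex7 stacks the imputed data sets and indexes them with the variable _Imputation_, so per-imputation summaries can be requested directly. The following statements are a minimal sketch, assuming that outex7 was created by the PROC MI step shown earlier:

```sas
/* Mean of each continuous variable within each imputed data set */
proc means data=outex7 n mean;
   class _Imputation_;
   var Length Height Width;
run;
```

Averaging the five per-imputation means for a variable gives the combined point estimate for its mean.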

After the five imputations requested by default are completed, the "Variance Information" table in Output 56.7.7 displays the between-imputation variance, within-imputation variance, and total variance for combining complete-data inferences for the continuous variables. The table also displays the relative increase in variance due to missingness, the fraction of missing information, and the relative efficiency for each variable. These statistics are described in the section Combining Inferences from Multiply Imputed Data Sets.

Output 56.7.7 Variance Information
Variance Information
          ---------- Variance -----------              Relative     Fraction
                                                       Increase      Missing    Relative
Variable    Between     Within      Total       DF  in Variance  Information  Efficiency
Length     0.158766   1.287899   1.478418    36.33     0.147930     0.136011    0.973518
Height     0.007807   0.310949   0.320317   47.194     0.030127     0.029661    0.994103
Width      0.002160   0.016085   0.018677   35.138     0.161157     0.146966    0.971446
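
As a check on the Length row, the combining rules can be applied by hand. The following DATA step is a sketch: it copies the between-imputation and within-imputation variances from the table, assumes m = 5 imputations, and uses illustrative variable names. The fraction of missing information is computed from the unadjusted degrees of freedom, (m-1)(1+1/r)**2. That choice is an assumption made here because it reproduces the displayed value. The DF column, which reflects a small-sample adjustment, is not recomputed.

```sas
data _null_;
   m = 5;              /* number of imputations */
   B = 0.158766;       /* between-imputation variance (Length) */
   W = 1.287899;       /* within-imputation variance (Length)  */

   T      = W + (1 + 1/m)*B;                /* total variance                */
   r      = (1 + 1/m)*B / W;                /* relative increase in variance */
   vm     = (m - 1)*(1 + 1/r)**2;           /* unadjusted degrees of freedom */
   lambda = (r + 2/(vm + 3)) / (r + 1);     /* fraction missing information  */
   re     = 1 / (1 + lambda/m);             /* relative efficiency           */

   put T= 8.6 r= 8.6 lambda= 8.6 re= 8.6;
run;
```

Because the inputs are themselves rounded to six decimals, the values written to the log should agree with the Length row only up to rounding in the last digit.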

The "Parameter Estimates" table in Output 56.7.8 displays, for each continuous variable, the combined mean with its standard error and 95% confidence limits, together with a t statistic and its associated p-value for the hypothesis that the mean equals the value of the MU0= option (0 by default).

Output 56.7.8 Parameter Estimates
Parameter Estimates
                                                                                          t for H0:
Variable       Mean  Std Error  95% Confidence Limits      DF    Minimum    Maximum  Mu0   Mean=Mu0  Pr > |t|
Length    41.858477   1.215902    39.39329   44.32366   36.33  41.511771  42.316960    0      34.43    <.0001
Height    12.724307   0.565966    11.58585   13.86276  47.194  12.622320  12.811756    0      22.48    <.0001
Width      5.344556   0.136663     5.06715    5.62196  35.138   5.290049   5.393757    0      39.11    <.0001
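
The Length row can be traced back to the total variance in Output 56.7.7. The following DATA step is a sketch that copies the mean, total variance, and degrees of freedom from the two tables; the variable names are illustrative, and the confidence limits are assumed to be the usual t-based limits at the 0.05 level.

```sas
data _null_;
   mean  = 41.858477;   /* combined mean of Length (Output 56.7.8)   */
   total = 1.478418;    /* total variance of Length (Output 56.7.7)  */
   df    = 36.33;       /* degrees of freedom reported by PROC MI    */
   mu0   = 0;           /* null value, the MU0= default              */
   alpha = 0.05;        /* level for 95% confidence limits           */

   se    = sqrt(total);                     /* standard error         */
   t     = (mean - mu0) / se;               /* t statistic for H0     */
   tcrit = quantile('T', 1 - alpha/2, df);  /* two-sided critical t   */
   lower = mean - tcrit*se;
   upper = mean + tcrit*se;
   p     = 2*(1 - probt(abs(t), df));       /* two-sided p-value      */

   put se= t= lower= upper= p=;
run;
```

The standard error is the square root of the total variance, so the values written to the log should match the Length row up to rounding of the copied inputs.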