Example 9.1: EM Algorithm for MLE
This example uses the EM algorithm to compute the maximum likelihood estimates
for the parameters of a multivariate normal distribution
using data with missing values.
The following statements invoke the MI procedure
and request the EM algorithm to compute the MLE
for the parameters (μ, Σ), the mean vector and covariance matrix,
of a multivariate normal distribution
from the input data set FitMiss:
   proc mi data=FitMiss seed=55417 simple nimpute=0;
      em itprint outem=outem;
      var Oxygen RunTime RunPulse;
   run;
Note that when you specify the NIMPUTE=0 option, the missing values are not
imputed. The procedure generates the following output:
Output 9.1.1: Model Information
| Data Set                         | WORK.FITMISS      |
| Method                           | MCMC              |
| Multiple Imputation Chain        | Single Chain      |
| Initial Estimates for MCMC       | EM Posterior Mode |
| Start                            | Starting Value    |
| Prior                            | Jeffreys          |
| Number of Imputations            | 0                 |
| Number of Burn-in Iterations     | 200               |
| Number of Iterations             | 100               |
| Seed for random number generator | 55417             |
The "Model Information" table describes the method and
options used in the procedure.
Output 9.1.2: Missing Data Patterns
|       |        |         |          |      |         | Group Means                        |
| Group | Oxygen | RunTime | RunPulse | Freq | Percent | Oxygen    | RunTime   | RunPulse   |
| 1     | X      | X       | X        |   21 |   67.74 | 46.353810 | 10.809524 | 171.666667 |
| 2     | X      | X       | .        |    4 |   12.90 | 47.109500 | 10.137500 | .          |
| 3     | X      | .       | .        |    3 |    9.68 | 52.461667 | .         | .          |
| 4     | .      | X       | X        |    1 |    3.23 | .         | 11.950000 | 176.000000 |
| 5     | .      | X       | .        |    2 |    6.45 | .         |  9.885000 | .          |
The "Missing Data Patterns" table lists distinct missing
data patterns with corresponding frequencies and percents.
Here, "X" means that the variable is observed in the corresponding
group and "." means that the variable is missing.
The table also displays group-specific variable means.
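
If you want to reproduce the pattern counts outside the MI procedure, one possible approach is sketched below. It assumes only the FitMiss data set and the three analysis variables from this example; the data set name patterns and the variable name Pattern are hypothetical names introduced for illustration. The rows appear in the sort order of the pattern string, so they will not necessarily match the group numbering used by PROC MI.

```sas
/* Sketch: tag each observation with its missing data pattern.      */
/* 'X' = observed, '.' = missing, in the order                      */
/* Oxygen, RunTime, RunPulse. PATTERNS and PATTERN are illustrative */
/* names, not part of the original example.                         */
data patterns;
   set FitMiss;
   length Pattern $3;
   Pattern = cats(ifc(missing(Oxygen),   '.', 'X'),
                  ifc(missing(RunTime),  '.', 'X'),
                  ifc(missing(RunPulse), '.', 'X'));
run;

/* Frequency and percent of each distinct pattern */
proc freq data=patterns;
   tables Pattern;
run;

/* Means of the variables within each pattern */
proc means data=patterns mean;
   class Pattern;
   var Oxygen RunTime RunPulse;
run;
```

The counts from PROC FREQ should correspond to the Freq and Percent columns, and the class-level means from PROC MEANS to the Group Means columns.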
With the SIMPLE option, the procedure
displays simple descriptive univariate statistics for available cases
in the "Univariate Statistics" table
and correlations from pairwise available cases
in the "Pairwise Correlations" table.
Output 9.1.3: Univariate Statistics
| Variable |  N |      Mean |  Std Dev |   Minimum |   Maximum |
| Oxygen   | 28 |  47.11618 |  5.41305 |  37.38800 |  60.05500 |
| RunTime  | 28 |  10.68821 |  1.37988 |   8.63000 |  14.03000 |
| RunPulse | 22 | 171.86364 | 10.14324 | 148.00000 | 186.00000 |
Output 9.1.4: Pairwise Correlations
|          |       Oxygen |      RunTime |     RunPulse |
| Oxygen   |  1.000000000 | -0.849118562 | -0.343961742 |
| RunTime  | -0.849118562 |  1.000000000 |  0.247258191 |
| RunPulse | -0.343961742 |  0.247258191 |  1.000000000 |
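
As a cross-check on the statistics requested by the SIMPLE option, comparable available-case summaries can be obtained from general-purpose procedures. The following is a sketch, not a description of how PROC MI computes these tables internally. PROC MEANS uses all nonmissing values of each variable, and PROC CORR by default uses all observations that are nonmissing for each pair of variables (the NOMISS option would instead restrict it to complete cases).

```sas
/* Available-case univariate statistics for each variable */
proc means data=FitMiss n mean std min max;
   var Oxygen RunTime RunPulse;
run;

/* Pearson correlations from pairwise available cases       */
/* (default behavior when NOMISS is not specified)          */
proc corr data=FitMiss;
   var Oxygen RunTime RunPulse;
run;
```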
With the EM statement, the procedure displays
the initial parameter estimates for EM.
Output 9.1.5: Initial Parameter Estimates for EM
| _TYPE_ | _NAME_   |     Oxygen |   RunTime |   RunPulse |
| MEAN   |          |  47.116179 | 10.688214 | 171.863636 |
| COV    | Oxygen   |  29.301078 |         0 |          0 |
| COV    | RunTime  |          0 |  1.904067 |          0 |
| COV    | RunPulse |          0 |         0 | 102.885281 |
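
The initial estimates can be related to the "Univariate Statistics" table: the means are the available-case means, the diagonal of the covariance matrix holds the squared available-case standard deviations, and the off-diagonal elements start at zero. The short DATA step below is a sketch that checks the diagonal by squaring the Std Dev values copied from that table; the variable names are hypothetical. Because the printed standard deviations are rounded to five decimals, the results agree with the displayed variances only to roughly four or five significant digits.

```sas
/* Sketch: square the available-case standard deviations      */
/* (values copied from the "Univariate Statistics" table)     */
/* and write the results to the SAS log.                      */
data _null_;
   var_Oxygen   = 5.41305**2;    /* compare with 29.301078  */
   var_RunTime  = 1.37988**2;    /* compare with 1.904067   */
   var_RunPulse = 10.14324**2;   /* compare with 102.885281 */
   put var_Oxygen= var_RunTime= var_RunPulse=;
run;
```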
With the ITPRINT option, the "EM (MLE) Iteration History" table
displays the iteration history for the EM algorithm.
Output 9.1.6: EM (MLE) Iteration History
| _Iteration_ |   -2 Log L |    Oxygen |   RunTime |   RunPulse |
| 0           | 289.544782 | 47.116179 | 10.688214 | 171.863636 |
| 1           | 263.549489 | 47.116179 | 10.688214 | 171.863636 |
| 2           | 255.851312 | 47.139089 | 10.603506 | 171.538203 |
| 3           | 254.616428 | 47.122353 | 10.571685 | 171.426790 |
| 4           | 254.494971 | 47.111080 | 10.560585 | 171.398296 |
| 5           | 254.483973 | 47.106523 | 10.556768 | 171.389208 |
| 6           | 254.482920 | 47.104899 | 10.555485 | 171.385257 |
| 7           | 254.482813 | 47.104348 | 10.555062 | 171.383345 |
| 8           | 254.482801 | 47.104165 | 10.554923 | 171.382424 |
| 9           | 254.482800 | 47.104105 | 10.554878 | 171.381992 |
| 10          | 254.482800 | 47.104086 | 10.554864 | 171.381796 |
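
For reference, the updates behind an iteration history of this kind can be written in the standard textbook form of the EM algorithm for a multivariate normal model. This is a sketch of the general algorithm, not a transcription of the MI procedure's internal computations. The notation is introduced here as an assumption: $y_i$ is the $i$th of $n$ observations, the subscripts obs and mis select the observed and missing components of that observation, and $(\mu^{(t)}, \Sigma^{(t)})$ are the estimates at iteration $t$.

```latex
% E-step: conditional mean and covariance of the missing part of y_i,
% given its observed part and the current estimates
\hat{y}_{i,\mathrm{mis}}
  = \mu^{(t)}_{\mathrm{mis}}
  + \Sigma^{(t)}_{\mathrm{mis,obs}}
    \left( \Sigma^{(t)}_{\mathrm{obs,obs}} \right)^{-1}
    \left( y_{i,\mathrm{obs}} - \mu^{(t)}_{\mathrm{obs}} \right)

C_{i,\mathrm{mis,mis}}
  = \Sigma^{(t)}_{\mathrm{mis,mis}}
  - \Sigma^{(t)}_{\mathrm{mis,obs}}
    \left( \Sigma^{(t)}_{\mathrm{obs,obs}} \right)^{-1}
    \Sigma^{(t)}_{\mathrm{obs,mis}}

% M-step: \hat{y}_i is y_i with its missing entries replaced by
% \hat{y}_{i,mis}; C_i is zero except in its (mis, mis) block
\mu^{(t+1)} = \frac{1}{n} \sum_{i=1}^{n} \hat{y}_i

\Sigma^{(t+1)} = \frac{1}{n} \sum_{i=1}^{n}
  \left[ \left( \hat{y}_i - \mu^{(t+1)} \right)
         \left( \hat{y}_i - \mu^{(t+1)} \right)^{\top} + C_i \right]
```

When the starting covariance matrix is diagonal, the cross-covariance term in the E-step is zero, so each missing value is filled in with the current mean and the mean update leaves the available-case means unchanged. This is consistent with the means at iteration 1 being identical to those at iteration 0 in the table.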
The procedure then displays the EM (MLE) parameter estimates,
which are the maximum likelihood estimates for the mean vector μ
and the covariance matrix Σ
of a multivariate normal distribution from the data set FitMiss.
Output 9.1.7: EM (MLE) Parameter Estimates
| _TYPE_ | _NAME_   |     Oxygen |   RunTime |   RunPulse |
| MEAN   |          |  47.104086 | 10.554864 | 171.381796 |
| COV    | Oxygen   |  27.798014 | -6.457929 | -18.030790 |
| COV    | RunTime  |  -6.457929 |  2.015491 |   3.516092 |
| COV    | RunPulse | -18.030790 |  3.516092 |  97.766559 |
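
To compare the EM estimates with the "Pairwise Correlations" table, the estimated covariances can be converted to correlations by dividing each covariance by the product of the two corresponding standard deviations. The DATA step below is a minimal sketch that does this arithmetic with values copied from the "EM (MLE) Parameter Estimates" table; the variable names are hypothetical.

```sas
/* Sketch: correlation = cov(a,b) / sqrt(var(a) * var(b)),      */
/* using values copied from the EM (MLE) Parameter Estimates.   */
data _null_;
   r_Oxygen_RunTime   =  -6.457929 / sqrt(27.798014 *  2.015491);
   r_Oxygen_RunPulse  = -18.030790 / sqrt(27.798014 * 97.766559);
   r_RunTime_RunPulse =   3.516092 / sqrt( 2.015491 * 97.766559);
   put r_Oxygen_RunTime= r_Oxygen_RunPulse= r_RunTime_RunPulse=;
run;
```

The resulting values are written to the SAS log and need not equal the pairwise correlations, because the EM estimates use information from all observations jointly rather than from each pair of variables separately.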
You can also output the EM (MLE) parameter estimates
into an output data set with the OUTEM= option.
The following statements list the observations
in the output data set outem.
   proc print data=outem;
      title 'EM Estimates';
   run;
Output 9.1.8: EM Estimates
| Obs | _TYPE_ | _NAME_   |   Oxygen | RunTime | RunPulse |
| 1   | MEAN   |          |  47.1041 | 10.5549 |  171.382 |
| 2   | COV    | Oxygen   |  27.7980 | -6.4579 |  -18.031 |
| 3   | COV    | RunTime  |  -6.4579 |  2.0155 |    3.516 |
| 4   | COV    | RunPulse | -18.0308 |  3.5161 |   97.767 |
The output data set outem is a TYPE=COV data set.
The observation with _TYPE_='MEAN' contains
the MLE for the parameter μ (the mean vector),
and the observations with _TYPE_='COV' contain
the MLE for the parameter Σ (the covariance matrix)
of a multivariate normal distribution from the data set FitMiss.
Copyright © 2001 by SAS Institute Inc., Cary, NC, USA. All rights reserved.