The MI Procedure
Logistic Regression Method for Monotone Missing Data
The logistic regression method is another imputation method available for classification variables in a data set with a monotone missing pattern.
In the logistic regression method, a logistic regression model is fitted for a classification variable with a set of covariates constructed from the effects. For a binary classification variable, based on the fitted regression model, a new logistic regression model is simulated from the posterior predictive distribution of the parameters and is used to impute the missing values for each variable (Rubin 1987, pp. 169–170).
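For a binary variable $Y$ with responses 1 and 2, a logistic regression model is fitted using observations with observed values for the imputed variable $Y$ and its covariates $X_1$, $X_2$, ..., $X_k$:
$$\mathrm{logit}(p_j) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$$
where $X_1, X_2, \ldots, X_k$ are covariates for $Y_j$, $p_j = \Pr(Y_j = 1 \mid X_1, X_2, \ldots, X_k)$, and $\mathrm{logit}(p) = \log(p / (1 - p))$.
The fitted model includes the regression parameter estimates $\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k)$ and the associated covariance matrix $V$.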
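As an illustration of what the fitted model supplies, the following minimal Python sketch fits a logistic regression by Newton-Raphson and returns the parameter estimates together with a covariance matrix taken as the inverse of the information matrix. It is not the MI procedure's implementation: the function names (`solve`, `score_and_information`, `fit_logistic`), the fixed iteration count, the toy data, and the coding of the response as 1 for response level 1 and 0 for response level 2 are all assumptions made for the example.

```python
import math

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [v / d for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]

def score_and_information(rows, y, beta):
    """Return the score vector and information matrix at beta."""
    m = len(beta)
    score = [0.0] * m
    info = [[0.0] * m for _ in range(m)]
    for r, yi in zip(rows, y):
        mu = sum(b * x for b, x in zip(beta, r))
        p = 1.0 / (1.0 + math.exp(-mu))
        for a in range(m):
            score[a] += (yi - p) * r[a]
            for b in range(m):
                info[a][b] += p * (1.0 - p) * r[a] * r[b]
    return score, info

def fit_logistic(X, y, iterations=25):
    """Fit logit(p) = b0 + b1*x1 + ... by Newton-Raphson.

    X holds covariate rows without an intercept; y is 1 where the
    response is level 1 and 0 where it is level 2 (an assumed coding).
    Returns (beta_hat, V), with V the inverse information matrix.
    """
    rows = [[1.0] + list(r) for r in X]
    beta = [0.0] * len(rows[0])
    for _ in range(iterations):
        score, info = score_and_information(rows, y, beta)
        step = solve(info, [[s] for s in score])
        beta = [b + d[0] for b, d in zip(beta, step)]
    # Recompute the information at the final estimates before inverting.
    _, info = score_and_information(rows, y, beta)
    m = len(beta)
    identity = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    return beta, solve(info, identity)

# Hypothetical complete cases: one covariate, overlapping response groups.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0]]
y = [0, 0, 1, 0, 1, 1, 0, 1]
beta_hat, V = fit_logistic(X, y)
```

Only observations with observed values for both the response and its covariates should be passed to such a fit, matching the description above.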
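The following steps are used to generate imputed values for a binary variable $Y$ with responses 1 and 2:
1. New parameters $\beta_* = (\beta_{*0}, \beta_{*1}, \ldots, \beta_{*k})$ are drawn from the posterior predictive distribution of the parameters:
$$\beta_* = \hat{\beta} + V_h' Z$$
where $V_h$ is the upper triangular matrix in the Cholesky decomposition of the covariance matrix, $V = V_h' V_h$, and $Z$ is a vector of $k + 1$ independent random normal variates.
2. For an observation with missing $Y_j$ and covariates $x_1, x_2, \ldots, x_k$, compute the expected probability that $Y_j = 1$:
$$p_j = \frac{\exp(\mu_j)}{1 + \exp(\mu_j)}$$
where $\mu_j = \beta_{*0} + \beta_{*1} x_1 + \beta_{*2} x_2 + \cdots + \beta_{*k} x_k$.
3. Draw a random uniform variate, $u$, between 0 and 1. If the value of $u$ is less than $p_j$, impute $Y_j = 1$; otherwise impute $Y_j = 2$.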
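The three steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the procedure's actual code: the function names (`cholesky_upper`, `draw_parameters`, `impute_binary`), the seed, and the numeric values of `beta_hat` and `V` are hypothetical, and the covariance matrix is assumed to be positive definite.

```python
import math
import random

def cholesky_upper(V):
    """Return the upper triangular U with V = U' U (V positive definite)."""
    n = len(V)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(V[i][i] - s)
            else:
                L[i][j] = (V[i][j] - s) / L[j][j]
    # The upper factor is the transpose of the lower factor.
    return [[L[j][i] for j in range(n)] for i in range(n)]

def draw_parameters(beta_hat, V, rng):
    """Step 1: draw new parameters as beta_hat + U' z, z standard normal."""
    U = cholesky_upper(V)
    n = len(beta_hat)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [beta_hat[i] + sum(U[j][i] * z[j] for j in range(n))
            for i in range(n)]

def impute_binary(beta_star, x, rng):
    """Steps 2 and 3: compute the probability of response 1, then draw."""
    mu = beta_star[0] + sum(b * xi for b, xi in zip(beta_star[1:], x))
    p = 1.0 / (1.0 + math.exp(-mu))  # same as exp(mu) / (1 + exp(mu))
    u = rng.random()
    return 1 if u < p else 2

# Hypothetical fitted values: an intercept and one covariate.
rng = random.Random(2009)
beta_hat = [-1.0, 0.5]
V = [[0.25, -0.05], [-0.05, 0.04]]
beta_star = draw_parameters(beta_hat, V, rng)
imputed = [impute_binary(beta_star, [x], rng) for x in (0.0, 2.0, 4.0)]
```

In this sketch one parameter draw is shared by every missing observation in a single imputation; repeating the draw for each imputed data set is what carries the parameter uncertainty into the imputations.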
The preceding logistic regression method can be extended to ordinal classification variables with more than two response levels. The ORDER= and DESCENDING options can be used to specify the sort order for the levels of the imputed variables.
Copyright © 2009 by SAS Institute Inc., Cary, NC, USA. All rights reserved.