The MI Procedure

Propensity Score Method for Monotone Missing Data

A propensity score is generally defined as the conditional probability of assignment to a particular treatment given a vector of observed covariates (Rosenbaum and Rubin 1983). In the propensity score method, for each variable with missing values, a propensity score is generated for each observation to estimate the probability that the observation is missing. The observations are then grouped based on these propensity scores, and an approximate Bayesian bootstrap imputation (Rubin 1987, p. 124) is applied to each group (Lavori, Dawson, and Shera 1995).

A data set with variables Y1, Y2, ..., Yp (in that order) is said to have a monotone missing pattern when the event that a variable Yj is observed for a particular individual implies that all previous variables Yk, k < j, are also observed for that individual. The propensity score method uses the following steps to impute values for each variable Yj with missing values:

1. Create an indicator variable Rj with the value 0 for observations with missing Yj and 1 otherwise.

2. Fit a logistic regression model

\mathrm{logit}(p_j) = \beta_0 + \beta_1 Y_1 + \beta_2 Y_2 + \cdots + \beta_{j-1} Y_{j-1}
where pj = Pr(Rj = 0 | Y1, Y2, ..., Yj-1) and logit(p) = log(p / (1 - p)).

3. Create a propensity score for each observation to estimate the probability that its Yj value is missing.

4. Divide the observations into a fixed number of groups (typically five) based on these propensity scores.

5. Apply an approximate Bayesian bootstrap imputation to each group. In group k, suppose that Yobs denotes the n1 observations with nonmissing Yj values and Ymis denotes the n0 observations with missing Yj. The approximate Bayesian bootstrap imputation first draws n1 observations randomly with replacement from Yobs to create a new data set Yobs*. This is a nonparametric analogue of drawing parameters from the posterior distribution of the parameters. The process then draws the n0 values for Ymis randomly with replacement from Yobs*.

Steps 1 through 5 are repeated sequentially for each variable with missing values.
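
The steps above can be sketched in Python for a single variable Yj. This is a minimal illustration under stated assumptions, not the MI procedure's implementation: missing values are represented as None, the logistic model is fit by plain gradient ascent rather than a production solver, groups are formed by equal-count cuts of the sorted propensity scores, and all function names (is_monotone, fit_logistic, propensity, abb_draw, impute_by_propensity) are hypothetical.

```python
import math
import random


def is_monotone(rows):
    """True if no row has an observed value after a missing one (None)."""
    for row in rows:
        seen_missing = False
        for value in row:
            if value is None:
                seen_missing = True
            elif seen_missing:
                return False  # Yj observed although some Yk, k < j, is missing
    return True


def propensity(beta, xi):
    """Estimated Pr(Yj missing | covariates) under the fitted logistic model."""
    eta = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
    eta = max(-30.0, min(30.0, eta))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-eta))


def fit_logistic(X, missing, steps=2000, lr=0.1):
    """Step 2: fit logit(p) = b0 + b1*x1 + ... by gradient ascent (assumption)."""
    n, d = len(X), len(X[0])
    beta = [0.0] * (d + 1)
    for _ in range(steps):
        grad = [0.0] * (d + 1)
        for xi, mi in zip(X, missing):
            err = mi - propensity(beta, xi)
            grad[0] += err
            for k, x in enumerate(xi):
                grad[k + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def abb_draw(observed, n_missing, rng):
    """Step 5: resample Yobs with replacement (Yobs*), then draw from Yobs*."""
    resample = [rng.choice(observed) for _ in observed]
    return [rng.choice(resample) for _ in range(n_missing)]


def impute_by_propensity(X, y, n_groups=5, seed=0):
    """X: complete covariate rows (Y1..Yj-1); y: values of Yj, None if missing."""
    rng = random.Random(seed)
    missing = [1 if v is None else 0 for v in y]      # step 1: 1 means Rj = 0
    beta = fit_logistic(X, missing)                   # step 2
    scores = [propensity(beta, xi) for xi in X]       # step 3
    order = sorted(range(len(y)), key=lambda i: scores[i])
    out = list(y)
    for g in range(n_groups):                         # step 4: equal-count groups
        idx = order[g * len(order) // n_groups:(g + 1) * len(order) // n_groups]
        obs = [y[i] for i in idx if y[i] is not None]
        mis = [i for i in idx if y[i] is None]
        if not mis:
            continue
        if not obs:
            raise ValueError("group has no observed values to draw from")
        for i, v in zip(mis, abb_draw(obs, len(mis), rng)):
            out[i] = v
    return out
```

For a data set with several incomplete variables, this sketch would be called once per variable in the order Y1, ..., Yp, with X holding Y1, ..., Yj-1 and assumed complete at that point. Calling is_monotone on the rows first confirms that the ordering assumption holds.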

Note that the propensity score method was originally designed for a randomized experiment with repeated measures on the response variables. The goal was to impute the missing values on the response variables. The method uses only the covariate information that is associated with whether the imputed variable values are missing. It does not use correlations among variables. It is effective for inferences about the distributions of individual imputed variables, such as a univariate analysis, but it is not appropriate for analyses involving relationships among variables, such as a regression analysis. It can also produce badly biased estimates of regression coefficients when data on predictor variables are missing (Allison 2000).


Copyright © 2001 by SAS Institute Inc., Cary, NC, USA. All rights reserved.