The MI Procedure

PROC MI Statement

PROC MI < options > ;
The following table summarizes the options available in the PROC MI statement.

Table 9.1: Summary of PROC MI Options

Task                                              Option
Specify data sets
  input data set                                  DATA=
  output data set with imputed values             OUT=
Specify imputation details
  number of imputations                           NIMPUTE=
  seed to begin random number generator           SEED=
  units to round imputed variable values          ROUND=
  maximum values for imputed variable values      MAXIMUM=
  minimum values for imputed variable values      MINIMUM=
  singularity tolerance                           SINGULAR=
Specify statistical analysis
  level for the confidence interval, (1-\alpha)   ALPHA=
  means under the null hypothesis                 MU0=
Control printed output
  suppress all displayed output                   NOPRINT
  display univariate statistics and correlations  SIMPLE


The following options can be used in the PROC MI statement (in alphabetical order):

ALPHA= \alpha
specifies that confidence limits be constructed for the mean estimates with confidence level 100(1-\alpha)%, where 0 < \alpha < 1. The default is ALPHA=0.05.

DATA=SAS-data-set
names the SAS data set to be analyzed by PROC MI. By default, the procedure uses the most recently created SAS data set.

MAXIMUM=numbers
specifies maximum values for imputed variables. When an intended imputed value is greater than the maximum, PROC MI redraws another value for imputation. If only one number is specified, that number is used for all variables. If more than one number is specified, you must use a VAR statement, and the specified numbers must correspond to variables in the VAR statement. A missing value indicates no restriction on the maximum for the corresponding variable. The default is MAXIMUM=., no restriction on the maximum.

The MAXIMUM= option is related to the MINIMUM= and ROUND= options, which are used to make the imputed values more consistent with the observed variable values. These options are not applicable if you specify the METHOD=PROPENSITY option in the MONOTONE statement.

When specifying a maximum for the first variable only, you must also specify a missing value after the maximum. Otherwise, the maximum is used for all variables. For example, the option "MAXIMUM= 100 ." sets a maximum of 100 for the first analysis variable only and no maximum for the remaining variables. The option "MAXIMUM= . 100" sets a maximum of 100 for the second analysis variable only and no maximum for the other variables.

MINIMUM=numbers
specifies the minimum values for imputed variables. When an intended imputed value is less than the minimum, PROC MI redraws another value for imputation. If only one number is specified, that number is used for all variables. If more than one number is specified, you must use a VAR statement, and the specified numbers must correspond to variables in the VAR statement. A missing value indicates no restriction on the minimum for the corresponding variable. The default is MINIMUM=., no restriction on the minimum.
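
As a sketch of how the positional MINIMUM= and MAXIMUM= lists line up with the VAR statement, the following step bounds each of three variables differently. The data set WORK.FIT, its variable names and values, the bounds, and the seed are all hypothetical and chosen only for illustration.

```sas
/* Hypothetical toy data; a period (.) marks a missing value.      */
/* The trailing @@ reads several observations from each data line. */
data work.fit;
   input Oxygen RunTime RunPulse @@;
   datalines;
44.6 11.4 178  45.3 10.1 185  54.3  8.7 156  59.6   .    .
49.9  9.2   .  44.8 11.6 176  45.7 11.9 176  49.1   .    .
39.4 13.1 174  60.1  8.6 170  50.5   .    .  37.4 14.0 186
;
run;

/* Bounds are matched by position to the variables in the VAR statement: */
/*   Oxygen:   minimum 0,   no maximum                                   */
/*   RunTime:  no minimum,  maximum 20                                   */
/*   RunPulse: minimum 100, maximum 220                                  */
proc mi data=work.fit seed=501213 nimpute=3
        minimum=0 . 100
        maximum=. 20 220
        out=work.fit_bounded;
   var Oxygen RunTime RunPulse;
run;
```

A missing value (.) in either list leaves that side unrestricted for the variable in the same position.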

MU0=numbers
THETA0=numbers
specifies the parameter values \mu_0 under the null hypothesis \mu=\mu_0 for the population means corresponding to the analysis variables. Each hypothesis is tested with a t test. If only one number is specified, that number is used for all variables. If more than one number is specified, you must use a VAR statement, and the specified numbers must correspond to variables in the VAR statement. The default is MU0=0.

If a variable is transformed as specified in a TRANSFORM statement, then the same transformation for that variable is also applied to its corresponding specified MU0= value in the t test. If the parameter value \mu_0 for a transformed variable is not specified, then \mu_0=0 is used for that transformed variable.
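
The following minimal sketch combines MU0= with ALPHA= so that each analysis variable gets its own null value and 90% confidence limits. The data set, null values, and seed are hypothetical assumptions used only for illustration.

```sas
/* Hypothetical toy data; a period (.) marks a missing value. */
data work.fit;
   input Oxygen RunTime RunPulse @@;
   datalines;
44.6 11.4 178  45.3 10.1 185  54.3  8.7 156  59.6   .    .
49.9  9.2   .  44.8 11.6 176  45.7 11.9 176  49.1   .    .
39.4 13.1 174  60.1  8.6 170  50.5   .    .  37.4 14.0 186
;
run;

/* ALPHA=0.10 requests 100(1-0.10)% = 90% confidence limits.        */
/* MU0= values are matched by position to the VAR statement:        */
/*   H0: mean(Oxygen)=50, mean(RunTime)=10, mean(RunPulse)=180      */
proc mi data=work.fit seed=1305417 nimpute=5
        alpha=0.10
        mu0=50 10 180;
   var Oxygen RunTime RunPulse;
run;
```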

NIMPUTE=number
specifies the number of imputations. The default is NIMPUTE=5. You can specify NIMPUTE=0 to skip the imputation. In this case, only the tables of model information, missing data patterns, descriptive statistics (SIMPLE option), and maximum likelihood estimates (MLE) from the EM algorithm (EM statement) are displayed.
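
As an illustration, NIMPUTE=0 can be used to inspect the missing data patterns and summary statistics before committing to an imputation run. This is a sketch under assumed inputs: the data set and variable names are hypothetical.

```sas
/* Hypothetical toy data; a period (.) marks a missing value. */
data work.fit;
   input Oxygen RunTime RunPulse @@;
   datalines;
44.6 11.4 178  45.3 10.1 185  54.3  8.7 156  59.6   .    .
49.9  9.2   .  44.8 11.6 176  45.7 11.9 176  49.1   .    .
39.4 13.1 174  60.1  8.6 170  50.5   .    .  37.4 14.0 186
;
run;

/* NIMPUTE=0 skips the imputation itself.                        */
/* SIMPLE requests descriptive statistics and correlations.      */
/* The EM statement requests the MLE from the EM algorithm.      */
proc mi data=work.fit nimpute=0 simple;
   em;
   var Oxygen RunTime RunPulse;
run;
```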

NOPRINT
suppresses the display of all output. Note that this option temporarily disables the Output Delivery System (ODS). For more information, refer to the chapter "Using the Output Delivery System" in the SAS/STAT User's Guide, Version 8.

OUT=SAS-data-set
creates an output SAS data set containing imputation results. The data set includes an index variable, _Imputation_, to identify the imputation number. For each imputation, the data set contains all variables in the input data set with missing values replaced by the imputed values. See the "Output Data Sets" section for a description of this data set.

If you want to create a permanent SAS data set, you must specify a two-level name. For more information on permanent SAS data sets, refer to the section "SAS Files" in SAS Language Reference: Concepts, Version 8.
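
The sketch below writes the imputed data to a temporary (one-level WORK) data set and then summarizes each completed data set separately by using _Imputation_ as a BY variable. The data set names, seed, and choice of PROC MEANS are assumptions made for illustration.

```sas
/* Hypothetical toy data; a period (.) marks a missing value. */
data work.fit;
   input Oxygen RunTime RunPulse @@;
   datalines;
44.6 11.4 178  45.3 10.1 185  54.3  8.7 156  59.6   .    .
49.9  9.2   .  44.8 11.6 176  45.7 11.9 176  49.1   .    .
39.4 13.1 174  60.1  8.6 170  50.5   .    .  37.4 14.0 186
;
run;

/* Create three imputed copies of the data; NOPRINT suppresses the */
/* displayed output because only the OUT= data set is needed here. */
proc mi data=work.fit seed=899603 nimpute=3 out=work.fit_mi noprint;
   var Oxygen RunTime RunPulse;
run;

/* _Imputation_ takes the values 1, 2, 3 and identifies each copy. */
proc means data=work.fit_mi n mean std;
   by _Imputation_;
   var Oxygen RunTime RunPulse;
run;
```

With NIMPUTE=3 and 12 input observations, the output data set in this sketch would hold 36 observations, 12 for each value of _Imputation_.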

ROUND=numbers
specifies the units to round variables in the imputation. If only one number is specified, that number is used for all variables. If more than one number is specified, you must use a VAR statement, and the specified numbers must correspond to variables in the VAR statement. The default number is a missing value, which indicates no rounding for imputed variables.

When specifying a roundoff unit for the first variable only, you must also specify a missing value after the roundoff unit. Otherwise, the roundoff unit is used for all variables. For example, the option "ROUND= 10 ." sets a roundoff unit of 10 for the first analysis variable only and no rounding for the remaining variables. The option "ROUND= . 10" sets a roundoff unit of 10 for the second analysis variable only and no rounding for other variables.

You can use the ROUND= option to set the precision of imputed values. For example, with a roundoff unit of 0.001, each value is rounded to the nearest multiple of 0.001. That is, each imputed value is carried to three decimal places. See Example 9.3 for an illustration of this option.
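
A minimal sketch of per-variable roundoff units follows. It assumes a hypothetical data set in which the first two variables are recorded to one decimal place and the third is recorded as a whole number; the names, values, and seed are illustrative only.

```sas
/* Hypothetical toy data; a period (.) marks a missing value. */
data work.fit;
   input Oxygen RunTime RunPulse @@;
   datalines;
44.6 11.4 178  45.3 10.1 185  54.3  8.7 156  59.6   .    .
49.9  9.2   .  44.8 11.6 176  45.7 11.9 176  49.1   .    .
39.4 13.1 174  60.1  8.6 170  50.5   .    .  37.4 14.0 186
;
run;

/* Roundoff units are matched by position to the VAR statement: */
/*   Oxygen and RunTime to the nearest 0.1, RunPulse to the     */
/*   nearest 1, which matches the precision of the observed     */
/*   values in this toy data set.                               */
proc mi data=work.fit seed=21355417 nimpute=2
        round=0.1 0.1 1
        out=work.fit_round noprint;
   var Oxygen RunTime RunPulse;
run;

/* Inspect the imputed values to confirm their precision. */
proc print data=work.fit_round;
run;
```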

SEED=number
specifies a positive integer that PROC MI uses to start the pseudo-random number generator. The default is a value generated from reading the time of day from the computer's clock. However, in order to duplicate the results under identical situations, you must control the value of the seed explicitly rather than rely on the clock reading.

The seed information is displayed in the "Model Information" table so that the results can be reproduced in a later run by specifying the same seed with the SEED= option.
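
One way to check reproducibility is to run the same step twice with the same seed and compare the two output data sets. The following is a sketch under assumed inputs; the data set names and the seed value are hypothetical, and PROC COMPARE is used here only as a convenient checking tool.

```sas
/* Hypothetical toy data; a period (.) marks a missing value. */
data work.fit;
   input Oxygen RunTime RunPulse @@;
   datalines;
44.6 11.4 178  45.3 10.1 185  54.3  8.7 156  59.6   .    .
49.9  9.2   .  44.8 11.6 176  45.7 11.9 176  49.1   .    .
39.4 13.1 174  60.1  8.6 170  50.5   .    .  37.4 14.0 186
;
run;

/* Two runs that differ only in the name of the output data set. */
proc mi data=work.fit seed=37851 nimpute=2 out=work.run1 noprint;
   var Oxygen RunTime RunPulse;
run;

proc mi data=work.fit seed=37851 nimpute=2 out=work.run2 noprint;
   var Oxygen RunTime RunPulse;
run;

/* Compare the imputed values from the two runs. */
proc compare base=work.run1 compare=work.run2;
run;
```

If the input data, options, and software environment are identical, the comparison is expected to find no differences. Omitting SEED= in either run would make the two data sets differ, because the default seed comes from the clock.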

SIMPLE
displays simple descriptive univariate statistics and pairwise correlations from available cases. For a detailed description of these statistics, see the "Descriptive Statistics" section.

SINGULAR=p
specifies the criterion for determining the singularity of a covariance matrix, where 0<p<1. The default is SINGULAR=1E-8.

Suppose that S is a covariance matrix and v is the number of variables in S. Based on the spectral decomposition S = \Gamma \Lambda \Gamma', where \Lambda is a diagonal matrix of eigenvalues \lambda_j, j = 1, ..., v, with \lambda_i \ge \lambda_j when i < j, and \Gamma is a matrix with the corresponding orthonormal eigenvectors of S as columns, S is considered singular when an eigenvalue \lambda_j is less than p \bar{\lambda}, where the average \bar{\lambda} = \sum_{k=1}^{v} \lambda_k / v.
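
As a worked illustration of this criterion, consider a covariance matrix with v = 3 and hypothetical eigenvalues chosen only to show the arithmetic. The same matrix is flagged as singular under the default tolerance but not under a smaller one.

```latex
\begin{align*}
% Hypothetical eigenvalues, ordered so that \lambda_1 \ge \lambda_2 \ge \lambda_3
\lambda_1 &= 3, \quad \lambda_2 = 2, \quad \lambda_3 = 10^{-9} \\
\bar{\lambda} &= \frac{3 + 2 + 10^{-9}}{3} \approx 1.667 \\
% Default tolerance, SINGULAR=1E-8
p = 10^{-8}: \quad p\,\bar{\lambda} &\approx 1.667 \times 10^{-8} > \lambda_3
  \quad\Rightarrow\quad S \text{ is considered singular} \\
% Smaller tolerance, SINGULAR=1E-10
p = 10^{-10}: \quad p\,\bar{\lambda} &\approx 1.667 \times 10^{-10} < \lambda_3
  \quad\Rightarrow\quad S \text{ is not considered singular}
\end{align*}
```

Because the threshold scales with the average eigenvalue, the test depends on the relative size of the smallest eigenvalue rather than on the units of the variables.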


Copyright © 2001 by SAS Institute Inc., Cary, NC, USA. All rights reserved.