Replace Missing Values Task

About the Replace Missing Values Task

The Replace Missing Values task performs high-performance imputation of numeric variables, which is a common step in data preparation. The task can replace missing numeric values with a value that you specify. It can also replace them with the mean, the pseudo-median, or a random value between the minimum and the maximum of the nonmissing values.
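The replacement strategies described above can be illustrated with a minimal Python sketch. This is not the code that the task generates. The function names `is_missing`, `replacement_value`, and `impute` are hypothetical, missing values are assumed to be represented as `None` or NaN, and the exact median is used as a stand-in for the pseudo-median. Drawing a separate random value for each missing entry is also an assumption. The fallback of 0 when a variable has no nonmissing values is documented for the pseudo-median and random methods and is only assumed here for the mean.

```python
import math
import random
import statistics


def is_missing(x):
    """Treat None and NaN as missing (an assumption of this sketch)."""
    return x is None or (isinstance(x, float) and math.isnan(x))


def replacement_value(present, method, constant=0.0, rng=None):
    """Return one fill value computed from the nonmissing values."""
    if method == "constant":
        return constant
    if not present:
        # No nonmissing values: fall back to 0 (assumed for the mean).
        return 0.0
    if method == "mean":
        return statistics.fmean(present)
    if method == "median":
        # Exact median, used as a stand-in for the pseudo-median.
        return statistics.median(present)
    if method == "random":
        rng = rng or random.Random()
        return rng.uniform(min(present), max(present))
    raise ValueError(f"unknown method: {method}")


def impute(values, method, constant=0.0, rng=None):
    """Return a copy of values with each missing entry replaced."""
    present = [v for v in values if not is_missing(v)]
    return [
        replacement_value(present, method, constant, rng) if is_missing(v) else v
        for v in values
    ]
```

For example, `impute([1.0, None, 3.0], "mean")` fills the gap with the mean of the two nonmissing values.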

Assigning Data to Roles

Role: Description

Roles
  Replace missing values with the mean: replaces missing values with the mean for the variable.
  Replace missing values with the pseudo-median: replaces missing values with the pseudo-median of the variable. If there is no nonmissing value, the pseudo-median is 0.
  Replace missing values with a random number: replaces missing values with a random value that is drawn between the minimum and maximum of the variable. If there is no nonmissing value, the random value is 0.

Additional Roles
  Frequency count: specifies a numeric variable that contains the frequency of occurrence for each observation. If the frequency value is less than 1 or is missing, the observation is not used in the analysis. If no variable is assigned to the Frequency count role, each observation is assigned a frequency of 1.
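The effect of the Frequency count role on a statistic such as the mean can be sketched as follows. This is an illustration under assumptions, not the task's implementation: the helper names `usable_frequency` and `weighted_mean` are hypothetical, missing values are assumed to be `None` or NaN, and each observation is assumed to contribute to the mean as many times as its frequency.

```python
import math


def usable_frequency(freq):
    """Return the frequency if the observation is used, else None.

    Observations whose frequency is missing or less than 1 are excluded.
    """
    if freq is None or (isinstance(freq, float) and math.isnan(freq)):
        return None
    return freq if freq >= 1 else None


def weighted_mean(values, freqs=None):
    """Mean of the nonmissing values, counting each one freq times."""
    if freqs is None:
        # No frequency variable assigned: every observation counts once.
        freqs = [1] * len(values)
    total = 0.0
    count = 0
    for value, freq in zip(values, freqs):
        freq = usable_frequency(freq)
        value_missing = value is None or (
            isinstance(value, float) and math.isnan(value)
        )
        if freq is None or value_missing:
            continue
        total += value * freq
        count += freq
    return total / count if count else None
```

With values 10, 20, and 30 and frequencies 1, 3, and 0, the third observation is ignored and the mean is computed from one 10 and three 20s.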

Setting Options

You can specify whether to create an output data set. This output data set includes the data, imputation indicator variables (0 for not imputed or 1 for imputed), and imputed variables. You can also include any variables from the input data set.
By default, the output data set is saved in the Work library.
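The layout of such an output data set can be sketched in Python. The function `build_output` is hypothetical, rows are modeled as dictionaries with `None` for a missing value, and the `M_` and `IM_` name prefixes for the indicator and imputed variables are assumptions made for illustration.

```python
def build_output(rows, variable, fill, keep=()):
    """Build output rows with an indicator and an imputed variable.

    rows     -- list of dicts, one per observation (None marks missing)
    variable -- name of the numeric variable to impute
    fill     -- replacement value for missing entries
    keep     -- names of input variables to copy to the output
    """
    output = []
    for row in rows:
        value = row.get(variable)
        imputed = value is None
        record = {name: row[name] for name in keep}
        # Indicator: 1 if the value was imputed, 0 if it was not.
        record["M_" + variable] = 1 if imputed else 0
        record["IM_" + variable] = fill if imputed else value
        output.append(record)
    return output
```

Passing `keep=("id",)` copies a hypothetical `id` variable from the input rows alongside the indicator and imputed variables.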