Missing Values

When fitting a model, the ADAPTIVEREG procedure excludes observations that have missing values for the response variable, weight variable, or frequency variable. It also excludes observations with invalid response, weight, or frequency values. For observations that have valid response, weight, and frequency values but missing predictor values, the ADAPTIVEREG procedure can either include them in model fitting or exclude them.

By default, observations with missing values in the predictor variables are included in the model fitting. Suppose a variable $\mb {v}$ contains missing values. When $\mb {v}$ is considered in the forward selection step, the ADAPTIVEREG procedure automatically forms two candidate bases: $\mb {B}_ m=I(v\mathrm{~ is~ not~ missing})$ and $\mb {B}_{m+1}=I(v\mathrm{~ is~ missing})$. Here $I(\cdot )$ is a scalar-valued indicator function that returns 1 when its argument is true and 0 when its argument is false.

If the transformation of $\mb {v}$ with a parent basis $\mb {B}_ i$ and a knot (or a subset) $t$ turns out to be the best one during this iteration, then two more bases are added to the model:

\[  \mb {B}_{m+2}=\mb {B}_ i \mb {B}_{m}\mb {T}_1(v-t)  \]
\[  \mb {B}_{m+3}=\mb {B}_ i \mb {B}_{m}\mb {T}_2(v-t)  \]

The indicator function does not contribute to the interaction order of the constructed bases. This approach assumes that the missingness in the training data is representative of missingness in future data to be predicted.
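The basis construction described above can be illustrated with a short sketch outside of SAS. This is not the procedure's implementation. It assumes that $\mb {T}_1$ and $\mb {T}_2$ are the usual pair of hinge transformations, $\max (v-t,0)$ and $\max (t-v,0)$, which this section does not define, and it multiplies both transformed bases by the not-missing indicator because the transformation can be evaluated only where $v$ is observed. The function name `missing_aware_bases` and the sample data are hypothetical.

```python
import math


def missing_aware_bases(v, parent, t):
    """Build illustrative columns B_m, B_{m+1}, B_{m+2}, B_{m+3} for one variable.

    v      : list of floats, with None or NaN marking a missing value
    parent : list of parent-basis values B_i, one per observation
    t      : knot (assumed numeric here; subsets of levels are not handled)
    """
    b_m, b_m1, b_m2, b_m3 = [], [], [], []
    for value, b_i in zip(v, parent):
        is_missing = value is None or math.isnan(value)
        not_missing = 0.0 if is_missing else 1.0

        # Assumed hinge pair; evaluated only where v is observed.
        t1 = 0.0 if is_missing else max(value - t, 0.0)
        t2 = 0.0 if is_missing else max(t - value, 0.0)

        b_m.append(not_missing)            # I(v is not missing)
        b_m1.append(1.0 - not_missing)     # I(v is missing)
        b_m2.append(b_i * not_missing * t1)
        b_m3.append(b_i * not_missing * t2)
    return b_m, b_m1, b_m2, b_m3


# Hypothetical data: the third observation has a missing value.
v = [1.0, 4.0, None, 2.5]
parent = [1.0, 1.0, 1.0, 1.0]  # constant parent basis
b_m, b_m1, b_m2, b_m3 = missing_aware_bases(v, parent, t=2.0)
```

In this sketch the observation with the missing value contributes only through the missing indicator, and both transformed bases are zero for it, so the observation stays in the fit without requiring an imputed value.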

Alternatively, you can specify the NOMISS option in the MODEL statement to exclude from the model fitting all observations that have missing values in the predictor variables.
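The two ways of handling observations can be summarized in a small sketch outside of SAS. It is an illustration of the rules stated in this section, not the procedure's code: the function `select_rows`, the key names `y`, `weight`, and `freq`, and the `nomiss` flag (named after the NOMISS option) are all hypothetical. The check for invalid response, weight, or frequency values is omitted because this section does not define which values are invalid.

```python
import math


def is_missing(x):
    """Treat None and NaN as missing."""
    return x is None or (isinstance(x, float) and math.isnan(x))


def select_rows(rows, predictors, nomiss=False):
    """Return the observations that would be used in model fitting.

    Rows with a missing response, weight, or frequency are always dropped.
    Rows with missing predictors are dropped only when nomiss is True.
    """
    kept = []
    for row in rows:
        if any(is_missing(row[k]) for k in ("y", "weight", "freq")):
            continue
        if nomiss and any(is_missing(row[p]) for p in predictors):
            continue
        kept.append(row)
    return kept


# Hypothetical data set with one missing predictor and one missing response.
rows = [
    {"y": 1.0, "weight": 1.0, "freq": 1, "x1": 0.5, "x2": None},
    {"y": None, "weight": 1.0, "freq": 1, "x1": 0.2, "x2": 0.1},
    {"y": 2.0, "weight": 1.0, "freq": 1, "x1": 0.3, "x2": 0.7},
]

default_rows = select_rows(rows, ["x1", "x2"])              # default behavior
strict_rows = select_rows(rows, ["x1", "x2"], nomiss=True)  # NOMISS-like behavior
```

With the default setting, the first and third observations are kept. With the NOMISS-like setting, only the third observation remains.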