
Missing Values

When Variable Values Are Automatically Set to Missing by SAS


When Reading Raw Data

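At the beginning of each iteration of the DATA step, SAS sets the value of each variable you create in the DATA step to missing, with the following exceptions: variables named in a RETAIN statement, variables created in a sum statement, data elements in a _TEMPORARY_ array, variables created with options in the FILE or INFILE statements, and automatic variables.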

SAS replaces the missing values as it encounters values that you assign to the variables. Thus, if you use program statements to create new variables, their values in each observation are missing until you assign the values in an assignment statement, as shown in the following DATA step:

   data new;
      input x;
      if x=1 then y=2;
      datalines;
   4
   1
   3
   1
   ;

This DATA step produces a SAS data set with the following variable values:

   OBS   X   Y
    1    4   .
    2    1   2
    3    3   .
    4    1   2

When X equals 1, the value of Y is set to 2. Since no other statements set Y's value when X is not equal to 1, Y remains missing (.) for those observations.
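For contrast, here is a minimal sketch of the same step with Y named in a RETAIN statement, which is one of the cases in which SAS does not reset a variable at the start of each iteration. The data set name NEW_RETAINED is hypothetical and the input values are the same as above.

```sas
data new_retained;
   retain y;                /* Y is not reset to missing at each iteration */
   input x;
   if x=1 then y=2;         /* once assigned, the value carries forward */
   datalines;
4
1
3
1
;

proc print data=new_retained;
run;
```

Under these assumptions, Y is missing only in the first observation, where X is 4. After X equals 1 in the second observation, Y stays 2 in the third observation even though X is 3 there.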


When Reading a SAS Data Set

When variables are read with a SET, MERGE, or UPDATE statement, SAS sets the values to missing only before the first iteration of the DATA step. (If you use a BY statement, the variable values are also set to missing when the BY group changes.) The variables retain their values until new values become available; for example, through an assignment statement or through the next execution of the SET, MERGE, or UPDATE statement. Variables created with options in the SET, MERGE, and UPDATE statements also retain their values from one iteration to the next.
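The retention described above can be illustrated with a conditional SET statement. This is only a sketch: the data set names TOTALS, DETAIL, and PCT and the variables TOTAL, AMOUNT, and SHARE are hypothetical.

```sas
data totals;
   total = 100;
run;

data detail;
   input amount;
   datalines;
25
50
;

data pct;
   if _n_ = 1 then set totals;   /* executed only once; TOTAL is retained afterward */
   set detail;                   /* supplies a new AMOUNT on each iteration */
   share = amount / total;       /* SHARE is created here, so it is reset each iteration */
run;

proc print data=pct;
run;
```

Because TOTAL is read with a SET statement, it keeps the value 100 in both observations of PCT even though the first SET statement executes only during the first iteration.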

When all of the rows in a data set in a match-merge operation (with a BY statement) have been processed, the variables in the output data set retain their values as described earlier. That is, as long as there is no change in the BY value in effect when all of the rows in the data set have been processed, the variables in the output data set retain their values from the final observation. FIRST.variable and LAST.variable, the automatic variables that are generated by the BY statement, both retain their values. Their initial value is 1.

When the BY value changes, the variables are set to missing and remain missing because the data set contains no additional observations to provide replacement values. When all of the rows in a data set in a one-to-one merge operation (without a BY statement) have been processed, the variables in the output data set are set to missing and remain missing.
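The difference between the two kinds of merge can be sketched with two small data sets. The names A, B, MATCHED, and ONETOONE are hypothetical, and both inputs are assumed to be already sorted by ID.

```sas
data a;
   input id x;
   datalines;
1 10
;

data b;
   input id y;
   datalines;
1 100
1 200
2 300
;

/* Match-merge: within BY group ID=1, X keeps the value 10 after A is */
/* exhausted. When ID changes to 2, X is set to missing and stays so. */
data matched;
   merge a b;
   by id;
run;

/* One-to-one merge: once A is exhausted, X is set to missing and */
/* remains missing for the remaining observations from B.         */
data onetoone;
   merge a b;
run;

proc print data=matched;
run;

proc print data=onetoone;
run;
```

In MATCHED, X is 10 in the first two observations and missing in the third. In ONETOONE, X is 10 only in the first observation.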
