Starting with Raw Data: Beyond the Basics

Testing a Condition before Creating an Observation

Sometimes you need to read part of a record and hold that record in the input buffer while you test for a condition that determines how processing continues. Holding a record so that you can read from it again is useful when you need to test for a condition before SAS creates an observation from a data record. To do this, use the trailing at-sign (@).

For example, to create a SAS data set that is a subset of a larger group of records, you might need to test a condition to decide whether a particular record is used to create an observation. A trailing at-sign placed before the semicolon at the end of an INPUT statement instructs SAS to hold the current data line in the input buffer, which makes the line available to a subsequent INPUT statement in the same iteration of the DATA step. Without the trailing @, the next INPUT statement causes SAS to read a new record into the input buffer.
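
As a minimal sketch of this behavior, the following DATA step reads each data line with two INPUT statements. The data set name held_demo and the two sample records are hypothetical and are not part of the health and fitness club data:

```sas
data held_demo;               /* hypothetical data set name            */
   input IdNumber 1-4 @;      /* trailing @ holds the current line     */
   input Name $ 6-11;         /* reads Name from the same held line    */
   datalines;
1023 David
1049 Amelia
;

proc print data=held_demo;
run;
```

With the trailing @ in place, each iteration should read both fields from one line, so the step creates one observation per data line. If the @ were removed, the second INPUT statement would read Name from the next data line instead of the line that supplied IdNumber.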

You can set up the process to read each record twice by following these steps:

  1. Use an INPUT statement to read a portion of the record.

  2. Use a trailing @ at the end of the INPUT statement to hold the record in the input buffer for the execution of the next INPUT statement.

  3. Use an IF statement on the portion that is read in to test for a condition.

  4. If the condition is met, use another INPUT statement to read the remainder of the record to create an observation.

  5. If the condition is not met, the record is released and control passes back to the top of the DATA step.
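
The steps above can be sketched as follows. This is an illustration under assumed names: the data set sales_only, the variables RecType, Region, and Amount, and the three sample records are all hypothetical. The step keeps only the records whose first column contains S:

```sas
data sales_only;                     /* hypothetical data set name              */
   input RecType $ 1 @;              /* steps 1-2: read one field, hold the line */
   if RecType = 'S';                 /* step 3: test the value that was read     */
                                     /* step 5: other records end the iteration  */
   input Region $ 3-7 Amount 9-12;   /* step 4: read the rest of the held line   */
   datalines;
S East  1200
R East   150
S West   980
;

proc print data=sales_only;
run;
```

Only the lines that begin with S reach the second INPUT statement, so the line that begins with R does not produce an observation.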

To read from a record twice, you must prevent SAS from automatically placing a new record into the input buffer when the next INPUT statement executes. Use of a trailing @ in the first INPUT statement serves this purpose. The trailing @ is one of two line-hold specifiers that enable you to hold a record in the input buffer for further processing.

For example, the health and fitness club data contains information about all members. This DATA step creates a SAS data set that contains only members of the red team:

data red_team;
   input Team $ 13-18 @;                                    /* [1] */
   if Team='red';                                           /* [2] */
   input IdNumber 1-4 StartWeight 20-22 EndWeight 24-26;    /* [3] */
   datalines;
1023 David  red    189 165
1049 Amelia yellow 145 124
1219 Alan   red    210 192
1246 Ravi   yellow 194 177
1078 Ashley red    127 118
1221 Jim    yellow 220   .
;                                                           /* [4] */

proc print data=red_team;  
   title 'Red Team';
run;

In this DATA step, these actions occur:

[1] The INPUT statement reads a record into the input buffer, reads a data value from columns 13 through 18, and assigns that value to the variable Team in the program data vector. The single trailing @ holds the record in the input buffer.

[2] The IF statement enables the current iteration of the DATA step to continue only when the value for Team is red. When the value is not red, the current iteration stops and SAS returns to the top of the DATA step, resets values in the program data vector to missing, and releases the held record from the input buffer.

[3] The INPUT statement executes only when the value of Team is red. It reads the remaining data values from the record held in the input buffer and assigns values to the variables IdNumber, StartWeight, and EndWeight.

[4] The record is released from the input buffer when the program returns to the top of the DATA step.

The following output shows the resulting data set:

Subset Data Set Created with Trailing @

                                    Red Team                                   1

                                    Id       Start      End
                   Obs    Team    Number    Weight    Weight

                    1     red      1023       189       165 
                    2     red      1219       210       192 
                    3     red      1078       127       118 
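
The subsetting IF statement in the red team DATA step can also be written as an explicit DELETE. The following sketch is an alternative formulation, not the form used in this section, and the data set name red_team_del is hypothetical:

```sas
data red_team_del;                   /* hypothetical data set name */
   input Team $ 13-18 @;             /* read Team and hold the line */
   if Team ne 'red' then delete;     /* stop this iteration for other teams */
   input IdNumber 1-4 StartWeight 20-22 EndWeight 24-26;
   datalines;
1023 David  red    189 165
1049 Amelia yellow 145 124
1219 Alan   red    210 192
1246 Ravi   yellow 194 177
1078 Ashley red    127 118
1221 Jim    yellow 220   .
;

proc print data=red_team_del;
   title 'Red Team';
run;
```

Both forms stop the current iteration without writing an observation when Team is not red, so this variant should produce the same three observations shown in the output above.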
