
Acting on Selected Observations

Selecting Observations


Understanding the Selection Process

The most common way that SAS selects observations for action in a DATA step is through the IF-THEN statement:

IF condition THEN action;

The condition is one or more comparisons, for example,

City = 'Rome'
NumberOfEvents > Nights
TourGuide = 'Lucas' and Nights > 7

(The symbol > stands for greater than. You will see how to use symbols as comparison operators in Understanding Construct Conditions.)

For a given observation, a comparison is either true or false. In the first example, the value of City is either Rome or it is not. In the second example, the value of NumberOfEvents in the current observation is either greater than the value of Nights in the same observation or it is not. If the condition contains more than one comparison, as in the third example, then SAS evaluates all of them according to its rules (discussed later) and declares the entire condition to be true or false.
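As a small illustration of a comparison being either true or false, the following sketch stores the result of two comparisons in variables. SAS evaluates a true comparison to 1 and a false comparison to 0. The single observation is built inline, and the variable names IsRome and TooBusy are hypothetical, chosen only for this sketch.

```sas
/* Illustration only: one hand-built observation, no data set is created */
data _null_;
   City = 'Rome';
   NumberOfEvents = 7;
   Nights = 3;
   IsRome  = (City = 'Rome');             /* 1 when true, 0 when false */
   TooBusy = (NumberOfEvents > Nights);   /* 7 > 3 is true             */
   put IsRome= TooBusy=;                  /* writes both values to the log */
run;
```

With these values, the log line reads IsRome=1 TooBusy=1.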

When the condition is true, SAS takes the action in the THEN clause. The action must be expressed as a SAS statement that can be executed in an individual iteration of the DATA step. Such statements are called executable statements. The most common executable statements are assignment statements, such as

LandCost = LandCost + 30;
Calendar = 'Check schedule';
TourGuide = 'Torres';

This section concentrates on assignment statements in the THEN clause, but examples in other sections show other types of statements that are used with the THEN clause.

Statements that provide information about a data set are not executable. Such statements are called declarative statements. For example, the LENGTH statement affects a variable as a whole, not how the variable is treated in a particular observation. Therefore, you cannot use a LENGTH statement in a THEN clause.
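A minimal sketch of how the two kinds of statements can sit together: the declarative LENGTH statement stands on its own in the DATA step, and only the executable assignment goes in the THEN clause. It assumes the MYLIB.ARTTOURS data set used elsewhere in this section; the output data set name REVISE_LEN is hypothetical.

```sas
data revise_len;
   set mylib.arttours;
   /* Declarative: sets the length of Calendar for the whole data set */
   length Calendar $ 14;
   /* Executable: runs only for observations where the condition is true */
   if NumberOfEvents > Nights then Calendar = 'Check schedule';
run;
```

The length of 14 matches the number of characters in Check schedule.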

When the condition is false, SAS ignores the THEN clause and proceeds to the next statement in the DATA step.


Selecting Observations Based on a Simple Condition

The following DATA step uses the previous example conditions and actions in IF-THEN statements:

options pagesize=60 linesize=80 pageno=1 nodate;
data revise;
   set mylib.arttours;
   if City = 'Rome' then LandCost = LandCost + 30;
   if NumberOfEvents > Nights then Calendar = 'Check schedule';
   if TourGuide = 'Lucas' and Nights > 7 then TourGuide = 'Torres';
run;

proc print data=revise;
   var City Nights LandCost NumberOfEvents TourGuide Calendar;
   title 'Tour Information';
run;

The following output displays the results:

Selecting Observations with IF-THEN Statements

                                Tour Information                               1

                                Land     Number      Tour
  Obs    City         Nights    Cost    OfEvents     Guide        Calendar [2]

   1     Rome            3       780 [1]    7       D'Amico    Check schedule
   2     Paris           8      1680        6       Torres [3]
   3     London          6      1230        5       Wilson                   
   4     New York        6         .        8       Lucas      Check schedule
   5     Madrid          3       370        5       Torres     Check schedule
   6     Amsterdam       4       580        6                  Check schedule

You can see in the output that

[1] the land cost was increased by $30 in the observation for Rome

[2] four observations have a greater number of events than number of nights in the tour

[3] the tour guide for Paris is replaced by Torres because the original tour guide is Lucas and the number of nights in the tour is greater than 7


Providing an Alternative Action

Remember that SAS creates a variable in all observations, even if you do not assign the variable a value in all observations. In the previous output, the value of Calendar is blank in two observations. A second IF-THEN statement can assign a different value, as in these examples:

if NumberOfEvents > Nights then Calendar = 'Check schedule';
if NumberOfEvents <= Nights then Calendar = 'No problems';

(The symbol <= means less than or equal to.) In this case, SAS compares the values of NumberOfEvents and Nights twice, once in each IF condition. A more efficient way to provide an alternative action is to use an ELSE statement:

ELSE action;

An ELSE statement names an alternative action to be taken when the IF condition is false. It must immediately follow the corresponding IF-THEN statement, as shown here:

if NumberOfEvents > Nights then Calendar = 'Check schedule';
else Calendar = 'No problems';

The REVISE2 DATA step adds the preceding ELSE statement to the previous DATA step:

options pagesize=60 linesize=80 pageno=1 nodate;
data revise2;
   set mylib.arttours;
   if City = 'Rome' then LandCost = LandCost + 30;
   if NumberOfEvents > Nights then Calendar = 'Check schedule';
   else Calendar = 'No problems';
   if TourGuide = 'Lucas' and Nights > 7 then TourGuide = 'Torres';
run;

proc print data=revise2;
   var City Nights LandCost NumberOfEvents TourGuide Calendar;
   title 'Tour Information';
run;

The following output displays the results:

Providing an Alternative Action with the ELSE Statement

                                Tour Information                               1

                                Land     Number      Tour
  Obs    City         Nights    Cost    OfEvents     Guide        Calendar

   1     Rome            3       780        7       D'Amico    Check schedule
   2     Paris           8      1680        6       Torres     No problems   
   3     London          6      1230        5       Wilson     No problems   
   4     New York        6         .        8       Lucas      Check schedule
   5     Madrid          3       370        5       Torres     Check schedule
   6     Amsterdam       4       580        6                  Check schedule

Creating a Series of Mutually Exclusive Conditions

Using an ELSE statement after an IF-THEN statement provides one alternative action when the IF condition is false. However, many cases involve a series of mutually exclusive conditions, each of which requires a separate action. In this example, tour prices can be classified as high, medium, or low. A series of IF-THEN and ELSE statements classifies the tour prices appropriately:

if LandCost >= 1500 then Price = 'High  ';
else if LandCost >= 700 then Price = 'Medium';
     else Price = 'Low';

(The symbol >= means greater than or equal to.) To see how SAS executes this series of statements, consider two observations: Amsterdam, whose value of LandCost is 580, and Paris, whose value is 1680.

When the value of LandCost is 580:

  1. SAS tests whether 580 is equal to or greater than 1500, determines that the comparison is false, ignores the THEN clause, and proceeds to the ELSE statement.

  2. The action in the ELSE statement is to evaluate another condition. SAS tests whether 580 is equal to or greater than 700, determines that the comparison is false, ignores the THEN clause, and proceeds to the accompanying ELSE statement.

  3. SAS executes the action in the ELSE statement and assigns Price the value Low.

When the value of LandCost is 1680:

  1. SAS tests whether 1680 is greater than or equal to 1500, determines that the comparison is true, and executes the action in the THEN clause. The value of Price becomes High.

  2. SAS ignores the ELSE statement. Because the entire remaining series is part of the first ELSE statement, SAS skips all remaining actions in the series.

A simple way to think of these actions is to remember that when an observation satisfies one condition in a series of mutually exclusive IF-THEN/ELSE statements, SAS processes that THEN action and skips the rest of the statements. (Therefore, you can increase the efficiency of a program by ordering the IF-THEN/ELSE statements so that the most common conditions appear first.)
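To see why the ELSE keyword matters in such a series, the following sketch classifies each tour twice: once with chained IF-THEN/ELSE statements and once with independent IF-THEN statements. It assumes the MYLIB.ARTTOURS data set used in this section; the data set name COMPARE and the variables Chained and Unchained are hypothetical. A LENGTH statement is used here instead of padding the first assigned value with blanks.

```sas
data compare;
   set mylib.arttours;
   length Chained Unchained $ 6;

   /* Chained: the first true condition wins, the rest are skipped */
   if LandCost >= 1500 then Chained = 'High';
   else if LandCost >= 700 then Chained = 'Medium';
   else Chained = 'Low';

   /* Unchained: every IF is tested, so a later true condition overwrites */
   if LandCost >= 1500 then Unchained = 'High';
   if LandCost >= 700 then Unchained = 'Medium';
   if LandCost < 700 then Unchained = 'Low';
run;

proc print data=compare;
   var City LandCost Chained Unchained;
   title 'Chained Versus Unchained Conditions';
run;
```

For Paris, whose LandCost is 1680, Chained is High but Unchained is Medium, because 1680 also satisfies the second, independent condition.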

The following DATA step includes the preceding series of statements:

options pagesize=60 linesize=80 pageno=1 nodate;
data prices;
   set mylib.arttours;
   if LandCost >= 1500 then Price = 'High  ';
   else if LandCost >= 700 then Price = 'Medium';
          else Price = 'Low';
run;

proc print data=prices;
   var City LandCost Price;
   title 'Tour Prices';
run;

The following output displays the results:

Assigning Mutually Exclusive Values with IF-THEN/ELSE Statements

                                  Tour Prices                                  1

                                           Land
                       Obs    City         Cost    Price

                        1     Rome          750    Medium
                        2     Paris        1680    High  
                        3     London       1230    Medium
                        4     New York        .    Low   
                        5     Madrid        370    Low   
                        6     Amsterdam     580    Low   

Note the value of Price in the fourth observation. The Price value is Low because the LandCost value for the New York trip is missing. In comparisons, SAS treats a numeric missing value as lower than any number, so both conditions are false and the final ELSE statement assigns Low.
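If a missing cost should not be classified as Low, one possible approach is to test for the missing value first in the series. The sketch below is an assumption about how that might look, not part of the original example: the data set name PRICES2 and the label Unknown are hypothetical. MISSING is a standard SAS function that returns 1 when its argument is missing.

```sas
data prices2;
   set mylib.arttours;
   length Price $ 7;                            /* long enough for 'Unknown' */
   if missing(LandCost) then Price = 'Unknown'; /* handle missing cost first */
   else if LandCost >= 1500 then Price = 'High';
   else if LandCost >= 700 then Price = 'Medium';
   else Price = 'Low';
run;

proc print data=prices2;
   var City LandCost Price;
   title 'Tour Prices';
run;
```

Because the test for a missing value comes first, the New York observation receives Unknown and never reaches the final ELSE statement.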
