Acting on Selected Observations |
Understanding Construct Conditions |
When you use an IF-THEN statement, you ask SAS to make a comparison. SAS must determine whether a value is equal to another value, greater than another value, and so on. SAS has six main comparison operators:
Symbol | Mnemonic Operator | Meaning |
---|---|---|
= | EQ | equal to |
¬=, ^= , ~= | NE | not equal to (the ¬, ^, or ~ symbol, depending on your keyboard) |
> | GT | greater than |
< | LT | less than |
>= | GE | greater than or equal to |
<= | LE | less than or equal to |
The symbols in the table are based on mathematical symbols; the letter abbreviations, known as mnemonic operators, have the same effect. Use the form that you prefer, but remember that you can use the mnemonic operators only in comparisons. For example, the equal sign in an assignment statement must be represented by the symbol =, not the mnemonic operator. Both of the following statements compare the number of nights in the tour to six:
The terms on each side of the comparison operator can be variables, expressions, or constants. The side a particular term appears on does not matter, as long as you use the correct operator. All of the following comparisons are constructed correctly for use in SAS statements:
Selecting an Observation Based on Simple Conditions |
The following DATA step illustrates some of these conditions:
options pagesize=60 linesize=80 pageno=1 nodate; data changes; set mylib.arttours; if Nights >= 6 then Stay = 'Week+'; else Stay = 'Days'; if LandCost ne . then Remarks = 'OK '; else Remarks = 'Redo'; if LandCost lt 600 then Budget = 'Low '; else Budget = 'Medium'; if NumberOfEvents / Nights > 2 then Pace = 'Too fast'; else Pace = 'OK'; run; proc print data=changes; var City Nights LandCost NumberOfEvents Stay Remarks Budget Pace; title 'Tour Information'; run;
The following output displays the results:
Assigning Values to Variables According to Specific Conditions
Tour Information 1 Land Number Obs City Nights Cost OfEvents Stay Remarks Budget Pace 1 Rome 3 750 7 Days OK Medium Too fast 2 Paris 8 1680 6 Week+ OK Medium OK 3 London 6 1230 5 Week+ OK Medium OK 4 New York 6 . 8 Week+ Redo Low OK 5 Madrid 3 370 5 Days OK Low OK 6 Amsterdam 4 580 6 Days OK Low OK
Using More Than One Comparison in a Condition |
You can specify more than one comparison in a condition with these operators:
A condition can contain any number of ANDs, ORs, or both.
When comparisons are connected by AND, all of the comparisons must be true for the condition to be true. Consider this example:
if City = 'Paris' and TourGuide = 'Lucas' then Remarks = 'Bilingual';
The comparison is true for observations in which the value of City is Paris and the value of TourGuide is Lucas .
A common comparison is to determine whether a value is between two quantities, greater than one quantity and less than another quantity. For example, to select observations in which the value of LandCost is greater than or equal to 1000, and less than or equal to 1500, you can write a comparison with AND:
if LandCost >= 1000 and LandCost <= 1500 then Price = '1000-1500';
A simpler way to write this comparison is
if 1000 <= LandCost <= 1500 then Price = '1000-1500';
This comparison has the same meaning as the previous one. You can use any of the operators <, <=, >, >=, or their mnemonic equivalents in this way.
The following DATA step includes these multiple comparison statements:
options pagesize=60 linesize=80 pageno=1 nodate; data showand; set mylib.arttours; if City = 'Paris' and TourGuide = 'Lucas' then Remarks = 'Bilingual'; if 1000 <= LandCost <= 1500 then Price = '1000-1500'; run; proc print data=showand; var City LandCost TourGuide Remarks Price; title 'Tour Information'; run;
The following output displays the results:
Using AND When Making Multiple Comparisons
Tour Information 1 Land Tour Obs City Cost Guide Remarks Price 1 Rome 750 D'Amico 2 Paris 1680 Lucas Bilingual 3 London 1230 Wilson 1000-1500 4 New York . Lucas 5 Madrid 370 Torres 6 Amsterdam 580
When comparisons are connected by OR, only one of the comparisons needs to be true for the condition to be true. Consider the following example:
if LandCost gt 1500 or LandCost / Nights gt 200 then Level = 'Deluxe';
Any observation in which the land cost is over $1500, the cost per night is over $200, or both, satisfies the condition. The following DATA step shows this condition:
options pagesize=60 linesize=80 pageno=1 nodate; data showor; set mylib.arttours; if LandCost gt 1500 or LandCost / Nights gt 200 then Level = 'Deluxe'; run; proc print data=showor; var City LandCost Nights Level; title 'Tour Information'; run;
The following output displays the results:
Using OR When Making Multiple Comparisons
Tour Information 1 Land Obs City Cost Nights Level 1 Rome 750 3 Deluxe 2 Paris 1680 8 Deluxe 3 London 1230 6 Deluxe 4 New York . 6 5 Madrid 370 3 6 Amsterdam 580 4
Be careful when you combine negative operators with OR. Often, the operator that you really need is AND. For example, the variable TourGuide contains some problems with the data. In the observation for Paris, the tour guide and the backup tour guide are both Lucas; in the observation for Amsterdam, the name of the tour guide is missing. You want to label the observations that have no problems with TourGuide as OK. Should you write the IF condition with OR or with AND?
The following DATA step shows both conditions:
options pagesize=60 linesize=80 pageno=1 nodate; data test; set mylib.arttours; if TourGuide ne BackUpGuide or TourGuide ne ' ' then GuideCheckUsingOR = 'OK'; else GuideCheckUsingOR = 'No'; if TourGuide ne BackUpGuide and TourGuide ne ' ' then GuideCheckUsingAND = 'OK'; else GuideCheckUsingAND = 'No'; run; proc print data = test; var City TourGuide BackUpGuide GuideCheckUsingOR GuideCheckUsingAND; title 'Negative Operators with OR and AND'; run;
The following output displays the results:
Using Negative Operators When Making Comparisons
Negative Operators with OR and AND 1 Guide Guide Tour BackUp Check Check Obs City Guide Guide UsingOR UsingAND 1 Rome D'Amico Torres OK OK 2 Paris Lucas Lucas OK 1 No 3 London Wilson Lucas OK OK 4 New York Lucas D'Amico OK OK 5 Madrid Torres D'Amico OK OK 6 Amsterdam Vandever OK 2 No
In the IF-THEN/ELSE statements that create GuideCheckUsingOR, only one comparison needs to be true to make the condition true. Note that for the Paris and Amsterdam observations in the data set MYLIB.ARTTOURS,
in the observation for Paris, TourGuide does not have a missing value and the comparison TourGuide NE ' ' is true. | |
for Amsterdam, the comparison TourGuide NE BackUpGuide is true. |
Because one OR comparison is true in each observation, GuideCheckUsingOR is labeled OK for all observations. The IF-THEN/ELSE statements that create GuideCheckUsingAND achieve better results. That is, the AND operator selects the observations in which the value of TourGuide is not the same as BackUpGuide and is not missing.
A condition can contain both ANDs and ORs. When it does, SAS evaluates the ANDs before the ORs. The following example specifies a list of cities and a list of guides:
/* first attempt */ if City = 'Paris' or City = 'Rome' and TourGuide = 'Lucas' or TourGuide = "D'Amico" then Topic = 'Art history';
SAS first joins the items that are connected by AND:
City = 'Rome' and TourGuide = 'Lucas'
Then SAS makes the following OR comparisons:
City = 'Paris' or City = 'Rome' and TourGuide = 'Lucas' or TourGuide = "D'Amico"
To group the City comparisons and the TourGuide comparisons, use parentheses:
/* correct method */ if (City = 'Paris' or City = 'Rome') and (TourGuide = 'Lucas' or TourGuide = "D'Amico") then Topic = 'Art history';
SAS evaluates the comparisons within parentheses first and uses the results as the terms of the larger comparison. You can use parentheses in any condition to control the grouping of comparisons or to make the condition easier to read.
The following DATA step illustrates these conditions:
options pagesize=60 linesize=80 pageno=1 nodate; data combine; set mylib.arttours; if (City = 'Paris' or City = 'Rome') and (TourGuide = 'Lucas' or TourGuide = "D'Amico") then Topic = 'Art history'; run; proc print data=combine; var City TourGuide Topic; title 'Tour Information'; run;
The following output displays the results:
Using Parentheses to Combine Comparisons with AND and OR
Tour Information 1 Tour Obs City Guide Topic 1 Rome D'Amico Art history 2 Paris Lucas Art history 3 London Wilson 4 New York Lucas 5 Madrid Torres 6 Amsterdam
Two points about numeric comparisons are especially helpful to know:
In computing terms, a value of TRUE is 1 and a value of FALSE is 0. In SAS, the following is true
Therefore, a numeric variable or expression can stand alone in a condition. If its value is a number other than 0 or if the value is missing, then the condition is true; if its value is 0 or missing, then the condition is false.
The following example assigns a value to the variable Remarks only if the value of LandCost is present for a given observation:
if LandCost then Remarks = 'Ready to budget';
This statement is equivalent to
if LandCost ne . and LandCost ne 0 then Remarks = 'Ready to budget';
Be careful when you abbreviate comparisons with OR; it is easy to produce unexpected results. For example, this IF-THEN statement selects tours that last six or eight nights:
/* first try */ if Nights = 6 or 8 then Stay = 'Medium';
SAS treats the condition as the following comparisons:
Nights=6 or 8
The second comparison does not use the values of Nights; it is simply the number 8 standing alone. Because the number 8 is neither 0 nor a missing value, it always has the value TRUE. Because only one comparison in a series of OR comparisons needs to be true to make the condition true, this condition is true for all observations.
The following comparisons correctly select observations that have six or eight nights:
/* correct way */ if Nights = 6 or Nights = 8 then Stay = 'Medium';
The following DATA step includes these IF-THEN statements:
options pagesize=60 linesize=80 pageno=1 nodate; data morecomp; set mylib.arttours; if LandCost then Remarks = 'Ready to budget'; else Remarks = 'Need land cost'; if Nights = 6 or Nights = 8 then Stay = 'Medium'; else Stay = 'Short'; run; proc print data=morecomp; var City Nights LandCost Remarks Stay; title 'Tour Information'; run;
The following output displays the results:
Abbreviating Numeric Comparisons
Tour Information 1 Land Obs City Nights Cost Remarks Stay 1 Rome 3 750 Ready to budget Short 2 Paris 8 1680 Ready to budget Medium 3 London 6 1230 Ready to budget Medium 4 New York 6 . Need land cost Medium 5 Madrid 3 370 Ready to budget Short 6 Amsterdam 4 580 Ready to budget Short
Copyright © 2012 by SAS Institute Inc., Cary, NC, USA. All rights reserved.