Using More Than One Observation in a Calculation |
A further requirement of Tradewinds Travel is a separate SAS data set that contains the tour that generated the most revenue. (The revenue total equals the price of the tour multiplied by the number of schedulings.) One method of creating the new data set might be to follow these three steps:
A more efficient method compares the revenue from all observations in a single DATA step. SAS can retain a value from the current observation to use in future observations. When the processing of the DATA step reaches the next observation, the held value represents information from the previous observation.
The RETAIN statement causes a variable that is created in the DATA step to retain its value from the current observation into the next observation rather than being set to missing at the beginning of each iteration of the DATA step. It is a declarative statement, not an executable statement. This statement has the following form:
RETAIN variable-1 < . . . variable-n>; |
To compare the Revenue value in one observation to the Revenue value in the next observation, create a retained variable named HoldRevenue and assign the value of the current Revenue variable to it. In the next observation, the HoldRevenue variable contains the Revenue value from the previous observation, and its value can be compared to that of Revenue in the current observation.
To see how the RETAIN statement works, look at the next example. The following DATA step outputs observations to data set TEMP before SAS assigns the current revenue to HoldRevenue:
options pagesize=60 linesize=80 pageno=1 nodate; data temp; set mylib.tourrevenue; retain HoldRevenue; Revenue = LandCost * NumberOfBookings; output; HoldRevenue = Revenue; run; proc print data=temp; var Country LandCost NumberOfBookings Revenue HoldRevenue; title 'Tour Revenue'; run;
The following output displays the results:
Retaining a Value By Using the Retain Statement
Tour Revenue 1 Number Land Of Hold Obs Country Cost Bookings Revenue Revenue 1 France 575 10 5750 . 2 Spain 510 12 6120 5750 3 Brazil 540 6 3240 6120 4 India 489 . . 3240 5 Japan 720 10 7200 . 6 Greece 698 20 13960 7200 7 New Zealand 1489 6 8934 13960 8 Venezuela 425 8 3400 8934 9 Italy 468 9 4212 3400 10 USSR 924 6 5544 4212 11 Switzerland 734 20 14680 5544 12 Australia 1079 10 10790 14680 13 Ireland 558 9 5022 10790
The value of HoldRevenue is missing at the beginning of the first observation; it is still missing when the OUTPUT statement writes the first observation to TEMP. After the OUTPUT statement, an assignment statement assigns the value of Revenue to HoldRevenue. Because HoldRevenue is retained, that value is present at the beginning of the next iteration of the DATA step. When the OUTPUT statement executes again, the value of HoldRevenue still contains that value.
To find the largest value of Revenue, assign the value of Revenue to HoldRevenue only when Revenue is larger than HoldRevenue, as shown in the following program:
options pagesize=60 linesize=80 pageno=1 nodate; data mostrevenue; set mylib.tourrevenue; retain HoldRevenue; Revenue = LandCost * NumberOfBookings; if Revenue > HoldRevenue then HoldRevenue = Revenue; run; proc print data=mostrevenue; var Country LandCost NumberOfBookings Revenue HoldRevenue; title 'Tour Revenue'; run;
The following output displays the results:
Holding the Largest Value in a Retained Variable
Tour Revenue 1 Number Land Of Hold Obs Country Cost Bookings Revenue Revenue 1 France 575 10 5750 5750 2 Spain 510 12 6120 6120 3 Brazil 540 6 3240 6120 4 India 489 . . 6120 5 Japan 720 10 7200 7200 6 Greece 698 20 13960 13960 7 New Zealand 1489 6 8934 13960 8 Venezuela 425 8 3400 13960 9 Italy 468 9 4212 13960 10 USSR 924 6 5544 13960 11 Switzerland 734 20 14680 14680 12 Australia 1079 10 10790 14680 13 Ireland 558 9 5022 14680
The value of HoldRevenue in the last observation represents the largest revenue that is generated by any tour. To determine which observation the value came from, create a variable named HoldCountry to hold the name of the country from the observations with the largest revenue. Include HoldCountry in the RETAIN statement to retain its value until explicitly changed. Then use the END= data set option to select the last observation, and use the KEEP= data set option to keep only HoldRevenue and HoldCountry in MOSTREVENUE.
options pagesize=60 linesize=80 pageno=1 nodate; data mostrevenue (keep=HoldCountry HoldRevenue); set mylib.tourrevenue end=LastOne; retain HoldRevenue HoldCountry; Revenue = LandCost * NumberOfBookings; if Revenue > HoldRevenue then do; HoldRevenue = Revenue; HoldCountry = Country; end; if LastOne; run; proc print data=mostrevenue; title 'Country with the Largest Value of Revenue'; run;
Note: The program uses a DO group. Using DO groups enables the program to evaluate a condition once and take more than one action as a result. For more information on DO groups, see Performing More Than One Action in an IF-THEN Statement.
The following output displays the results:
Selecting a New Data Set Using RETAIN and Subsetting IF Statements
Country with the Largest Value of Revenue 1 Hold Obs Revenue HoldCountry 1 14680 Switzerland
Copyright © 2012 by SAS Institute Inc., Cary, NC, USA. All rights reserved.