Previous Page | Next Page

Using More Than One Observation in a Calculation

Accumulating a Total for an Entire Data Set


Creating a Running Total

The first task in performing calculations on the data set MYLIB.TOURREVENUE is to find out the total number of people who scheduled tours with Tradewinds Travel. Therefore, a variable is needed whose value starts at 0 and increases by the number of schedulings in each observation. The sum statement gives you that capability:

variable + expression

In a sum statement, the value of the variable on the left side of the plus sign is 0 before the statement is processed for the first time. Processing the statement adds the value of the expression on the right side of the plus sign to the initial value; the sum variable retains the new value until the next processing of the statement. The sum statement ignores a missing value for the expression; the previous total remains unchanged.

The following statement creates the total number of schedulings :

TotalBookings + NumberOfBookings;

The following DATA step includes the sum statement above:

options pagesize=60 linesize=80 pageno=1 nodate;
data total;
   set mylib.tourrevenue;
   TotalBookings + NumberOfBookings;
run;

proc print data=total;
   var Country NumberOfBookings TotalBookings;
   title 'Total Tours Booked';
run;

The following output displays the results:

Accumulating a Total for a Data Set

                               Total Tours Booked                              1

                                          Number
                                            Of         Total
                   Obs    Country        Bookings    Bookings

                     1    France            10           10  
                     2    Spain             12           22  
                     3    Brazil             6           28  
                     4    India              .           28  
                     5    Japan             10           38  
                     6    Greece            20           58  
                     7    New Zealand        6           64  
                     8    Venezuela          8           72  
                     9    Italy              9           81  
                    10    USSR               6           87  
                    11    Switzerland       20          107  
                    12    Australia         10          117  
                    13    Ireland            9          126  
The TotalBookings variable in the last observation of the TOTAL data set contains the total number of schedulings for the year.

Printing Only the Total

If the total is the only information that is needed from the data set, a data set that contains only one observation and one variable (the TotalBookings variable) can be created by writing a DATA step that does all of the following:

When the END= option in the SET statement is specified, the variable that is named in the END= option is set to 1 when the DATA step is processing the last observation; the variable that is named in the END= option is set to 0 for other observations:

SET SAS-data-set <END=variable>;

SAS does not add the END= variable to the data set that is being created. By testing the value of the END= variable, you can determine which observation is the last observation.

The following program selects the last observation with a subsetting IF statement and uses a KEEP= data set option to keep only the variable TotalBookings in the data set:

options pagesize=60 linesize=80 pageno=1 nodate;
data total2(keep=TotalBookings);
   set mylib.tourrevenue end=Lastobs;
   TotalBookings + NumberOfBookings;
   if Lastobs;
run;

proc print data=total2;
   title 'Total Number of Tours Booked';
run;

The following output displays the results:

Selecting the Last Observation in a Data Set

                          Total Number of Tours Booked                         1

                                         Total
                                Obs    Bookings

                                 1        126  

The condition in the subsetting IF statement is true when Lastobs has a value of 1. When SAS is processing the last observation from MYLIB.TOURREVENUE, it assigns to Lastobs the value 1. Therefore, the subsetting IF statement accepts only the last observation from MYLIB.TOURREVENUE, and SAS writes the last observation to the data set TOTAL2.

Previous Page | Next Page | Top of Page