Previous Page | Next Page

Using More Than One Observation in a Calculation

Obtaining a Total for Each BY Group

An additional requirement of Tradewinds Travel is to determine the number of tours that are scheduled with each vendor. In order to accomplish this task, a program must group the data by a variable; that is, the program must organize the data set into groups of observations, with one group for each vendor. In this case, the program must group the data by the Vendor variable. Each group is known generically as a BY group; the variable that is used to determine the groupings is called a BY variable.

In order to group the data by the Vendor variable, the program must

The following program sorts the data set by Vendor and sums the total schedulings for each vendor.
options pagesize=60 linesize=80 pageno=1 nodate;
proc sort  data=mylib.tourrevenue out=mylib.sorttour;
    by Vendor;
run;

data totalby;
   set mylib.sorttour;
   by Vendor;
   if First.Vendor then VendorBookings = 0;
   VendorBookings + NumberOfBookings;
   run;

   proc print data=totalby;
      title 'Summary of Bookings by Vendor';
   run;

In the preceding program, the FIRST.Vendor variable is used in an IF-THEN statement to set the sum variable (VendorBookings) to 0 in the first observation of each BY group. (For more information on the FIRST.variable and LAST.variable temporary variables, see Finding the First or Last Observation in a Group.)

The following output displays the results.

Creating Totals for BY Groups

                         Summary of Bookings by Vendor                         1

                                                    Number
                               Land                   Of        Vendor
         Obs    Country        Cost    Vendor      Bookings    Bookings

           1    France          575    Express        10          10   
           2    India           489    Express         .          10   
           3    Japan           720    Express        10          20   
           4    Greece          698    Express        20          40   
           5    Italy           468    Express         9          49   
           6    Ireland         558    Express         9          58   
           7    New Zealand    1489    Southsea        6           6   
           8    Australia      1079    Southsea       10          16   
           9    Spain           510    World          12          12   
          10    Brazil          540    World           6          18   
          11    Venezuela       425    World           8          26   
          12    USSR            924    World           6          32   
          13    Switzerland     734    World          20          52   

Notice that while this output does in fact include the total number of schedulings for each vendor, it also includes a great deal of extraneous information. Reporting the total schedulings for each vendor requires only the variables Vendor and VendorBookings from the last observation for each vendor. Therefore, the program can

The following program creates data set TOTALBY:

options pagesize=60 linesize=80 pageno=1 nodate;
proc sort data=mylib.tourrevenue out=mylib.sorttour;
   by Vendor;
run;

data totalby(drop=country landcost);
   set mylib.sorttour;
   by Vendor;
   if First.Vendor then VendorBookings = 0;
   VendorBookings + NumberOfBookings;
   if Last.Vendor;
run;

proc print data=totalby;
   title 'Total Bookings by Vendor';
run;

The following output displays the results:

Putting Totals for Each BY Group in a New Data Set

                            Total Bookings by Vendor                           1

                                                       Vendor
                                   Obs    Vendor      Bookings

                                    1     Express        58   
                                    2     Southsea       16   
                                    3     World          52   

Previous Page | Next Page | Top of Page