Previous Page | Next Page

Updating SAS Data Sets

Updating with Incremental Values

Some applications do not update a data set by overwriting values in the master data set with new values from a transaction data set. Instead, they update a variable by mathematically manipulating its value based on the value of a variable in the transaction data set.

In this example, a bookstore uses SAS to keep track of weekly sales and year-to-date sales. The program that follows creates, sorts by Title, and displays the data set, YEAR_SALES, which contains the year-to-date information.

data year_sales;
   input Title $ 1-25 Author $ 27-50 Sales;
   datalines;
The Milagro Beanfield War Nichols, John            303
The Stranger              Camus, Albert            150
Always Coming Home        LeGuin, Ursula            79
Falling through Space     Gilchrist, Ellen         128
Don Quixote               Cervantes, Miguel de      87
The Handmaid's Tale       Atwood, Margaret          64
;

proc sort data=year_sales;
   by title;
run;

proc print data=year_sales (obs=6);
   title 'Bookstore Sales, Year-to-Date';
   title2 'By Title';
run;

The following output displays the YEAR_SALES data set:

The YEAR_SALES Data Set, Sorted by Title

                         Bookstore Sales, Year-to-Date                         1
                                    By Title

       Obs    Title                        Author                  Sales

        1     Always Coming Home           LeGuin, Ursula            79 
        2     Don Quixote                  Cervantes, Miguel de      87 
        3     Falling through Space        Gilchrist, Ellen         128 
        4     The Handmaid's Tale          Atwood, Margaret          64 
        5     The Milagro Beanfield War    Nichols, John            303 
        6     The Stranger                 Camus, Albert            150 

Every Saturday a SAS data set is created containing information about all the books that were sold during the past week. The program following creates, sorts by Title, and displays the data set WEEK_SALES, which contains the current week's information.

data week_sales;
   input Title $ 1-25 Author $ 27-50 Sales;
   datalines;
The Milagro Beanfield War Nichols, John            32
The Stranger              Camus, Albert            17
Always Coming Home        LeGuin, Ursula           10
Falling through Space     Gilchrist, Ellen         12
The Accidental Tourist    Tyler, Anne              15
The Handmaid's Tale       Atwood, Margaret         8
;
proc sort data=week_sales;
   by title;
run;

proc print data=week_sales;
   title 'Bookstore Sales for Current Week';
   title2 'By Title';
run;

The following output shows the data set, which contains the same variables as the year-to-date data set, but the variable Sales represents sales for only one week:

The WEEK_SALES Data Set, Sorted by Title

                        Bookstore Sales for Current Week                       1
                                    By Title

         Obs    Title                             Author         Sales

          1     Always Coming Home           LeGuin, Ursula        10 
          2     Falling through Space        Gilchrist, Ellen      12 
          3     The Accidental Tourist       Tyler, Anne           15 
          4     The Handmaid's Tale          Atwood, Margaret       8 
          5     The Milagro Beanfield War    Nichols, John         32 
          6     The Stranger                 Camus, Albert         17 

Note:   If the transaction data set is updating only titles that are already in YEAR_SALES, it does not need to contain the variable Author. However, because this variable is there, the transaction data set can be used to add complete observations to the master data set.  [cautionend]

The program that follows uses the weekly information to update the year-to-date data set and displays the new data set.

data total_sales;
   drop NewSales; 3 
   update year_sales week_sales (rename=(Sales=NewSales)); 1 
   by Title;
   sales=sum(Sales,NewSales); 2 
run;

proc print data=total_sales;
   title 'Updated Year-to-Date Sales';
run;

The following list corresponds to the numbered items in the preceding program:

[1] The RENAME= data set option in the UPDATE statement changes the name of the variable Sales in the transaction data set (WEEK_SALES) to NewSales. As a result, these values do not replace the value of Sales that are read from the master data set (YEAR_SALES).

[2] The Sales value that is in the updated data set (TOTAL_SALES) is the sum of the year-to-date sales and the weekly sales.

[3] The program drops the variable NewSales because it is not needed in the new data set.

The following output shows that in addition to updating sales information for the titles already in the master data set, the UPDATE statement has added a new title, The Accidental Tourist.

Updating Year-to-Date Sales with Weekly Sales

                           Updated Year-to-Date Sales                          1

       Obs    Title                        Author                  Sales

        1     Always Coming Home           LeGuin, Ursula            89 
        2     Don Quixote                  Cervantes, Miguel de      87 
        3     Falling through Space        Gilchrist, Ellen         140 
        4     The Accidental Tourist       Tyler, Anne               15 
        5     The Handmaid's Tale          Atwood, Margaret          72 
        6     The Milagro Beanfield War    Nichols, John            335 
        7     The Stranger                 Camus, Albert            167 

Previous Page | Next Page | Top of Page