Previous Page | Next Page

Modifying SAS Data Sets

Understanding How Duplicate BY Variables Affect File Update


How the DATA Step Processes Duplicate BY Variables

When you use a BY statement with MODIFY, both the master and the transaction data sets can have observations with duplicate values of BY variables. Neither the master nor the transaction data set needs to be sorted, because BY-group processing uses dynamic WHERE processing to find an observation in the master data set.

The DATA step processes duplicate observations in the following ways:


The Program

The program in this section updates the master data set INVENTORY_2 with observations from the transaction data set ADD_INVENTORY_2. Both data sets contain consecutive and nonconsecutive duplicate values of the BY variable PartNumber.

The following program creates the master data set INVENTORY_2. Note that the data set contains three observations for PartNumber M4J7.

data inventory_2;
   input PartNumber $ Description $ InStock @17 
         ReceivedDate date9. @27 Price;
   format  ReceivedDate date9.;
   datalines;
K89R seal   34  27jul1998 245.00
M4J7 sander 98  20jun1998 45.88
M4J7 sander 98  20jun1998 45.88
LK43 filter 121 19may1999 10.99
MN21 brace 43   10aug1999 27.87
M4J7 sander 98  20jun1998 45.88
BC85 clamp 80   16aug1999 9.55
NCF3 valve 198  20mar1999 24.50
KJ66 cutter 6   18jun1999 19.77
;

The following program creates the transaction data set ADD_INVENTORY_2, and then modifies the master data set INVENTORY_2. Note that the data set ADD_INVENTORY_2 contains three observations for PartNumber M4J7.

options pagesize=60 linesize=80 pageno=1 nodate;

data add_inventory_2;
   input PartNumber $ Description $ NewStock;
   datalines;
K89R abc 17
M4J7 def 72
M4J7 ghi 66
LK43 jkl 311 
M4J7 mno 43
BC85 pqr 75
;

data inventory_2;
   modify inventory_2 add_inventory_2;
   by PartNumber;
   ReceivedDate=today(); 
   InStock=InStock+NewStock; 
run;

proc print data=inventory_2;
   title "Tool Warehouse Inventory";
run;

The following output shows the results:

The Updated INVENTORY_2 Data Set: Duplicate BY Variables

                            Tool Warehouse Inventory                           1

                  Part                      In      Received
          Obs    Number    Description    Stock         Date     Price

           1      K89R       abc            51     22JAN2001    245.00
           2      M4J7       mno           279     22JAN2001     45.88
           3      M4J7       sander         98     20JUN1998     45.88
           4      LK43       jkl           432     22JAN2001     10.99
           5      MN21       brace          43     10AUG1999     27.87
           6      M4J7       sander         98     20JUN1998     45.88
           7      BC85       pqr           155     22JAN2001      9.55
           8      NCF3       valve         198     20MAR1999     24.50
           9      KJ66       cutter          6     18JUN1999     19.77

Previous Page | Next Page | Top of Page