Modifying SAS Data Sets |
How the DATA Step Processes Duplicate BY Variables |
When you use a BY statement with MODIFY, both the master and the transaction data sets can have observations with duplicate values of BY variables. Neither the master nor the transaction data set needs to be sorted, because BY-group processing uses dynamic WHERE processing to find an observation in the master data set.
The DATA step processes duplicate observations in the following ways:
If duplicate BY values exist in the master data set, then MODIFY applies the current transaction to the first occurrence in the master data set.
If duplicate BY values exist in the transaction data set, then the observations are applied one on top of another so that the values overwrite each other. The value in the last transaction is the final value in the master data set.
If both the master and the transaction data sets contain duplicate BY values, then MODIFY applies each transaction to the first occurrence in the group in the master data set.
The Program |
The program in this section updates the master data set INVENTORY_2 with observations from the transaction data set ADD_INVENTORY_2. Both data sets contain consecutive and nonconsecutive duplicate values of the BY variable PartNumber.
The following program creates the master data set INVENTORY_2. Note that the data set contains three observations for PartNumber M4J7.
data inventory_2; input PartNumber $ Description $ InStock @17 ReceivedDate date9. @27 Price; format ReceivedDate date9.; datalines; K89R seal 34 27jul1998 245.00 M4J7 sander 98 20jun1998 45.88 M4J7 sander 98 20jun1998 45.88 LK43 filter 121 19may1999 10.99 MN21 brace 43 10aug1999 27.87 M4J7 sander 98 20jun1998 45.88 BC85 clamp 80 16aug1999 9.55 NCF3 valve 198 20mar1999 24.50 KJ66 cutter 6 18jun1999 19.77 ;
The following program creates the transaction data set ADD_INVENTORY_2, and then modifies the master data set INVENTORY_2. Note that the data set ADD_INVENTORY_2 contains three observations for PartNumber M4J7.
options pagesize=60 linesize=80 pageno=1 nodate; data add_inventory_2; input PartNumber $ Description $ NewStock; datalines; K89R abc 17 M4J7 def 72 M4J7 ghi 66 LK43 jkl 311 M4J7 mno 43 BC85 pqr 75 ; data inventory_2; modify inventory_2 add_inventory_2; by PartNumber; ReceivedDate=today(); InStock=InStock+NewStock; run; proc print data=inventory_2; title "Tool Warehouse Inventory"; run;
The following output shows the results:
The Updated INVENTORY_2 Data Set: Duplicate BY Variables
Tool Warehouse Inventory 1 Part In Received Obs Number Description Stock Date Price 1 K89R abc 51 22JAN2001 245.00 2 M4J7 mno 279 22JAN2001 45.88 3 M4J7 sander 98 20JUN1998 45.88 4 LK43 jkl 432 22JAN2001 10.99 5 MN21 brace 43 10AUG1999 27.87 6 M4J7 sander 98 20JUN1998 45.88 7 BC85 pqr 155 22JAN2001 9.55 8 NCF3 valve 198 20MAR1999 24.50 9 KJ66 cutter 6 18JUN1999 19.77
Copyright © 2012 by SAS Institute Inc., Cary, NC, USA. All rights reserved.