Sample 24752: Perform a many-to-many merge based on common variables
The sample code on the Full Code tab illustrates how to combine multiple data sets based on common variables when duplicates exist in more than one data set.
Note:
In this scenario, a NOTE like the following will be issued in the SAS® log:
NOTE: MERGE statement has more than one data set with repeats of BY values.
This message is intended to be informational in nature.
A DATA step that performs a many-to-many merge does not produce a Cartesian product across a BY-group. Understanding the nature of a many-to-many merge will ensure that you choose the correct method to combine your data to obtain the desired results.
These sample files and code examples are provided by SAS Institute
Inc. "as is" without warranty of any kind, either express or implied, including
but not limited to the implied warranties of merchantability and fitness for a
particular purpose. Recipients acknowledge and agree that SAS Institute shall
not be liable for any damages whatsoever arising out of their use of this material.
In addition, SAS Institute will provide no support for the materials contained herein.
/********************************************************************************/
/* WORK.ONE has duplicates for A and C. WORK.TWO has duplicates for B and C. */
/* If your data is not in sorted order, a PROC SORT or index on the BY */
/* variables would be necessary to perform the many-to-many merge. */
/* */
/* Note, when a BY statement is used with the MERGE statement, variables do not */
/* reinitialize to missing until the BY-Group changes. */
/********************************************************************************/
data one;
input id $ fruit $12.;
datalines;
a apple
a apricot
b banana
c coconut
c cherry
c crabapple
;
data two;
input id $ color $;
datalines;
a amber
b brown
b black
c cocoa
c cream
;
/* Note the different values for COLOR for BY-Group 'c'. The value */
/* 'cocoa' is overwritten in the PDV with 'cream' when the second */
/* observation in the BY-Group is read from WORK.TWO and carries */
/* down the rest of the BY-Group. */
data both;
merge one two;
by id;
run;
proc print data=both;
run;
These sample files and code examples are provided by SAS Institute
Inc. "as is" without warranty of any kind, either express or implied, including
but not limited to the implied warranties of merchantability and fitness for a
particular purpose. Recipients acknowledge and agree that SAS Institute shall
not be liable for any damages whatsoever arising out of their use of this material.
In addition, SAS Institute will provide no support for the materials contained herein.
Obs id fruit color
1 a apple amber
2 a apricot amber
3 b banana brown
4 b banana black
5 c coconut cocoa
6 c cherry cream
7 c crabapple cream
This sample illustrates how to combine multiple data sets based on common variables when duplicates exist in more than one data set.
Type: | Sample |
Topic: | SAS Reference ==> DATA Step SAS Reference ==> Statements ==> File-handling ==> MERGE Common Programming Tasks ==> Combining Data SAS Reference ==> Statements ==> File-handling ==> MERGE ==> with BY
|
Date Modified: | 2010-02-15 14:35:28 |
Date Created: | 2004-09-30 14:09:10 |
Operating System and Release Information
SAS System | Base SAS | All | n/a | n/a |