![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
In a one-to-many merge with common variables that are not the BY variables, customers sometimes expect that the value for the common variable from the "one" data set will be retained throughout the BY group if the "one" data set is listed second in the MERGE statement. It is correct that in a one-to-one merge, and for the first matching observation in a one-to-many merge, the value of a common variable in the latter data set(s) overwrites the value from the previous data set. However, on subsequent iterations of the MERGE statement for the same BY group, the "one" data set is not read again. Therefore, the resulting output contains the value of the variable from the "many" data set.
Consider the following two data sets in which we want to use the NAMES data set (the "one" data set since it has one observation per ID) to correct the spelling of the names in the SCORES data set (which is the "many" data set since it has multiple observations per ID.)
SCORES NAMES id name score id name 1 Jo 100 1 Joanna 1 Jo 99 2 Jon 1 Jo 90 2 John 89 2 John 92
Performing a merge by ID as illustrated in the following statements generates the results shown below:
merge scores names; by id; Obs id name score 1 1 Joanna 100 2 1 Jo 99 3 1 Jo 90 4 2 Jon 89 5 2 John 92
As you can see, NAME is not updated for all observations with the same ID. When SAS compiles the program, it creates the Program Data Vector (PDV), which lists all of the variables and contains their values throughout the DATA step. During the merge, SAS first writes "Jo" into the variable NAME since it reads the data set SCORES first. Then the value is overwritten by the value "Joanna" from NAMES since that data set comes later in the MERGE statement. At the end of the first iteration of the merge, SAS outputs the current values of the PDV, which includes "Joanna" for NAME.
On the next iteration of the DATA step, SAS reads the second observation from the SCORES data set. It writes "Jo" into the PDV for NAME. The data set NAMES is not read again since there are no more observations for that BY group. Thus, at the end of the second iteration of the DATA step, the value "Jo" is output for NAME.
To replace NAME for all observations within the BY group, the variable from the "many" data set (SCORES) should be dropped or renamed. In that case, SAS reads the value from the "one" data set (NAMES) on the first iteration of the merge. Since variables that are read from data sets are automatically retained throughout the BY group, all observations for that BY group have the value from the "one" data set.
Click the Full Code tab to see complete examples of merging with and without common variables.
For more information about the PDV, DATA step processing, and combining data sets, see SAS Language Reference: Concepts.
Missing values in a one-to-many merge behave the same way as nonmissing values. For more information, see Usage Note 41064: "Missing values in common variables in a one-to-many MERGE can overwrite nonmissing values."
Product Family | Product | System | SAS Release | |
Reported | Fixed* | |||
SAS System | Base SAS | z/OS | ||
Z64 | ||||
OpenVMS VAX | ||||
Microsoft® Windows® for 64-Bit Itanium-based Systems | ||||
Microsoft Windows Server 2003 Datacenter 64-bit Edition | ||||
Microsoft Windows Server 2003 Enterprise 64-bit Edition | ||||
Microsoft Windows XP 64-bit Edition | ||||
Microsoft® Windows® for x64 | ||||
OS/2 | ||||
Microsoft Windows 8 Pro | ||||
Microsoft Windows 95/98 | ||||
Microsoft Windows 2000 Advanced Server | ||||
Microsoft Windows 2000 Datacenter Server | ||||
Microsoft Windows 2000 Server | ||||
Microsoft Windows 2000 Professional | ||||
Microsoft Windows NT Workstation | ||||
Microsoft Windows Server 2003 Datacenter Edition | ||||
Microsoft Windows Server 2003 Enterprise Edition | ||||
Microsoft Windows Server 2003 Standard Edition | ||||
Microsoft Windows Server 2003 for x64 | ||||
Microsoft Windows Server 2008 | ||||
Microsoft Windows Server 2008 for x64 | ||||
Microsoft Windows Server 2012 | ||||
Microsoft Windows XP Professional | ||||
Windows 7 Enterprise 32 bit | ||||
Windows 7 Enterprise x64 | ||||
Windows 7 Home Premium 32 bit | ||||
Windows 7 Home Premium x64 | ||||
Windows 7 Professional 32 bit | ||||
Windows 7 Professional x64 | ||||
Windows 7 Ultimate 32 bit | ||||
Windows 7 Ultimate x64 | ||||
Windows Millennium Edition (Me) | ||||
Windows Vista | ||||
Windows Vista for x64 | ||||
64-bit Enabled AIX | ||||
64-bit Enabled HP-UX | ||||
64-bit Enabled Solaris | ||||
AIX | ||||
HP-UX | ||||
HP-UX IPF | ||||
IRIX | ||||
Linux | ||||
Linux for x64 | ||||
Linux on Itanium | ||||
OpenVMS Alpha | ||||
OpenVMS on HP Integrity | ||||
Solaris | ||||
Solaris for x64 | ||||
Tru64 UNIX |