Usage Note 48706: When the MERGE statement returns more observations than expected: Debugging possible causes

If your MERGE statement returns more observations than you expect, check these possible causes and take the corresponding action(s) before performing the MERGE again.

One of the character BY variables has leading blanks.

The values "  Smith" and "Smith" do not match.
To eliminate leading blanks so that the values match, use the STRIP function on the variable in each data set.
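As a minimal sketch of this fix, assume a hypothetical data set CUSTOMERS with a character BY variable LASTNAME (both names are illustrative and do not come from this note). The same step would be repeated for each data set in the merge.

```sas
/* CUSTOMERS and LASTNAME are hypothetical names used for illustration */
data customers_clean;
   set customers;
   /* STRIP removes leading and trailing blanks, so "  Smith" becomes "Smith" */
   lastname = strip(lastname);
run;
```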

Character BY variables have mismatching capitalization.

The values "SMITH" and "Smith" do not match.
To eliminate this issue, use one of the case functions (UPCASE, PROPCASE, LOWCASE) on the variable in each data set to make them the same case.
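A short sketch of the case fix, again assuming a hypothetical data set CUSTOMERS and BY variable LASTNAME. UPCASE is used here only as an example; any of the three case functions works as long as the same one is applied to every data set.

```sas
/* CUSTOMERS and LASTNAME are hypothetical names used for illustration */
data customers_clean;
   set customers;
   /* "Smith" and "SMITH" both become "SMITH" */
   lastname = upcase(lastname);
run;
```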

Character BY variables have a mismatching number of blank spaces within the values.

The values "John   Smith" and "John Smith" do not match.
To eliminate this issue, use the COMPBL function to compress multiple blank spaces into one blank space.
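The following sketch shows COMPBL applied to a hypothetical BY variable FULLNAME in a hypothetical data set CUSTOMERS. As with the other fixes, it would be applied to each data set before merging.

```sas
/* CUSTOMERS and FULLNAME are hypothetical names used for illustration */
data customers_clean;
   set customers;
   /* "John   Smith" becomes "John Smith" */
   fullname = compbl(fullname);
run;
```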

Note: All of these character functions can be nested and performed in one DATA step.
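One possible way to nest the three functions and then merge is sketched below. The data set names ONE and TWO and the BY variable NAME are assumptions made for this example, and UPCASE is an arbitrary choice among the case functions.

```sas
/* ONE, TWO, and NAME are hypothetical names used for illustration */
data one_clean;
   set one;
   /* Remove leading blanks, collapse repeated blanks, then uppercase */
   name = upcase(compbl(strip(name)));
run;

data two_clean;
   set two;
   name = upcase(compbl(strip(name)));
run;

/* MERGE with a BY statement requires both inputs to be sorted by the BY variable */
proc sort data=one_clean; by name; run;
proc sort data=two_clean; by name; run;

data merged;
   merge one_clean two_clean;
   by name;
run;
```

Applying the identical expression to both data sets is the point of the sketch: the BY values match only if they are normalized the same way on each side.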

Numeric BY variables do not match.

If computations have been done on numbers, including SAS dates and SAS times, the numbers might not match exactly because of numeric precision issues. To eliminate this issue, use the ROUND function, with the same rounding unit, on the numeric BY variable in each data set.
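A minimal sketch of the rounding fix follows. The data set names ONE and TWO, the BY variable AMOUNT, and the rounding unit 0.01 are all assumptions for illustration; choose the unit that suits the precision of your own data.

```sas
/* ONE, TWO, and AMOUNT are hypothetical; 0.01 is an assumed rounding unit */
data one_clean;
   set one;
   amount = round(amount, 0.01);   /* round to the nearest hundredth */
run;

data two_clean;
   set two;
   amount = round(amount, 0.01);   /* same unit as in ONE_CLEAN */
run;
```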

Unknowingly doing a one-to-many merge.

In a one-to-many merge, each BY group in the resulting data set has as many observations as that BY group has in the "many" data set. If the duplicates in one of the data sets are unintended, you can use PROC SORT to identify the duplicates and then eliminate them.
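One way to do this with PROC SORT is sketched below, assuming a hypothetical data set LOOKUP that is expected to have one observation per value of a hypothetical BY variable ID. The NODUPKEY option keeps the first observation for each BY value, and DUPOUT= writes the discarded observations to a separate data set so that they can be reviewed.

```sas
/* LOOKUP, LOOKUP_UNIQUE, LOOKUP_DUPS, and ID are hypothetical names */
proc sort data=lookup
          out=lookup_unique    /* one observation per ID value */
          dupout=lookup_dups   /* observations removed as duplicates */
          nodupkey;
   by id;
run;
```

Inspecting LOOKUP_DUPS before merging is worthwhile, because which observation PROC SORT keeps might not be the one you want.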

Expecting the IN= data set option to be reset on each observation.

This option indicates whether a data set has contributed to the merge for each BY group, not for each observation. For more information about using the IN= option, see SAS Note 24681: "Merge data sets that have an uneven number of members in the BY groups and reset the remaining observations to 'missing'."
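A commonly used pattern for getting per-observation behavior is to set the IN= variables to 0 at the top of the DATA step, so that each one is 1 only when its data set supplies a new observation in that iteration. The sketch below assumes hypothetical data sets ONE and TWO, both sorted by a hypothetical BY variable ID. It is an illustration of the idea and is not taken from this note.

```sas
/* ONE, TWO, and ID are hypothetical names; both data sets are sorted by ID */
data merged;
   /* Reset the IN= variables at the start of every iteration */
   in_one = 0;
   in_two = 0;
   merge one(in=in_one) two(in=in_two);
   by id;
   /* IN= variables are not written to the output data set, so copy them */
   from_one = in_one;
   from_two = in_two;
run;
```

Without the two assignment statements, IN_ONE would remain 1 for every observation in a BY group to which ONE contributed, even after ONE has no more observations in that group.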



Operating System and Release Information

Product Family: SAS System
Product: Base SAS
SAS Release: Reported, Fixed* (no values listed)
System:
Aster Data nCluster on Linux x64
DB2 Universal Database on AIX
DB2 Universal Database on Linux x64
Greenplum on Linux x64
Netezza TwinFin 32bit blade
Netezza TwinFin 32-bit SMP Hosts
Netezza TwinFin 64-bit S-Blades
Netezza TwinFin 64-bit SMP Hosts
Teradata on Linux
z/OS
Z64
OpenVMS VAX
Microsoft® Windows® for 64-Bit Itanium-based Systems
Microsoft Windows Server 2003 Datacenter 64-bit Edition
Microsoft Windows Server 2003 Enterprise 64-bit Edition
Microsoft Windows XP 64-bit Edition
Microsoft® Windows® for x64
OS/2
Microsoft Windows 8 Pro
Microsoft Windows 95/98
Microsoft Windows 2000 Advanced Server
Microsoft Windows 2000 Datacenter Server
Microsoft Windows 2000 Server
Microsoft Windows 2000 Professional
Microsoft Windows NT Workstation
Microsoft Windows Server 2003 Datacenter Edition
Microsoft Windows Server 2003 Enterprise Edition
Microsoft Windows Server 2003 Standard Edition
Microsoft Windows Server 2003 for x64
Microsoft Windows Server 2008
Microsoft Windows Server 2008 for x64
Microsoft Windows Server 2012
Microsoft Windows XP Professional
Windows 7 Enterprise 32 bit
Windows 7 Enterprise x64
Windows 7 Home Premium 32 bit
Windows 7 Home Premium x64
Windows 7 Professional 32 bit
Windows 7 Professional x64
Windows 7 Ultimate 32 bit
Windows 7 Ultimate x64
Windows Millennium Edition (Me)
Windows Vista
Windows Vista for x64
64-bit Enabled AIX
64-bit Enabled HP-UX
64-bit Enabled Solaris
ABI+ for Intel Architecture
AIX
HP-UX
HP-UX IPF
IRIX
Linux
Linux for x64
Linux on Itanium
OpenVMS Alpha
OpenVMS on HP Integrity
Solaris
Solaris for x64
Tru64 UNIX
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.