
Statements

MERGE Statement



Joins observations from two or more SAS data sets into a single observation.
Valid: in a DATA step
Category: File-handling
Type: Executable

Syntax
Arguments
Details
Overview
Using Data Set Lists with MERGE
One-to-One Merging
Match-Merging
Comparisons
Examples
Example 1: One-to-One Merging
Example 2: Match-Merging
Example 3: Merging with a Data Set List
See Also

Syntax

MERGE SAS-data-set-1 <(data-set-options)>
      SAS-data-set-2 <(data-set-options)>
      <...SAS-data-set-n <(data-set-options)>>
      <END=variable>;


Arguments

SAS-data-set

specifies at least two existing SAS data sets from which observations are read. You can specify individual data sets, data set lists, or a combination of both.

Tip: You can specify additional SAS data sets.
See: Using Data Set Lists with MERGE
(data-set-options)

specifies one or more SAS data set options in parentheses after a SAS data set name.

Explanation: The data set options specify actions that SAS is to take when it reads observations into the DATA step for processing. For a list of data set options, see Data Set Options by Category.
Tip: Data set options that apply to a data set list apply to all of the data sets in the list.
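As a minimal sketch of data set options on MERGE, the following program applies KEEP=, RENAME=, and IN= to two small data sets. The data set names, variable names (ONHAND, BIN, QTY, ORDERED, IN_STOCK, IN_ORDERS), and values are hypothetical and serve only as an illustration.

```sas
/* Hypothetical sample data, already in PARTNUM order. */
data stock;
   input partnum onhand bin $;
   datalines;
101 12 A1
102 0 B4
103 7 C2
;
run;

data orders;
   input partnum qty;
   datalines;
101 5
103 2
104 9
;
run;

data inventry;
   /* KEEP= limits the variables read from STOCK (BIN is not read).   */
   /* RENAME= changes QTY to ORDERED as ORDERS is read.               */
   /* IN= creates temporary flags that show which data set            */
   /* contributed to the current observation.                         */
   merge stock  (keep=partnum onhand in=in_stock)
         orders (rename=(qty=ordered) in=in_orders);
   by partnum;
   /* Keep only part numbers that are present in both data sets. */
   if in_stock and in_orders;
run;
```

With this sample data, only part numbers 101 and 103 appear in both inputs, so those are the observations that the subsetting IF statement keeps. Like the END= variable, the IN= variables are temporary and are not written to the output data set.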
END=variable

names and creates a temporary variable that contains an end-of-file indicator.

Explanation: The variable, which is initialized to 0, is set to 1 when the MERGE statement processes the last observation. If the input data sets have different numbers of observations, the END= variable is set to 1 when MERGE processes the last observation from all data sets.
Tip: The END= variable is not added to any SAS data set that is being created.
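The END= behavior described above can be sketched as follows. The data sets JAN and FEB, their variables, and the variable names LAST and N are assumptions made for this illustration.

```sas
/* Hypothetical sample data with different numbers of observations. */
data jan;
   input id amt1;
   datalines;
1 10
2 20
3 30
;
run;

data feb;
   input id amt2;
   datalines;
1 5
2 15
;
run;

data _null_;
   /* LAST stays 0 until MERGE has processed the final observation */
   /* from every input data set, and is then set to 1.             */
   merge jan feb end=last;
   by id;
   n + 1;   /* sum statement: counts DATA step iterations */
   if last then put 'Merged observations processed: ' n;
run;
```

Because JAN is longer than FEB, LAST is not set when FEB runs out of observations. It is set only on the iteration that reads the final observation of JAN.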

Details


Overview

The MERGE statement is flexible and has a variety of uses in SAS programming. This section describes basic uses of MERGE. Other applications include using more than one BY variable, merging more than two data sets, and merging a few observations with all observations in another data set.

For more information, see How to Prepare Your Data Sets in SAS Language Reference: Concepts.


Using Data Set Lists with MERGE

You can use data set lists with the MERGE statement. Data set lists provide a quick way to reference existing groups of data sets. These data set lists must be either name prefix lists or numbered range lists.

Name prefix lists refer to all data sets that begin with a specified character string. For example, merge SALES1:; tells SAS to merge all data sets starting with "SALES1" such as SALES1, SALES10, SALES11, and SALES12.

Numbered range lists require you to have a series of data sets with the same name, except for the last character or characters, which are consecutive numbers. In a numbered range list, you can begin with any number and end with any number. For example, these lists refer to the same data sets:

sales1 sales2 sales3 sales4

sales1-sales4

Note:   If the numeric suffix of the first data set name contains leading zeros, the number of digits in the numeric suffix of the last data set name must be greater than or equal to the number of digits in the numeric suffix of the first data set name. Otherwise, an error occurs. For example, the data set lists sales001-sales99 and sales01-sales9 cause an error, whereas the data set list sales001-sales999 is valid. If the numeric suffix of the first data set name does not contain leading zeros, the numeric suffixes of the first and last data set names do not need to have the same number of digits. For example, the data set list sales1-sales999 is valid.
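The two kinds of data set lists can be sketched side by side. The data sets SALES1 through SALES4 and their variables (REP, Q1 through Q4) are hypothetical, and the sketch assumes that no other data sets in the WORK library have names that begin with SALES.

```sas
/* Hypothetical one-observation data sets that share the BY variable REP. */
data sales1; rep=1; q1=100; run;
data sales2; rep=1; q2=120; run;
data sales3; rep=1; q3=90;  run;
data sales4; rep=1; q4=140; run;

/* Numbered range list: same as MERGE SALES1 SALES2 SALES3 SALES4 */
data year_range;
   merge sales1-sales4;
   by rep;
run;

/* Name prefix list: every data set whose name begins with SALES */
data year_prefix;
   merge sales:;
   by rep;
run;
```

A name prefix list matches every existing data set with that prefix in the library, so it can pick up more data sets than intended. A numbered range list names its members explicitly and is the more predictable choice when the suffixes are consecutive.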

Some other rules to consider when using numbered data set lists are as follows:


One-to-One Merging

One-to-one merging combines observations from two or more SAS data sets into a single observation in a new data set. To perform a one-to-one merge, use the MERGE statement without a BY statement. SAS combines the first observation from all data sets that are named in the MERGE statement into the first observation in the new data set, the second observation from all data sets into the second observation in the new data set, and so on. In a one-to-one merge, the number of observations in the new data set is equal to the number of observations in the largest data set named in the MERGE statement. See Example 1 for an example of a one-to-one merge. For more information, see Reading, Combining, and Modifying SAS Data Sets in SAS Language Reference: Concepts.

CAUTION:
Use care when you combine data sets with a one-to-one merge.

One-to-one merges can produce undesirable results, for example when the data sets have different numbers of observations or share variable names. Test your program on representative samples of the data sets before you use this method.
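One way to see why the caution matters is to merge two small data sets that share a variable and have different numbers of observations. The data sets A and B and their variables are hypothetical.

```sas
/* Hypothetical sample data: NAME is common to both data sets. */
data a;
   input name $ x;
   datalines;
Ann 1
Bob 2
Cy 3
;
run;

data b;
   input name $ y;
   datalines;
Dee 10
Eve 20
;
run;

/* No BY statement: observations are paired by position only. */
data both;
   merge a b;
run;

proc print data=both;
run;
```

The output has three observations, one for each observation in A, the larger data set. In the first two observations, NAME holds the value read from B, because a variable that is common to several data sets takes its value from the data set that is read last. The values of X from Ann and Bob are therefore paired with the names Dee and Eve, which is rarely what is intended.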


Match-Merging

Match-merging combines observations from two or more SAS data sets into a single observation in a new data set according to the values of a common variable. The number of observations in the new data set is the sum of the largest number of observations in each BY group in all data sets. To perform a match-merge, use a BY statement immediately after the MERGE statement. The variables in the BY statement must be common to all data sets. Only one BY statement can accompany each MERGE statement in a DATA step. The data sets that are listed in the MERGE statement must be sorted in order of the values of the variables that are listed in the BY statement, or they must have an appropriate index. See Example 2 for an example of a match-merge. For more information, see Reading, Combining, and Modifying SAS Data Sets in SAS Language Reference: Concepts.
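The sorting requirement can be met with PROC SORT before the DATA step, as in this minimal sketch. The data sets, variables, and values are hypothetical.

```sas
/* Hypothetical sample data, deliberately not in PARTNUM order. */
data stock;
   input partnum onhand;
   datalines;
103 7
101 12
102 0
;
run;

data orders;
   input partnum qty;
   datalines;
104 9
101 5
103 2
;
run;

/* Sort each input data set by the BY variable. */
proc sort data=stock;
   by partnum;
run;

proc sort data=orders;
   by partnum;
run;

/* Match-merge: the BY statement immediately follows MERGE. */
data inventry;
   merge stock orders;
   by partnum;
run;
```

Each of the four part numbers (101 through 104) occurs at most once in each data set, so the largest count in every BY group is 1 and INVENTRY contains four observations. Part numbers that appear in only one data set have missing values for the variables that come from the other.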

Note:   The MERGE statement does not produce a Cartesian product on a many-to-many match-merge. Instead, it performs a one-to-one merge while there are observations in the BY group in at least one data set. When all observations in the BY group have been read from one data set and there are still more observations in another data set, SAS performs a one-to-many merge until all observations have been read for the BY group.
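The many-to-many behavior in the preceding note can be observed with a small sketch. The data sets ONE and TWO and the variables KEY, A_VAL, and B_VAL are hypothetical.

```sas
/* Hypothetical sample data: the BY group KEY=1 has three         */
/* observations in ONE and two observations in TWO.               */
data one;
   input key a_val $;
   datalines;
1 a
1 b
1 c
;
run;

data two;
   input key b_val $;
   datalines;
1 x
1 y
;
run;

data many;
   merge one two;
   by key;
run;

proc print data=many;
run;
```

MANY contains three observations for KEY=1, not the six that a Cartesian product would give. The first two observations pair a with x and b with y. The third pairs c with y, because the values from the last observation that was read from TWO remain available for the rest of the BY group.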


Comparisons


Examples


Example 1: One-to-One Merging

This example shows how to combine observations from two data sets into a single observation in a new data set:

data benefits.qtr1;
   merge benefits.jan benefits.feb;
run;


Example 2: Match-Merging

This example shows how to combine observations from two data sets into a single observation in a new data set according to the values of a variable that is specified in the BY statement:

data inventry;
   merge stock orders;
   by partnum;
run;


Example 3: Merging with a Data Set List

This example uses a data set list to define the data sets that are merged.

data d008; job=3; emp=19; run;
data d009; job=3; sal=50; run;
data d010; job=4; emp=97; run;
data d011; job=4; sal=15; run;

data comb;
   merge d008-d011;
   by job;
run;

proc print data=comb;
run;


See Also

Statements:

BY Statement

MODIFY Statement

SET Statement

UPDATE Statement

Reading, Combining, and Modifying SAS Data Sets in SAS Language Reference: Concepts
