Concatenating SAS Data Sets
If two data sets contain the same variables and the variables
possess the same attributes, then the file that results from concatenating
them with the SET statement is the same as the file that results from concatenating
them with the APPEND procedure. The APPEND procedure concatenates much faster
than the SET statement, particularly when the BASE= data set is large, because
the APPEND procedure does not process the observations from the BASE= data
set. However, the two methods behave quite differently when the variables
or their attributes differ between data sets. In this case,
you must consider the differences in behavior before you decide which method
to use.
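As a minimal sketch of the equivalent case described above, the following program concatenates two data sets that have identical variables and attributes, once with the SET statement and once with the APPEND procedure. The data set names (WORK.Q1, WORK.Q2, WORK.ALL_SET, WORK.ALL_APPEND) and the variables ID and SALES are hypothetical and chosen only for illustration.

```sas
/* Hypothetical data sets with identical variables and attributes */
data work.q1;
   input id sales;
   datalines;
1 100
2 200
;
run;

data work.q2;
   input id sales;
   datalines;
3 300
4 400
;
run;

/* Method 1: the SET statement reads every observation */
/* from both data sets and writes a new data set.       */
data work.all_set;
   set work.q1 work.q2;
run;

/* Method 2: copy Q1 so that the original is preserved, */
/* then add the observations from Q2 to the end of the  */
/* copy. PROC APPEND does not reread the BASE= data.    */
data work.all_append;
   set work.q1;
run;

proc append base=work.all_append data=work.q2;
run;

/* Compare the two results; with matching variables and */
/* attributes, no value differences are expected.       */
proc compare base=work.all_set compare=work.all_append;
run;
```

The copy step in Method 2 is only there to keep WORK.Q1 unchanged for the comparison. In practice, the APPEND procedure is typically run directly against the data set that you want to extend.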
The following table summarizes the major differences
between using the SET statement and using the APPEND procedure to concatenate
files.
Differences between the SET Statement and the APPEND Procedure
| Criterion | SET statement | APPEND procedure |
| --- | --- | --- |
| Number of data sets that you can concatenate | Uses any number of data sets. | Uses two data sets. |
| Handling of data sets that contain different variables | Uses all variables and assigns missing values where appropriate. | Uses all variables in the BASE= data set and assigns missing values to observations from the DATA= data set where appropriate. Requires the FORCE option to concatenate data sets if the DATA= data set contains variables that are not in the BASE= data set. Cannot include variables found only in the DATA= data set when concatenating the data sets. |
| Handling of different formats, informats, or labels | Uses explicitly defined formats, informats, and labels rather than defaults. If two or more data sets explicitly define the format, informat, or label, then SAS uses the definition from the data set you name first in the SET statement. | Uses formats, informats, and labels from the BASE= data set. |
| Handling of different variable lengths | If the same variable has a different length in two or more data sets, then SAS uses the length from the data set you name first in the SET statement. | Requires the FORCE option if the length of a variable is longer in the DATA= data set. Truncates the values of the variable to match the length in the BASE= data set. |
| Handling of different variable types | Does not concatenate the data sets. | Requires the FORCE option to concatenate data sets. Uses the type attribute from the BASE= data set and assigns missing values to the variable in observations from the DATA= data set. |
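The differences summarized in the table can be sketched with a small example in which the second data set has an extra variable and a longer character variable. The data set names (WORK.BASE, WORK.EXTRA, WORK.COMBINED_SET, WORK.COMBINED_APPEND) and the variables NAME, SCORE, and CITY are hypothetical and are not taken from the original documentation.

```sas
/* Hypothetical BASE= data set: NAME has a length of 5 */
data work.base;
   length name $ 5;
   input name $ score;
   datalines;
Ann 90
Bob 85
;
run;

/* Hypothetical DATA= data set: NAME is longer (10) and */
/* CITY does not exist in WORK.BASE.                    */
data work.extra;
   length name $ 10 city $ 8;
   input name $ score city $;
   datalines;
Christina 70 Oslo
;
run;

/* SET statement: keeps all variables, including CITY   */
/* (missing for observations from WORK.BASE). NAME gets */
/* its length from the first data set named, so values  */
/* from WORK.EXTRA are truncated to 5 characters.       */
data work.combined_set;
   set work.base work.extra;
run;

/* APPEND procedure: work on a copy so that WORK.BASE   */
/* is preserved. FORCE is required here because CITY is */
/* not in the BASE= data set and NAME is longer in the  */
/* DATA= data set. CITY is not added, and NAME values   */
/* are truncated to the BASE= length.                   */
data work.combined_append;
   set work.base;
run;

proc append base=work.combined_append data=work.extra force;
run;

/* Inspect the variables that each method produced */
proc contents data=work.combined_set;
run;

proc contents data=work.combined_append;
run;
```

To keep the full length of NAME with the SET statement, one option is to name WORK.EXTRA first or to place a LENGTH statement before the SET statement.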
Copyright © 2012 by SAS Institute Inc., Cary, NC, USA. All rights reserved.