COMPARE Procedure

Example 6: Comparing Values of Observations Using an Output Data Set (OUT=)

Features:
PROC COMPARE statement options:
NOPRINT
OUT=
OUTBASE
OUTCOMP
OUTDIF
OUTNOEQUAL
Other features:

PRINT procedure

Data sets: PROCLIB.EMP95

PROCLIB.EMP96

Details

This example creates and prints an output data set that shows the differences between matching observations.
In the example Comparing Observations with an ID Variable, the procedure output does not show differences past the 20th character of a value. The output data set in this example shows the full values. It also shows the observations that occur in only one of the data sets.

Program

libname proclib 'SAS-library';
options nodate pageno=1 linesize=120 pagesize=40;
proc sort data=proclib.emp95 out=emp95_byidnum;
   by idnum;
run;

proc sort data=proclib.emp96 out=emp96_byidnum;
   by idnum;
run;
proc compare base=emp95_byidnum compare=emp96_byidnum
             out=result outnoequal outbase outcomp outdif
             noprint;
   id idnum;
run;
proc print data=result noobs;
   by idnum;
   id idnum;
   title 'The Output Data Set RESULT';
run;
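
The contents of PROCLIB.EMP95 and PROCLIB.EMP96 are not listed in this section, so the following sketch applies the same technique to two small made-up data sets. The data set names (BASE_DEMO, COMP_DEMO, DEMO_RESULT), the variables NAME and SALARY, and all values are hypothetical and chosen only for illustration. The data lines are already in IDNUM order, so no PROC SORT step is needed here.

```sas
/* Hypothetical stand-in for the base data set */
data work.base_demo;
   input idnum $ name $ salary;
   datalines;
1001 Smith 40000
1002 Jones 52000
1003 Brown 61000
;
run;

/* Hypothetical stand-in for the comparison data set:      */
/* 1002 differs in NAME and SALARY, 1003 is missing,       */
/* and 1004 exists only in this data set                   */
data work.comp_demo;
   input idnum $ name $ salary;
   datalines;
1001 Smith 40000
1002 Jonas 53500
1004 White 45000
;
run;

/* Same options as the example program, applied to the demo data */
proc compare base=work.base_demo compare=work.comp_demo
             out=work.demo_result outnoequal outbase outcomp outdif
             noprint;
   id idnum;
run;

proc print data=work.demo_result noobs;
   by idnum;
   id idnum;
   title 'Demo Output Data Set';
run;
```

Because OUTNOEQUAL is specified, the identical observations for IDNUM 1001 should not appear in DEMO_RESULT, while IDNUM 1002 should contribute a base, a comparison, and a difference observation.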

Program Description

Declare the PROCLIB SAS library.
libname proclib 'SAS-library';
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=120 pagesize=40;
Sort the data sets by the ID variable. Both data sets must be sorted by the variable that will be used as the ID variable in the PROC COMPARE step. OUT= names the data set that receives the sorted observations, so the original data sets in PROCLIB are left unchanged.
proc sort data=proclib.emp95 out=emp95_byidnum;
   by idnum;
run;

proc sort data=proclib.emp96 out=emp96_byidnum;
   by idnum;
run;
Specify the data sets to compare. BASE= names the base data set, and COMPARE= names the comparison data set.
proc compare base=emp95_byidnum compare=emp96_byidnum
Create the output data set RESULT and include all unequal observations and their differences. OUT= names and creates the output data set. NOPRINT suppresses the printing of the procedure output. OUTNOEQUAL includes only observations that are judged unequal. OUTBASE writes an observation to the output data set for each observation in the base data set. OUTCOMP writes an observation to the output data set for each observation in the comparison data set. OUTDIF writes an observation to the output data set that contains the differences between the two observations.
             out=result outnoequal outbase outcomp outdif
             noprint;
Specify the ID variable. The ID statement specifies IDNUM as the ID variable.
   id idnum;
run;
Print the output data set RESULT and use the BY and ID statements with the ID variable. PROC PRINT prints the output data set. Using the BY and ID statements with the same variable makes the output easy to read. See the PRINT procedure for more information about this technique.
proc print data=result noobs;
   by idnum;
   id idnum;
   title 'The Output Data Set RESULT';
run;
The differences for character variables are noted with an X or a period (.). An X shows that the characters do not match. A period shows that the characters do match. For numeric variables, an E means that there is no difference. Otherwise, the numeric difference is shown. By default, the output data set shows that two observations in the comparison data set have no matching observation in the base data set. You do not have to use an option to make those observations appear in the output data set.
Part One of The Output Data Set RESULT
Part Two of The Output Data Set RESULT
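
The output data set that PROC COMPARE creates also contains the automatic variables _TYPE_ and _OBS_. _TYPE_ identifies the source of each observation (for example, BASE, COMPARE, or DIF). As a minimal sketch, assuming the RESULT data set from the program above exists in the WORK library, the following step lists only the difference observations. The title text is an illustrative choice.

```sas
/* List only the difference rows that OUTDIF wrote to RESULT */
proc print data=result noobs;
   where _type_ = 'DIF';
   id idnum;
   title 'Difference Observations Only';
run;
```

Filtering on _TYPE_ in this way can make a long result easier to scan when only the X markers and numeric differences are of interest.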