COMPARE Procedure

Example 5: Comparing Observations with an ID Variable


Features: ID statement

Data sets: PROCLIB.EMP95, PROCLIB.EMP96



In this example, PROC COMPARE compares only the observations that have matching values for the ID variable.


libname proclib 'SAS-library';
options nodate pageno=1 linesize=80 pagesize=40;
data proclib.emp95;
   input #1 idnum $4. @6 name $15.
         #2 address $42.
         #3 salary 6.;
   datalines;
2388 James Schmidt
100 Apt. C Blount St. SW Raleigh NC 27693
2457 Fred Williams
99 West Lane  Garner NC 27509
... more data lines...
3888 Kim Siu
5662 Magnolia Blvd Southeast Cary NC 27513
;

data proclib.emp96;
   input #1 idnum $4. @6 name $15.
         #2 address $42.
         #3 salary 6.;
   datalines;
2388 James Schmidt
100 Apt. C Blount St. SW Raleigh NC 27693
2457 Fred Williams
99 West Lane  Garner NC 27509
... more data lines...
6544 Roger Monday
3004 Crepe Myrtle Court Raleigh NC 27604
;

proc sort data=proclib.emp95 out=emp95_byidnum;
   by idnum;
run;

proc sort data=proclib.emp96 out=emp96_byidnum;
   by idnum;
run;

proc compare base=emp95_byidnum compare=emp96_byidnum;
   id idnum;
   title 'Comparing Observations that Have Matching IDNUMs';
run;

Program Description

Declare the PROCLIB SAS library.
libname proclib 'SAS-library';
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=40;
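Create the PROCLIB.EMP95 and PROCLIB.EMP96 data sets. Both data sets contain employee data. The INPUT statement reads each employee from three input lines: IDNUM and NAME from the first, ADDRESS from the second, and SALARY from the third. IDNUM works well as an ID variable because it has unique values. The first DATA step creates PROCLIB.EMP95. The second DATA step creates PROCLIB.EMP96.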
data proclib.emp95;
   input #1 idnum $4. @6 name $15.
         #2 address $42.
         #3 salary 6.;
   datalines;
2388 James Schmidt
100 Apt. C Blount St. SW Raleigh NC 27693
2457 Fred Williams
99 West Lane  Garner NC 27509
... more data lines...
3888 Kim Siu
5662 Magnolia Blvd Southeast Cary NC 27513
;

data proclib.emp96;
   input #1 idnum $4. @6 name $15.
         #2 address $42.
         #3 salary 6.;
   datalines;
2388 James Schmidt
100 Apt. C Blount St. SW Raleigh NC 27693
2457 Fred Williams
99 West Lane  Garner NC 27509
... more data lines...
6544 Roger Monday
3004 Crepe Myrtle Court Raleigh NC 27604
;
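The claim that IDNUM has unique values can be checked before IDNUM is used as an ID variable. The following is a minimal sketch, not part of the original example: it uses PROC SQL to list any IDNUM value that occurs more than once in either data set. The column name N_OBS and the titles are illustrative choices.

```sas
/* Sketch: list any IDNUM value that occurs more than once.        */
/* If a query returns no rows, IDNUM is unique in that data set.   */
/* N_OBS is a hypothetical column name chosen for this example.    */
proc sql;
   title 'Duplicate IDNUM Values in PROCLIB.EMP95';
   select idnum, count(*) as n_obs
      from proclib.emp95
      group by idnum
      having count(*) > 1;

   title 'Duplicate IDNUM Values in PROCLIB.EMP96';
   select idnum, count(*) as n_obs
      from proclib.emp96
      group by idnum
      having count(*) > 1;
quit;

/* Clear the title so that it does not carry over to later steps. */
title;
```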
Sort the data sets by the ID variable. Both data sets must be sorted by the variable that will be used as the ID variable in the PROC COMPARE step. OUT= names the data set that receives the sorted observations, so the original PROCLIB data sets are left unchanged.
proc sort data=proclib.emp95 out=emp95_byidnum;
   by idnum;
run;

proc sort data=proclib.emp96 out=emp96_byidnum;
   by idnum;
run;
Create a summary report that compares observations with matching values for the ID variable. The ID statement specifies IDNUM as the ID variable.
proc compare base=emp95_byidnum compare=emp96_byidnum;
   id idnum;
   title 'Comparing Observations that Have Matching IDNUMs';
run;
PROC COMPARE identifies specific observations by the value of IDNUM. In the Value Comparison Results for Variables section, PROC COMPARE prints the nonmatching addresses and nonmatching salaries. For salaries, PROC COMPARE computes the numerical difference and the percent difference. Because ADDRESS is a character variable, PROC COMPARE displays only the first 20 characters of each value. For the observations with an IDNUM of 0987, 2776, or 3888, the differences in the addresses occur after the 20th character, so they do not appear in the output. The plus sign in the output indicates that the full value is not shown. To see the entire value, create an output data set. See Comparing Values of Observations Using an Output Data Set (OUT=).
Part One of Comparing Observations that Have Matching IDNUMs
Part Two of Comparing Observations that Have Matching IDNUMs
Part Three of Comparing Observations that Have Matching IDNUMs
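Because the printed report truncates ADDRESS to 20 characters, one way to inspect the full values is to write the comparison results to an output data set, as the text above suggests. The following is a sketch under stated assumptions rather than part of this example: the data set name RESULT and the title are illustrative. OUT=, OUTNOEQUAL, OUTBASE, OUTCOMP, OUTDIF, and NOPRINT are standard PROC COMPARE options.

```sas
/* Sketch: write the comparison results to a data set, not a report. */
/* RESULT is a hypothetical data set name chosen for this example.   */
proc compare base=emp95_byidnum compare=emp96_byidnum
             out=result      /* create the output data set               */
             outnoequal      /* omit observations whose values all match */
             outbase         /* include the base observation             */
             outcomp         /* include the comparison observation       */
             outdif          /* include a difference observation         */
             noprint;        /* suppress the printed report              */
   id idnum;
run;

/* Print the full, untruncated values, grouped by IDNUM. */
proc print data=result noobs;
   by idnum;
   id idnum;
   title 'Full Values for Observations that Differ';
run;
```

In the difference observations, numeric variables hold the arithmetic difference. Character variables show a period where the two values match and an X where they differ, which makes differences beyond the 20th character of ADDRESS visible.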