
Compares the contents of two SAS data sets, selected
variables in different data sets, or variables within the same data
set.

Restrictions:
- If you omit COMPARE=, then you must use the WITH and VAR statements.
- PROC COMPARE reports errors differently if one or both of the compared data sets are not RADIX addressable. Version 6 compressed files are not RADIX addressable, while, beginning with Version 7, compressed files are RADIX addressable. (The integrity of the data is not compromised; the procedure simply numbers the observations differently.)

Tip: You can use data set options with the BASE= and COMPARE= options.

Examples:
- Producing a Complete Report of the Differences
- Comparing Variables in Different Data Sets
- Comparing Variables That Are in the Same Data Set
- Comparing Values of Observations Using an Output Data Set (OUT=)

PROC COMPARE <option(s)>;
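
As a minimal sketch of the simplest call, the following step compares two whole data sets and prints the default reports. It assumes that the SASHELP.CLASS sample data set is available in your installation; WORK.CLASS_CHANGED is a hypothetical copy created only for illustration.

```sas
/* Hypothetical comparison data set: a copy of SASHELP.CLASS */
/* in which one value has been changed                        */
data work.class_changed;
   set sashelp.class;
   if name = 'Alfred' then height = height + 1;
run;

/* Compare the two data sets and print the default reports */
proc compare base=sashelp.class compare=work.class_changed;
run;
```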

Control the details in the default report

- ALLOBS: includes the values for all matching observations.
- ALLSTATS: prints a table of summary statistics for all pairs of matching variables.
- ALLVARS: includes in the report the values and differences for all matching variables.
- BRIEFSUMMARY: prints only a short comparison summary.
- FUZZ=: changes the report for numbers between 0 and 1.
- MAXPRINT=: restricts the number of differences to print.
- NODATE: suppresses the printing of creation and last-modified dates.
- NOPRINT: suppresses all printed output.
- NOSUMMARY: suppresses the data set, variable, observation, and values comparison summary reports.
- NOVALUES: suppresses the report of the value comparison results.
- PRINTALL: produces a complete listing of values and differences.
- STATS: prints a table of summary statistics for all pairs of matching numeric variables that are judged unequal.
- TRANSPOSE: prints the reports of value differences by observation instead of by variable.

Control the listing of variables and observations

- LISTALL: lists all variables and observations that are found in only one data set.
- LISTBASE: lists all variables and observations found only in the base data set.
- LISTBASEOBS: lists all observations found only in the base data set.
- LISTBASEVAR: lists all variables found only in the base data set.
- LISTCOMP: lists all variables and observations found only in the comparison data set.
- LISTCOMPOBS: lists all observations found only in the comparison data set.
- LISTCOMPVAR: lists all variables found only in the comparison data set.
- LISTEQUALVAR: lists variables whose values are judged equal.
- LISTOBS: lists all observations found in only one data set.
- LISTVAR: lists all variables found in only one data set.

Control the output data set

- OUT=: creates an output data set.
- OUTALL: writes an observation for each observation in the BASE= and COMPARE= data sets.
- OUTBASE: writes an observation for each observation in the base data set.
- OUTCOMP: writes an observation for each observation in the comparison data set.
- OUTDIF: writes an observation to the output data set for each pair of matching observations.
- OUTNOEQUAL: suppresses the writing of observations when all values are equal.
- OUTPERCENT: writes an observation to the output data set for each pair of matching observations.

Create an output data set that contains summary statistics

- OUTSTATS=: writes summary statistics for all pairs of matching variables to the specified SAS-data-set.

Display a warning message in the SAS log

- WARNING: displays a warning message in the SAS log when differences are found.

Display an error message in the SAS log

- ERROR: displays an error message in the SAS log when differences are found.

Specify how the values are compared

- CRITERION=: specifies the criterion for judging the equality of numeric values.
- METHOD=: specifies the method for judging the equality of numeric values.
- NOMISSBASE: judges a missing value in the base data set equal to any value.
- NOMISSCOMP: judges a missing value in the comparison data set equal to any value.
- NOMISSING: judges missing values in both the base and comparison data sets equal to any value.

Specify the data sets to compare

- BASE=: specifies the data set to use as the base data set.
- COMPARE=: specifies the data set to use as the comparison data set.

Write notes to the SAS log

- NOTE: displays notes in the SAS log that describe the results of the comparison.

- ALLOBS
- includes in the report of value comparison results the values and, for numeric variables, the differences for all matching observations, even if they are judged equal.
  Default: If you omit ALLOBS, then PROC COMPARE prints values only for observations that are judged unequal.
  Interaction: When used with the TRANSPOSE option, ALLOBS invokes the ALLVARS option and displays the values for all matching observations and variables.

- ALLSTATS
- prints a table of summary statistics for all pairs of matching variables.
  See: Table of Summary Statistics for information about the statistics produced.

- ALLVARS
- includes in the report of value comparison results the values and, for numeric variables, the differences for all pairs of matching variables, even if they are judged equal.
  Default: If you omit ALLVARS, then PROC COMPARE prints values only for variables that are judged unequal.
  Interaction: When used with the TRANSPOSE option, ALLVARS displays unequal values in context with the values for other matching variables. If you omit the TRANSPOSE option, then ALLVARS invokes the ALLOBS option and displays the values for all matching observations and variables.

- BASE=SAS-data-set
- specifies the data set to use as the base data set.
  Alias: DATA=
  Default: the most recently created SAS data set
  Tip: You can use the WHERE= data set option with the BASE= option to limit the observations that are available for comparison.
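
The tip about the WHERE= data set option can be sketched as follows. The example assumes that the SASHELP.CLASS sample data set is available; WORK.CLASS_CHANGED and the age cutoff are hypothetical choices made for illustration.

```sas
/* Hypothetical comparison data set with one changed value */
data work.class_changed;
   set sashelp.class;
   if name = 'Alfred' then height = height + 1;
run;

/* Only observations with AGE >= 14 are available for comparison. */
/* The same filter is applied to both data sets so that the       */
/* remaining observations still line up by position.              */
proc compare base=sashelp.class(where=(age >= 14))
             compare=work.class_changed(where=(age >= 14));
run;
```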

- BRIEFSUMMARY
- produces a short comparison summary and suppresses the four default summary reports (data set summary report, variables summary report, observation summary report, and values comparison summary report).
  Alias: BRIEF
  Tip: By default, a listing of value differences accompanies the summary reports. To suppress this listing, use the NOVALUES option.

- COMPARE=SAS-data-set
- specifies the data set to use as the comparison data set.
  Alias: COMP=, C=
  Default: If you omit COMPARE=, then the comparison data set is the same as the base data set, and PROC COMPARE compares variables within the data set.
  Restriction: If you omit COMPARE=, then you must use the WITH statement.
  Tip: You can use the WHERE= data set option with COMPARE= to limit the observations that are available for comparison.
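
When COMPARE= is omitted, the comparison happens within the base data set, so the VAR and WITH statements name the variables to pair up. A minimal sketch, assuming the SASHELP.CLASS sample data set is available (the choice of HEIGHT and WEIGHT is arbitrary and only for illustration):

```sas
/* No COMPARE= option: HEIGHT is compared with WEIGHT */
/* within the same data set                            */
proc compare base=sashelp.class;
   var height;
   with weight;
run;
```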

- CRITERION=γ
- specifies the criterion for judging the equality of numeric values. Normally, the value of γ (gamma) is positive. In that case, the number itself becomes the equality criterion. If you use a negative value for γ, then PROC COMPARE uses an equality criterion proportional to the precision of the computer on which SAS is running.
  Default: 0.00001

- ERROR
- displays an error message in the SAS log when differences are found.
  Interaction: This option overrides the WARNING option.
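
A short sketch of the ERROR option in use, for example in a batch job where a difference should stand out in the log. It assumes that the SASHELP.CLASS sample data set is available; WORK.CLASS_CHANGED is a hypothetical data set created for illustration.

```sas
/* Hypothetical comparison data set with one changed value */
data work.class_changed;
   set sashelp.class;
   if name = 'Alfred' then height = height + 1;
run;

/* Because the data sets differ, an error message is written */
/* to the SAS log in addition to the short printed summary   */
proc compare base=sashelp.class compare=work.class_changed
             error briefsummary;
run;
```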

- FUZZ=number
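- alters the values comparison results for numbers less than number. PROC COMPARE prints the following:
  0 for any variable value that is less than number;
  a blank for a difference or percent difference that is less than number;
  0 for any summary statistic that is less than number.
  Default: 0
  Range: 0 - 1
  Tip: A report that contains many trivial differences is easier to read in this form.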

- LISTALL
- lists all variables and observations that are found in only one data set.
  Alias: LIST
  Interaction: Using LISTALL is equivalent to using the following four options: LISTBASEOBS, LISTCOMPOBS, LISTBASEVAR, and LISTCOMPVAR.

- LISTBASE
- lists all observations and variables that are found in the base data set but not in the comparison data set.
  Interaction: Using LISTBASE is equivalent to using the LISTBASEOBS and LISTBASEVAR options.

- LISTBASEOBS
- lists all observations that are found in the base data set but not in the comparison data set.

- LISTBASEVAR
- lists all variables that are found in the base data set but not in the comparison data set.

- LISTCOMP
- lists all observations and variables that are found in the comparison data set but not in the base data set.
  Interaction: Using LISTCOMP is equivalent to using the LISTCOMPOBS and LISTCOMPVAR options.

- LISTCOMPOBS
- lists all observations that are found in the comparison data set but not in the base data set.

- LISTCOMPVAR
- lists all variables that are found in the comparison data set but not in the base data set.

- LISTEQUALVAR
- prints a list of variables whose values are judged equal at all observations in addition to the default list of variables whose values are judged unequal.

- LISTOBS
- lists all observations that are found in only one data set.
  Interaction: Using LISTOBS is equivalent to using the LISTBASEOBS and LISTCOMPOBS options.

- LISTVAR
- lists all variables that are found in only one data set.
  Interaction: Using LISTVAR is equivalent to using both the LISTBASEVAR and LISTCOMPVAR options.
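
The LIST options can be tried on two data sets that differ in both their variables and their number of observations. The sketch below assumes that the SASHELP.CLASS sample data set is available; WORK.CLASS_SUBSET and the variable EXTRA are hypothetical and exist only for illustration.

```sas
/* Hypothetical comparison data set:                          */
/*  - WEIGHT is dropped, so it exists only in the base        */
/*  - OBS=18 leaves out the last observation of SASHELP.CLASS */
/*  - EXTRA exists only in the comparison data set            */
data work.class_subset;
   set sashelp.class(drop=weight obs=18);
   extra = 1;
run;

/* LISTALL reports the variables and observations that */
/* are found in only one of the two data sets          */
proc compare base=sashelp.class compare=work.class_subset listall;
run;
```

Replacing LISTALL with LISTBASE or LISTCOMP would restrict the listing to one side of the comparison.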

- MAXPRINT=total | (per-variable, total)
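- specifies the maximum number of differences to print, where total is the maximum total number of differences to print and per-variable is the maximum number of differences to print for each variable.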
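
A brief sketch of both forms of MAXPRINT=, assuming that the SASHELP.CLASS sample data set is available. WORK.CLASS_SHIFTED and the limits shown are hypothetical values chosen for illustration.

```sas
/* Hypothetical comparison data set in which every HEIGHT */
/* and WEIGHT value differs from the base                  */
data work.class_shifted;
   set sashelp.class;
   height = height + 1;
   weight = weight + 1;
run;

/* Single value: limit the total number of printed differences */
proc compare base=sashelp.class compare=work.class_shifted
             maxprint=10;
run;

/* Pair of values: per-variable limit first, then the total */
proc compare base=sashelp.class compare=work.class_shifted
             maxprint=(3, 5);
run;
```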

- METHOD=ABSOLUTE | EXACT | PERCENT | RELATIVE<(δ)>
- specifies the method for judging the equality of numeric values. The constant δ (delta) is a number between 0 and 1 that specifies a value to add to the denominator when calculating the equality measure. By default, δ is 0.
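
METHOD= and CRITERION= are normally chosen together. The following sketch uses an absolute tolerance; it assumes that the SASHELP.CLASS sample data set is available, and WORK.CLASS_ROUNDED and the tolerance of 0.5 are hypothetical choices made for illustration.

```sas
/* Hypothetical comparison data set: HEIGHT rounded to the */
/* nearest integer, so it differs from the base by at most */
/* 0.5                                                     */
data work.class_rounded;
   set sashelp.class;
   height = round(height, 1);
run;

/* With METHOD=ABSOLUTE, the absolute difference between two */
/* values is compared with the CRITERION= value              */
proc compare base=sashelp.class compare=work.class_rounded
             method=absolute criterion=0.5;
run;
```

Running the same step with METHOD=EXACT would be one way to see how many of these rounded values differ at all.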

- NODATE
- suppresses the display in the data set summary report of the creation dates and the last modified dates of the base and comparison data sets.

- NOMISSBASE
- judges a missing value in the base data set equal to any value. (By default, a missing value is equal only to a missing value of the same kind, that is .=., .^=.A, .A=.A, .A^=.B, and so on.) You can use this option to determine the changes that would be made to the observations in the comparison data set if it were used as the master data set and the base data set were used as the transaction data set in a DATA step UPDATE statement. For information about the UPDATE statement, see the chapter on SAS language statements in SAS System Options: Reference.

- NOMISSCOMP
- judges a missing value in the comparison data set equal to any value. (By default, a missing value is equal only to a missing value of the same kind, that is .=., .^=.A, .A=.A, .A^=.B, and so on.) You can use this option to determine the changes that would be made to the observations in the base data set if it were used as the master data set and the comparison data set were used as the transaction data set in a DATA step UPDATE statement. For information about the UPDATE statement, see the chapter on SAS language statements in SAS System Options: Reference.

- NOMISSING
- judges missing values in both the base and comparison data sets equal to any value. By default, a missing value is equal only to a missing value of the same kind, that is .=., .^=.A, .A=.A, .A^=.B, and so on.
  Alias: NOMISS
  Interaction: Using NOMISSING is equivalent to using both NOMISSBASE and NOMISSCOMP.
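
The master/transaction use of the NOMISS options can be sketched with two small data sets. WORK.MASTER, WORK.TRANSACTION, and their values are hypothetical and exist only for illustration.

```sas
/* Hypothetical master data set (used as the base) */
data work.master;
   input id amount;
   datalines;
1 100
2 200
3 300
;
run;

/* Hypothetical transaction data set (used as the comparison). */
/* Missing AMOUNT values mean "no change".                     */
data work.transaction;
   input id amount;
   datalines;
1 .
2 250
3 .
;
run;

/* NOMISSCOMP judges the missing values in the comparison data */
/* set equal to any value, so only the second observation is   */
/* reported as different for AMOUNT                            */
proc compare base=work.master compare=work.transaction nomisscomp;
run;
```

Without NOMISSCOMP, the first and third observations would also be reported, because a missing value is not equal to 100 or 300.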

- NOPRINT
- suppresses all printed output.
  Tip: You might want to use this option when you are creating one or more output data sets.

- NOSUMMARY
- suppresses the data set, variable, observation, and values comparison summary reports.
  Tip: NOSUMMARY produces no output if there are no differences in the matching values.

- NOTE
- displays notes in the SAS log that describe the results of the comparison, if differences were found.

- NOVALUES
- suppresses the report of the value comparison results.
  Example: Overview: COMPARE Procedure

- OUT=SAS-data-set
- names the output data set. If SAS-data-set does not exist, then PROC COMPARE creates it. SAS-data-set contains the differences between matching variables.

- OUTALL
- writes an observation to the output data set for each observation in the base data set and for each observation in the comparison data set. The option also writes observations to the output data set that contain the differences and percent differences between the values in matching observations.
  Tip: Using OUTALL is equivalent to using the following four options: OUTBASE, OUTCOMP, OUTDIF, and OUTPERCENT.

- OUTBASE
- writes an observation to the output data set for each observation in the base data set, creating observations in which _TYPE_=BASE.

- OUTCOMP
- writes an observation to the output data set for each observation in the comparison data set, creating observations in which _TYPE_=COMP.

- OUTDIF
- writes an observation to the output data set for each pair of matching observations. The values in the observation include values for the differences between the values in the pair of observations. The value of _TYPE_ in each observation is DIF.
  Default: The OUTDIF option is the default unless you specify the OUTBASE, OUTCOMP, or OUTPERCENT option. If you use any of these options, then you must specify the OUTDIF option to create _TYPE_=DIF observations in the output data set.

- OUTNOEQUAL
- suppresses the writing of an observation to the output data set when all values in the observation are judged equal. In addition, in observations containing values for some variables judged equal and others judged unequal, the OUTNOEQUAL option uses the special missing value ".E" to represent differences and percent differences for variables judged equal.

- OUTPERCENT
- writes an observation to the output data set for each pair of matching observations. The values in the observation include values for the percent differences between the values in the pair of observations. The value of _TYPE_ in each observation is PERCENT.
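
The OUT= options are usually combined. The sketch below keeps only the observations that differ and writes the base values, the comparison values, and the differences for each of them. It assumes that the SASHELP.CLASS sample data set is available; WORK.CLASS_CHANGED and WORK.DIFFS are hypothetical names used for illustration.

```sas
/* Hypothetical comparison data set with one changed value */
data work.class_changed;
   set sashelp.class;
   if name = 'Alfred' then height = height + 1;
run;

/* OUTNOEQUAL skips observations whose values are all judged */
/* equal. OUTDIF is listed explicitly because OUTBASE and    */
/* OUTCOMP are also specified.                               */
proc compare base=sashelp.class compare=work.class_changed
             out=work.diffs outnoequal outbase outcomp outdif
             noprint;
run;

/* Inspect the result; _TYPE_ identifies the kind of observation */
proc print data=work.diffs;
run;
```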

- OUTSTATS=SAS-data-set
- writes summary statistics for all pairs of matching variables to the specified SAS-data-set.
  Tip: If you want to print a table of statistics in the procedure output, then use the STATS, ALLSTATS, or PRINTALL option.
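
A minimal sketch of OUTSTATS= in a step that produces no printed report of its own. It assumes that the SASHELP.CLASS sample data set is available; WORK.CLASS_CHANGED and WORK.COMPARE_STATS are hypothetical names used for illustration.

```sas
/* Hypothetical comparison data set with one changed value */
data work.class_changed;
   set sashelp.class;
   if name = 'Alfred' then height = height + 1;
run;

/* Write the summary statistics to a data set and suppress */
/* the printed reports                                      */
proc compare base=sashelp.class compare=work.class_changed
             outstats=work.compare_stats noprint;
run;

/* Inspect the statistics data set */
proc print data=work.compare_stats;
run;
```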

- STATS
- prints a table of summary statistics for all pairs of matching numeric variables that are judged unequal.
  See: Table of Summary Statistics for information about the statistics produced.