As raw data is being
read, one of the macros that perform duplicate-data checking reviews
the datetime information in each record and stores the information
in a SAS data set called a temporary control data set. Later, by
using intermediate control data sets, another macro merges the information
in the temporary control data set into one or more SAS data sets that
are called permanent control data sets.

When additional data
is processed into the IT data mart, the timestamps of the incoming
data are compared with the datetime information in the permanent control
data sets in order to determine whether the new data has already been processed.
If it has, the duplicate data is handled in the way that you specify.
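
The comparison described above can be sketched in Python. This is only
an illustrative model, not SAS code and not how the macros are
implemented: the function name check_duplicates, the action parameter,
and the representation of the permanent control data as a set of
(system, datetime) pairs are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime


def check_duplicates(records, permanent_control, action="discard"):
    """Model of duplicate-data checking (hypothetical, for illustration).

    records           -- iterable of (system, datetime) tuples being read
    permanent_control -- set of (system, datetime) pairs already processed;
                         updated in place after the read completes
    action            -- hypothetical option: "discard" drops duplicates,
                         any other value keeps them
    Returns (kept_records, read_counts, duplicate_counts).
    """
    temporary_control = set()  # datetime info gathered during this read
    kept = []
    read_counts = Counter()
    duplicate_counts = Counter()

    for system, timestamp in records:
        read_counts[system] += 1
        key = (system, timestamp)
        if key in permanent_control:
            # This timestamp was already processed for this system.
            duplicate_counts[system] += 1
            if action == "discard":
                continue
        temporary_control.add(key)
        kept.append((system, timestamp))

    # Merge the temporary control data into the permanent control data.
    permanent_control |= temporary_control
    return kept, read_counts, duplicate_counts


if __name__ == "__main__":
    control = set()  # empty on first use, so nothing can be rejected
    t1 = datetime(2024, 1, 1, 0, 0)
    t2 = datetime(2024, 1, 1, 0, 15)
    check_duplicates([("SYSA", t1), ("SYSA", t2)], control)
    kept, read, dups = check_duplicates([("SYSA", t2), ("SYSB", t2)], control)
    for system in sorted(read):
        print(system, "read:", read[system], "duplicates:", dups[system])
```

In this sketch, an empty control set on the first call means that no
record can be flagged, and the per-system counters hold the kind of
totals that a duplicate-data report would summarize.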
A duplicate-data
report is printed in the SAS log after the data is read. The report
describes how many records were read for each machine or system and
how many duplicates were found, if any.

Note: The first time you use the
macros, the permanent control data sets have not been built, so the
macro %RMDUPCHK cannot check the input records. Your data is not checked
or rejected for duplicates, but the permanent control data sets are
created and the datetime information for this data is saved to them.
Data is checked only on datetime, although SMF data is also checked
on the system name. Record type is not considered. (For example, if
you try to add a new record type, but you have already read other
record types from that adapter for that time period, the new records
will not be kept.) The duplicate-data report contains only a limited
amount of information about your data.