The content of the
permanent control data sets is based on intervals and ranges in your
data. The RANGES= and INT= parameters work together to flag datetime
gaps. When you specify the INT= parameter on %RMDUPCHK, you define
the maximum gap allowable between records in the same range. If the
gap between datetimes for two consecutive records exceeds this value,
then a new range is created. When you specify the RANGES= parameter
on %RMDUPCHK, you define the maximum number of ranges that are allowed
to be in the raw data during the current ETL process or job. If the
raw data exceeds the maximum number of ranges, then processing stops
and you receive an error message.
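The interaction between the two parameters can be pictured with a minimal Python sketch. This is an illustration of the rule described above, not the macro's actual implementation. The function name `build_ranges` and its arguments `max_gap` and `max_ranges` are hypothetical stand-ins for the roles that INT= and RANGES= play.

```python
from datetime import datetime, timedelta

def build_ranges(timestamps, max_gap, max_ranges):
    """Group datetimes into (start, end) ranges.

    A new range starts whenever the gap between two consecutive
    records exceeds max_gap (the role of INT= in the text).
    Raises ValueError when more than max_ranges ranges result
    (the role of RANGES= in the text).
    """
    ranges = []
    for ts in sorted(timestamps):
        if ranges and ts - ranges[-1][1] <= max_gap:
            # Gap is within the allowed interval: extend the current range.
            ranges[-1] = (ranges[-1][0], ts)
        else:
            # First record, or gap too large: open a new range.
            ranges.append((ts, ts))
            if len(ranges) > max_ranges:
                raise ValueError("raw data exceeds the maximum number of ranges")
    return ranges

# Hypothetical sample data: records at 0, 15, 30, 120, and 135 minutes.
base = datetime(2024, 1, 1, 0, 0)
records = [base + timedelta(minutes=m) for m in (0, 15, 30, 120, 135)]
ranges = build_ranges(records, max_gap=timedelta(minutes=30), max_ranges=5)
```

With a 30-minute maximum gap, the 90-minute jump between the third and fourth records opens a second range. Calling the same function with `max_ranges=1` raises an error, which mirrors the case where processing stops.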
A range is deleted when
the end-of-range datetime value is older than the number of weeks
that are specified on the KEEP= parameter on %RMDUPCHK. However,
if your data is continuous, then you have only one range. In that
case, your control information is never aged out because the
end-of-range datetime value is constantly extended by new datetime
information.
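The aging rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the macro's implementation: the function `age_out` and its arguments are hypothetical, and comparing the end-of-range value against a supplied `now` datetime is an assumption about what "older than" is measured from.

```python
from datetime import datetime, timedelta

def age_out(ranges, keep_weeks, now):
    """Keep only ranges whose end datetime is within keep_weeks of now.

    keep_weeks plays the role of KEEP= in the text. Measuring age
    relative to `now` is an assumption made for this sketch.
    """
    cutoff = now - timedelta(weeks=keep_weeks)
    return [(start, end) for start, end in ranges if end >= cutoff]

# Hypothetical control information with two ranges.
now = datetime(2024, 3, 1)
ranges = [
    (datetime(2024, 1, 1), datetime(2024, 1, 5)),    # ended 8 weeks before now
    (datetime(2024, 2, 20), datetime(2024, 2, 29)),  # recently extended
]
kept = age_out(ranges, keep_weeks=4, now=now)
```

Only the second range survives here. A single range that keeps receiving new records always has a recent end-of-range value, so it never falls behind the cutoff.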
Here are the ways that
ranges are used with data that is continuous and data that is not
continuous:
- If your data is continuous and does not have any datetime gaps
  that exceed the value of the INT= parameter, then your data always
  updates the same range. In this case, the permanent control data set
  contains one range for each unique value of the variable that is
  specified by the IDVAR= parameter. The values of that variable are
  typically the names of the machines or systems from which the raw
  data originated.
- If your data is not continuous, then the permanent control data set
  contains multiple ranges for each unique value of the variable that
  is specified by the IDVAR= parameter. Each range is prefixed with the
  corresponding value of that variable.
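The two cases above can be compared side by side with a short Python sketch. It is an illustration only, not the layout of the permanent control data set. The function `ranges_by_id`, the host names, and the 15-minute gap are hypothetical; the first element of each record stands in for the variable named by IDVAR=.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def ranges_by_id(records, max_gap):
    """Return {id_value: [(start, end), ...]} for (id_value, datetime) records."""
    by_id = defaultdict(list)
    for id_value, ts in records:
        by_id[id_value].append(ts)

    result = {}
    for id_value, stamps in by_id.items():
        ranges = []
        for ts in sorted(stamps):
            if ranges and ts - ranges[-1][1] <= max_gap:
                ranges[-1] = (ranges[-1][0], ts)  # extend the current range
            else:
                ranges.append((ts, ts))           # open a new range
        result[id_value] = ranges
    return result

base = datetime(2024, 1, 1)
records = (
    # hostA reports every 15 minutes without interruption.
    [("hostA", base + timedelta(minutes=m)) for m in (0, 15, 30, 45)]
    # hostB has a long gap between its second and third records.
    + [("hostB", base + timedelta(minutes=m)) for m in (0, 15, 180, 195)]
)
per_host = ranges_by_id(records, max_gap=timedelta(minutes=15))
```

In this sample, `hostA` ends up with a single range while `hostB` ends up with two, each keyed by the identifying value.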
Note: For the HP Reporter and VMware
adapters, one set of control data sets is created for each table by
default. You can create just one set of control data sets for these
adapters. To do so, specify a three-character identifier in the SOURCE=
parameter of the %RMDUPCHK macro. SAS IT Resource Management then
prefixes that identifier to the control data set names.