%RMDUPCHK

%RMDUPCHK Overview

The %RMDUPCHK macro checks for duplicate data and deletes it. It also maintains record counts of incoming and deleted data, along with datetime ranges, for each system or machine. These record counts are stored in the control data set. If the control data set indicates that a gap was detected in the data, a report is generated.
The control data set is stored in the same library as the staged tables. This data set is created and managed by the %RMDUPxxx macros. (Users do not usually access this library.)
Note: For information about how to set up the %RMDUPCHK macro, see Implementing Duplicate-Data-Checking Macros. For information about how control data sets work, see Control Data Sets for Duplicate-Data Checking.

%RMDUPCHK Syntax

%RMDUPCHK(
ENDFILE=variable-name
,IDVAR=variable-name
,SOURCE=identifier
,TIMESTMP=timestamp-variable-name
<,FORCE=YES | NO>
<,INT=interval>
<,KEEP=number-of-weeks>
<,TERM=YES | NO>
);

%RMDUPCHK Required Arguments

ENDFILE=variable-name
specifies the name of the SAS variable that is named in the END= option of the SAS INFILE statement that reads the raw data.
IDVAR=variable-name
specifies the name of the SAS variable that identifies the system or machine that generated the input data.
SOURCE=identifier
specifies a unique three-character code that identifies the type of data.
TIMESTMP=timestamp-variable-name
specifies the name of the SAS variable that contains the datetime stamp that uniquely identifies the time of the event or interval being recorded.
The SOURCE entries for the supported adapters are listed in the following table.
Source Names for Each Adapter

Adapter                    Value for the SOURCE Parameter for %RMDUPCHK
ASG TMON2CIC               TM2
ASG TMONDB2                TMD
BMC Mainview               IMF
BMC Perf Mgr               PAT
CA TMS                     TMS
DT Perf Sentry             NTS
DT Perf Sentry with MXG    NTS
HP Perf Agent              (Multiple values are needed so that %RMDUPCHK can be invoked with each value.)
HP Reporter                (Multiple values are needed so that %RMDUPCHK can be invoked with each value.)
IBM DCOLLECT               DCO
IBM EREP                   ERP
IBM IMS                    IMS
IBM SMF                    SMF
IBM TPF                    TPF
IBM VMMON                  VMM
MS SCOM                    SCO
SAP ERP                    BAT, SAP, and others. (Multiple values are needed so that %RMDUPCHK can be invoked with each value.)
SAR                        SAR
SNMP                       SNM
VMware vCenter             (Multiple values are needed so that %RMDUPCHK can be invoked with each value.)
Web Log                    WWW

%RMDUPCHK Options

FORCE=YES | NO
specifies whether input data should be processed even if it is detected as a duplicate.
  • FORCE=YES indicates that, if a duplicate is detected, the duplicate data should be processed.
  • FORCE=NO indicates that duplicate data should not be processed.
The default value for this option is NO.
INT=interval
represents the maximum time gap (or interval) that is to be allowed between the timestamps on any two consecutive records from the same system or machine. If the interval between the timestamp values exceeds the value of this option, then an observation with the new time range is created in the control data set. This is referred to as a gap in the data.
The value for this option must be provided in the format hh:mm, where hh represents hours and mm represents minutes. For example, to specify an interval of 14 minutes, use INT=0:14. To specify an interval of 1 hour and 29 minutes, use INT=1:29.
The default value for this option is 0:29, or 29 minutes.
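The comparison that INT= describes can be illustrated with an ordinary DATA step. The following is only a sketch of the idea, not the macro's internal logic: the data set names (timestamps, gaps), the sample values, and the variable prev are hypothetical, and the default 29-minute limit is written as the SAS time literal '0:29't.

```sas
/* Hypothetical sample data: one machine, three timestamps */
data timestamps;
   input machine $ datetime :datetime18.;
   format datetime datetime18.;
   datalines;
SYSA 01JAN2023:00:00:00
SYSA 01JAN2023:00:15:00
SYSA 01JAN2023:01:00:00
;
run;

/* Keep records that follow the previous record from the same */
/* machine by more than 29 minutes (assumed gap test)         */
data gaps;
   set timestamps;
   by machine;
   prev = lag(datetime);          /* LAG is called on every record */
   format prev datetime18.;
   if not first.machine and datetime - prev > '0:29't then output;
run;
```

With these sample values, only the third record exceeds the limit, because it arrives 45 minutes after the second.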
KEEP=number-of-weeks
specifies the number of weeks for which control data is kept. Because this value represents the number of Sundays between two dates, a value of 2 results in a maximum retention period of 20 days.
The default value for this option is 2.
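The 20-day figure can be checked with the SAS INTCK function, which counts the week boundaries (Sundays) that fall between two dates. Whether the macro itself uses INTCK is an assumption. The sketch below only illustrates the arithmetic, starting from a Sunday because that case yields the longest span, and the variable names are hypothetical.

```sas
data _null_;
   start = '01JAN2023'd;                 /* a Sunday */
   do days = 19 to 21;
      last_date = start + days;
      /* count the Sunday boundaries crossed between the two dates */
      sundays = intck('week', start, last_date);
      put days= sundays=;
   end;
run;
```

The log should show a count of 2 for 19 and 20 days and a count of 3 for 21 days, which is consistent with a value of 2 covering at most 20 days.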
TERM=YES | NO
controls whether SAS terminates if any duplicate input data is detected.
The default value for this option is NO.

%RMDUPCHK Notes

The Adapter Setup wizard prompts the user to specify how to handle duplicate records. The valid modes of duplicate-data checking are Inactive, Discard, Force, and Terminate.
  • Inactive: Duplicate-data checking is not performed. No macros are executed.
  • Discard: Duplicate-data-checking macros are executed. FORCE=NO and TERM=NO are implied.
  • Force: Duplicate-data-checking macros are executed. FORCE=YES and TERM=NO are implied.
  • Terminate: Duplicate-data-checking macros are executed. FORCE=NO and TERM=YES are implied.
You can change the mode of duplicate-data checking for a table in the Properties dialog box for that table.
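The option values that each mode implies can also be written out explicitly in a direct invocation. The sketch below spells out the combination that corresponds to Terminate mode. The variable names (_eof, machine, datetime) and the source code nts are borrowed from the example in this section and are assumptions about the staging code, not required values.

```sas
/* Terminate mode written out explicitly: FORCE=NO and TERM=YES */
%rmdupchk(
         endfile=_eof
         ,idvar=machine
         ,source=nts
         ,timestmp=datetime
         ,force=no
         ,term=yes
         );
```

Changing the last two arguments to force=yes and term=no would correspond to Force mode, and force=no with term=no to Discard mode.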

%RMDUPCHK Example

The following example provides duplicate checking for the data that is input from the NTSMF adapter:
%rmdupchk(
         endfile=_eof
         ,source=nts
         ,idvar=machine
         ,int=00:18
         ,keep=52
         ,timestmp=datetime
         );