DQMATCH Procedure

PROC DQMATCH Statement

Creates match codes as a basis for standardization or transformation.

Syntax

Optional Arguments

CLUSTER=variable-name
specifies the name of the numeric variable in the output data set that contains the cluster number.
Interaction: If the CLUSTER= option is not specified and the CLUSTERS_ONLY option is specified, an output variable named CLUSTER is created.
CLUSTER_BLANKS | NO_CLUSTER_BLANKS
specifies how to process blank values.
CLUSTER_BLANKS
specifies that blank values are written to the output data set. The blank values do not have accompanying match codes.
NO_CLUSTER_BLANKS
specifies that blank values are not written to the output data set.
Default: CLUSTER_BLANKS
CLUSTERS_ONLY
specifies that only input character values that are part of a cluster are written to the output data set. Input character values that are not part of a cluster are excluded.
Default: Not set. Typically, all input values are included in the output data set.
Note: A cluster number is assigned only when two or more input values produce the same match code.
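
The clustering options described above can be combined in one PROC DQMATCH statement. The following is a minimal sketch, not a definitive example: the data set WORK.CUSTOMERS, its NAME variable, the output names, and the sensitivity value are hypothetical, and it assumes that the ENUSA locale is already loaded into memory.

```sas
/* Sketch only: WORK.CUSTOMERS and the variable NAME are hypothetical. */
/* Assumes that the ENUSA locale is already in the locale list.        */
proc dqmatch data=work.customers        /* input data set               */
             out=work.name_clusters     /* output data set              */
             cluster=clustergrp         /* numeric cluster-number var   */
             clusters_only              /* keep only clustered values   */
             no_cluster_blanks          /* do not write blank values    */
             locale='ENUSA';
   /* One match definition applied to one character variable. */
   criteria matchdef='Name' var=name sensitivity=85;
run;
```

Because CLUSTER= is specified, the cluster numbers are written to CLUSTERGRP rather than to the default variable CLUSTER.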
DATA= data-set-name
specifies the name of the input SAS data set.
Default: The most recently created data set in the current SAS session.
DELIMITER | NODELIMITER
specifies whether exclamation points (!) are used as delimiters.
DELIMITER
specifies that, when multiple CRITERIA statements are specified, exclamation points (!) separate the individual match codes that make up the concatenated match code. Match codes are concatenated in the order in which the CRITERIA statements appear in the DQMATCH procedure.
NODELIMITER
specifies that multiple match codes are concatenated without exclamation point delimiters.
Default: In SAS, a delimiter is used. In DataFlux Data Management Studio, a delimiter is not used.

Note: Be sure to use delimiters consistently if you plan to analyze, compare, or combine match codes created in SAS and in DataFlux Data Management Studio.
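
To illustrate the delimiter options, the following sketch builds a concatenated match code from two CRITERIA statements and suppresses the exclamation point delimiter, which might be useful when the codes are compared with codes created in DataFlux Data Management Studio. The data set WORK.CUSTOMERS, the variables NAME and ADDRESS, the output names, and the sensitivity values are assumptions made for illustration.

```sas
/* Sketch only: data set and variable names are hypothetical.   */
/* Assumes that the ENUSA locale is already in the locale list. */
proc dqmatch data=work.customers
             out=work.customer_codes
             matchcode=match_cd     /* receives the concatenated code */
             nodelimiter            /* no "!" between component codes */
             locale='ENUSA';
   /* Component codes are concatenated in CRITERIA statement order: */
   /* the Name match code first, then the Address match code.       */
   criteria matchdef='Name'    var=name    sensitivity=85;
   criteria matchdef='Address' var=address sensitivity=50;
run;
```

Replacing NODELIMITER with DELIMITER, or omitting both options in SAS, would place an exclamation point between the two component match codes.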
LOCALE= locale-name
specifies the name of the locale that is used to create match codes. The locale-name can be a name in quotation marks, or an expression that evaluates to a locale-name. It can also be the name of a variable whose value is a locale-name.
The specified locale must be loaded into memory as part of the locale list.
Default: The first locale name in the locale list.
Restriction: If no locale-name is specified, the first locale in the locale list is used.
Note: The match definition, which is part of a locale, is specified in the CRITERIA statement. This specification allows different match definitions to be applied to different variables in the same procedure.
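
Because the locale must be in memory before the procedure runs, a typical program loads it first. The sketch below uses the %DQLOAD autocall macro for that step and then names the locale explicitly. The DQSETUPLOC= value is a placeholder for the location of your Quality Knowledge Base, and the data set and variable names are hypothetical.

```sas
/* Sketch only: replace the DQSETUPLOC= placeholder with the   */
/* location of your own Quality Knowledge Base.                */
%dqload(dqlocale=(ENUSA), dqsetuploc='your-QKB-location')

/* WORK.CUSTOMERS and the variable NAME are hypothetical. */
proc dqmatch data=work.customers
             out=work.name_codes
             matchcode=match_cd
             locale='ENUSA';        /* locale name in quotation marks */
   /* The match definition 'Name' is taken from the ENUSA locale. */
   criteria matchdef='Name' var=name;
run;
```

If LOCALE= were omitted here, the first locale in the locale list would be used, which in this sketch is also ENUSA.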
MATCHCODE= character-variable
specifies the name of the output character variable that stores the match codes. The DQMATCH procedure defines a sufficient length for this variable, even if a variable with the same name exists in the input data set.
An output variable named MATCH_CD is created if the following statements are all true:
  • The MATCHCODE= option is not specified in the DQMATCH procedure.
  • The MATCHCODE= option is not specified in subsequent CRITERIA statements.
  • The CLUSTER= option is not specified.
  • The CLUSTERS_ONLY option is not specified.
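
As a contrast to the default MATCH_CD behavior, the following sketch specifies MATCHCODE= in each CRITERIA statement instead of in the PROC DQMATCH statement, so that each input variable receives its own match code variable. The data set WORK.CUSTOMERS, the input variables NAME and ADDRESS, and the output variable names NAME_MC and ADDR_MC are hypothetical, and the ENUSA locale is assumed to be loaded.

```sas
/* Sketch only: data set and variable names are hypothetical.   */
/* Assumes that the ENUSA locale is already in the locale list. */
proc dqmatch data=work.customers
             out=work.simple_codes
             locale='ENUSA';
   /* Each CRITERIA statement writes to its own match code variable. */
   criteria matchdef='Name'    var=name    matchcode=name_mc;
   criteria matchdef='Address' var=address matchcode=addr_mc;
run;
```

Because MATCHCODE= appears in the CRITERIA statements, the conditions listed above are not all true, and the default MATCH_CD variable is not created.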
OUT= output-data-set
specifies the name of the output data set for match codes created with the DQMATCH procedure. The DQMATCH procedure creates match codes for specified character variables in an input data set.
Note: If the specified output data set does not exist, the DQMATCH procedure creates it.