DQMATCH Procedure

PROC DQMATCH Statement

Creates match codes as a basis for standardization or transformation.

Syntax

Optional Arguments

CLUSTER=variable-name

specifies the name of the numeric variable in the output data set that contains the cluster number.

Interaction If the CLUSTER= option is not specified and if the CLUSTERS_ONLY option is specified, an output variable named CLUSTER is created.

CLUSTER_BLANKS | NO_CLUSTER_BLANKS

specifies how to process blank values.

CLUSTER_BLANKS

specifies that blank values are written to the output data set. The blank values do not have accompanying match codes.

NO_CLUSTER_BLANKS

specifies that blank values are not written to the output data set.

Default CLUSTER_BLANKS

CLUSTERS_ONLY

specifies that only input character values that are part of a cluster are written to the output data set. Input character values that are not part of a cluster are excluded.

Default This option is not asserted by default. Typically, all input values are included in the output data set.
Note A cluster number is assigned only when two or more input values produce the same match code.
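
The three clustering options above can be combined in one invocation. The following is a minimal sketch, not a definitive example: it assumes that a locale is already loaded into memory, that the locale provides a match definition named 'Name', and that the input data set WORK.CUSTOMERS with a character variable NAME exists. The data set, variable, and output names are hypothetical.

```sas
/* Sketch only: WORK.CUSTOMERS, NAME, and WORK.NAME_CLUSTERS are hypothetical. */
/* Assumes a locale with a 'Name' match definition is already loaded.          */
proc dqmatch data=work.customers
             out=work.name_clusters
             cluster=clustergrp      /* numeric variable that holds the cluster number */
             clusters_only           /* write only values that belong to a cluster     */
             no_cluster_blanks;      /* do not write blank input values                */
   criteria matchdef='Name' var=name sensitivity=85;
run;
```

Because CLUSTERS_ONLY is specified, an input value appears in the output only if at least one other value produces the same match code.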

DATA= data-set-name

specifies the name of the input SAS data set.

Default The most recently created data set in the current SAS session.

DELIMITER | NODELIMITER

specifies whether exclamation points (!) are used as delimiters in concatenated match codes.

DELIMITER

specifies that, when multiple CRITERIA statements are specified, exclamation points (!) separate the individual match codes that make up the concatenated match code. Match codes are concatenated in the order in which the CRITERIA statements appear in the DQMATCH procedure.

NODELIMITER

specifies that multiple match codes are concatenated without exclamation point delimiters.

Default In SAS, a delimiter is used. In DataFlux Data Management Studio, a delimiter is not used.
Note Be sure to use delimiters consistently if you plan to analyze, compare, or combine match codes created in SAS and in DataFlux Data Management Studio.
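
To illustrate the delimiter options, here is a hedged sketch with two CRITERIA statements. It assumes that a loaded locale provides match definitions named 'Name' and 'Address', and it uses a hypothetical input data set WORK.CUSTOMERS with character variables NAME and ADDRESS.

```sas
/* Sketch only: data set and variable names are hypothetical.              */
/* Assumes a loaded locale with 'Name' and 'Address' match definitions.    */
proc dqmatch data=work.customers
             out=work.customer_codes
             matchcode=match_cd
             nodelimiter;            /* concatenate component codes without ! */
   criteria matchdef='Name'    var=name    sensitivity=85;
   criteria matchdef='Address' var=address sensitivity=80;
run;
```

With the SAS default (DELIMITER), each value of MATCH_CD would take the form name-code!address-code, in the order of the CRITERIA statements. With NODELIMITER, as in this sketch, the two component codes are adjacent, which matches the DataFlux Data Management Studio default.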

LOCALE=locale-name

specifies the name of the locale that is used to create match codes. The locale-name can be a name in quotation marks, or an expression that evaluates to a locale-name. It can also be the name of a variable whose value is a locale-name.

The specified locale must be loaded into memory as part of the locale list. If you receive an out-of-memory error when you load the locale, you can increase the value of the MAXMEMQUERY system option. For more information, see your host-specific SAS 9.4 documentation, such as SAS Companion for Windows.
Default The first locale name in the locale list.
Restriction If no locale-name is specified, the first locale in the locale list is used.
Note The match definition, which is part of a locale, is specified in the CRITERIA statement. This specification allows different match definitions to be applied to different variables in the same procedure.
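
As a sketch of how the locale and the match definitions fit together, the following code loads one locale with the %DQLOAD autocall macro and then applies a different match definition to each variable. The DQSETUPLOC= path is a placeholder, not a real location, and the data set and variable names are hypothetical. The sketch also assumes that the ENUSA locale is installed and provides 'Name' and 'Address' match definitions.

```sas
/* Load the ENUSA locale into memory. The path below is a placeholder;   */
/* replace it with the location of your site's Quality Knowledge Base.   */
%dqload(dqlocale=(ENUSA), dqsetuploc='/path/to/your/qkb');

/* Sketch only: WORK.CUSTOMERS, NAME, and ADDRESS are hypothetical.      */
proc dqmatch data=work.customers
             out=work.customer_codes
             locale='ENUSA';         /* locale name in quotation marks     */
   criteria matchdef='Name'    var=name;      /* one definition per variable */
   criteria matchdef='Address' var=address;
run;
```

If LOCALE= were omitted here, the first locale in the locale list would be used, which in this sketch is also ENUSA.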

MATCHCODE=character-variable

specifies the name of the output character variable that stores the match codes. The DQMATCH procedure defines a sufficient length for this variable, even if a variable with the same name exists in the input data set.

An output variable named MATCH_CD is created if all of the following statements are true:
  • The MATCHCODE= option is not specified in the PROC DQMATCH statement.
  • The MATCHCODE= option is not specified in subsequent CRITERIA statements.
  • The CLUSTER= option is not specified.
  • The CLUSTERS_ONLY option is not specified.
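
The conditions above can be seen by contrast in a case where MATCHCODE= is specified in the CRITERIA statements instead of in the PROC DQMATCH statement. This is a sketch under assumptions: the data set and variable names are hypothetical, and a loaded locale is assumed to provide 'Name' and 'Address' match definitions.

```sas
/* Sketch only: names are hypothetical. MATCHCODE= appears in each        */
/* CRITERIA statement, so each variable gets its own match-code variable. */
proc dqmatch data=work.customers
             out=work.customer_codes;
   criteria matchdef='Name'    var=name    matchcode=mc_name;
   criteria matchdef='Address' var=address matchcode=mc_addr;
run;
```

Because MATCHCODE= is specified in the CRITERIA statements, the second condition in the list is not met, so the default MATCH_CD variable is not expected in the output.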

OUT=output-data-set

specifies the name of the output data set for match codes created with the DQMATCH procedure. The DQMATCH procedure creates match codes for specified character variables in an input data set.

Note If the specified output data set does not exist, the DQMATCH procedure creates it.