| The DQMATCH Procedure |
| Requirement: | At least one CRITERIA statement is required in a DQMATCH procedure step. |
|
CRITERIA DELIMSTR=delimited-input-variable | VAR=input-variable MATCHDEF=match-definition | EXACT <SENSITIVITY=complexity-value> <MATCHCODE=output-variable> <CONDITION=integer> ; |
DELIMSTR=delimited-input-variable
specifies the name of a variable that has been parsed by the DQPARSE function, or that contains tokens that were added with the DQPARSETOKENPUT function.
| Restriction: | You cannot specify the DELIMSTR= option and the VAR= option in the same CRITERIA statement. |
VAR=input-variable
specifies the name of the character variable that is used to create match codes. The values of this variable cannot contain delimiters that were added with the DQPARSE or DQPARSETOKENPUT function. If the variable contains delimited values, use the DELIMSTR= option instead of the VAR= option.
MATCHDEF=match-definition
specifies the match definition that is used to create the match code for the specified variable.
| Restriction: | The match definition must exist in the locale that is specified in the LOCALE= option of the PROC DQMATCH statement. |
EXACT
assigns a cluster number based on an exact character match between values rather than a match between match codes. If you specify EXACT, you cannot specify:
MATCHDEF=
MATCHCODE=
SENSITIVITY=
| Restriction: | If the CLUSTER= option has not been assigned a variable in the DQMATCH procedure, then cluster numbers are assigned to a variable named CLUSTER by default. |
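The following is a minimal sketch of exact-match clustering, not a definitive implementation. It assumes that the ENUSA locale is already loaded into memory. The data set WORK.CUSTOMERS and the variables ZIP and CLUSTERGRP are hypothetical names used only for illustration.

```sas
/* Sketch: cluster on exact character matches of ZIP.          */
/* Assumes the ENUSA locale is already loaded; WORK.CUSTOMERS, */
/* ZIP, and CLUSTERGRP are hypothetical names.                 */
proc dqmatch data=work.customers
             out=work.zip_clusters
             cluster=clustergrp
             locale='ENUSA';
   /* EXACT replaces MATCHDEF=, so MATCHCODE= and SENSITIVITY= are omitted. */
   criteria var=zip exact;
run;
```

If CLUSTER= were omitted from the PROC DQMATCH statement, the cluster numbers would be written to a variable named CLUSTER instead.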
SENSITIVITY=complexity-value
(optional) determines the amount of information in the resulting match codes. Higher sensitivity values create match codes that contain more information about the input values, which results in a greater number of clusters with fewer values in each cluster. Valid values range from 50 to 95. The default value is 85.
MATCHCODE=output-variable
(optional) specifies the name of the variable that receives the match codes for the character variable that is specified in the VAR= or DELIMSTR= option.
In the CRITERIA statement, the value of the MATCHCODE= option is not valid if you also specify the MATCHCODE= option in the PROC DQMATCH statement.
If you use multiple CRITERIA statements in a single procedure step, you must do one of the following:
specify the MATCHCODE= option in each CRITERIA statement
generate composite match codes by specifying the MATCHCODE= option only in the PROC DQMATCH statement
CONDITION=integer
(optional) groups CRITERIA statements to constrain the assignment of cluster numbers. CRITERIA statements that share a CONDITION= value are applied as a logical AND: an observation must match an existing cluster on all of them to receive the number of that cluster. If a series of CRITERIA statements defines more than one CONDITION= value, then a logical OR is applied across the CONDITION= values. For example, in a table of customer information, you can assign cluster numbers based on matches between the customer name AND the home address (one CONDITION= value), OR on matches between the customer name AND the organization address (a second CONDITION= value).
All CRITERIA statements that lack a CONDITION= value receive a cluster number based on a logical AND of all such CRITERIA statements.
| Default: | 1 |
| Restriction: | If you specify a value for MATCHCODE= in PROC DQMATCH, and if you specify more than one CONDITION= value, then SAS generates an error. To prevent the error, specify MATCHCODE= in the CRITERIA statements only. |
| Note: | If you have not assigned a value to the CLUSTER= option in the DQMATCH procedure, then cluster numbers are assigned to a variable named CLUSTER by default. |
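The AND/OR behavior of CONDITION= can be sketched as follows. This example assumes that the ENUSA locale is loaded and that it contains match definitions named Name and Address. The data set WORK.CUSTOMERS and all variable names are hypothetical. MATCHCODE= is specified in each CRITERIA statement, not in the PROC DQMATCH statement, because more than one CONDITION= value is used.

```sas
/* Sketch: cluster when (name AND home address) match,            */
/* OR when (name AND organization address) match.                 */
/* Assumes ENUSA is loaded; data set and variables are hypothetical. */
proc dqmatch data=work.customers
             out=work.customer_clusters
             cluster=clustergrp
             locale='ENUSA';
   /* Condition 1: both criteria must match (logical AND). */
   criteria condition=1 var=name      matchdef='Name'    matchcode=mc_name1;
   criteria condition=1 var=home_addr matchdef='Address' matchcode=mc_home;

   /* Condition 2: an alternative pair, combined with condition 1 by logical OR. */
   criteria condition=2 var=name      matchdef='Name'    matchcode=mc_name2;
   criteria condition=2 var=org_addr  matchdef='Address' matchcode=mc_org;
run;
```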
| Details |
Match codes are created for the input variables that are specified in each CRITERIA statement. The resulting match codes are stored in the output variables that are named in the MATCHCODE= option. The MATCHCODE= option can be specified in the PROC DQMATCH statement or the CRITERIA statement.
Simple match codes are created when the CRITERIA statements specify different values for their respective MATCHCODE= options. Composite match codes are created when two or more CRITERIA statements specify the same value for their respective MATCHCODE= options.
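The difference between simple and composite match codes can be illustrated with two short steps. This is a sketch under assumptions: the ENUSA locale is loaded and provides match definitions named Name and Address, and the data set WORK.CUSTOMERS and the variables NAME, ADDRESS, MC_NAME, MC_ADDRESS, and MC_ALL are hypothetical.

```sas
/* Simple match codes: each CRITERIA statement names its own */
/* MATCHCODE= variable, so two separate codes are produced.  */
proc dqmatch data=work.customers
             out=work.simple_codes
             locale='ENUSA';
   criteria var=name    matchdef='Name'    matchcode=mc_name;
   criteria var=address matchdef='Address' matchcode=mc_address;
run;

/* Composite match codes: MATCHCODE= is specified only in the  */
/* PROC DQMATCH statement, so the codes from both criteria are */
/* combined into the single variable MC_ALL.                   */
proc dqmatch data=work.customers
             out=work.composite_codes
             matchcode=mc_all
             locale='ENUSA';
   criteria var=name    matchdef='Name';
   criteria var=address matchdef='Address';
run;
```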
To create match codes for a parsed character variable, specify the DELIMSTR= option instead of the VAR= option. In the MATCHDEF= option, specify the name of the match definition that is associated with the parse definition that was used to add delimiters to the character variable. To determine the parse definition that is associated with a match definition, use the DQMATCHINFOGET function.
See: DQMATCHINFOGET Function.
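The workflow for parsed values might look like the following sketch. It assumes that the ENUSA locale is loaded, that it contains a match definition named Name, and that the parse definition associated with that match definition is also named Name (the first step prints the actual name so that this can be checked). The data set WORK.CUSTOMERS and the variables NAME, PARSED_NAME, and MC_NAME are hypothetical.

```sas
/* Step 1: look up the parse definition that is associated with */
/* the 'Name' match definition and write it to the SAS log.     */
data _null_;
   length parse_defn $ 64;
   parse_defn = dqMatchInfoGet('Name', 'ENUSA');
   put parse_defn=;
run;

/* Step 2: parse the input values with that parse definition.    */
/* 'Name' is assumed here; substitute the value from step 1.     */
data work.parsed;
   set work.customers;
   length parsed_name $ 200;
   parsed_name = dqParse(name, 'Name', 'ENUSA');
run;

/* Step 3: create match codes from the delimited values by using */
/* DELIMSTR= instead of VAR=.                                    */
proc dqmatch data=work.parsed
             out=work.parsed_codes
             locale='ENUSA';
   criteria delimstr=parsed_name matchdef='Name' matchcode=mc_name;
run;
```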
Copyright © 2009 by SAS Institute Inc., Cary, NC, USA. All rights reserved.