The DQMATCH Procedure |
Requirement: | At least one CRITERIA statement is required in DQMATCH procedures. |
CRITERIA CONDITION=<integer>
DELIMSTR=<variable-name> | VAR=<variable-name>
EXACT | MATCHDEF=<match-definition>
MATCHCODE=<output-character-variable>
SENSITIVITY=<sensitivity-level>;
Options |
CONDITION=<integer>
groups CRITERIA statements to constrain the assignment of cluster numbers.
When multiple CRITERIA statements share the same CONDITION= value, an observation must match an existing cluster on all of those statements to receive the number of that cluster.
That is, CRITERIA statements with the same CONDITION= value are combined with a logical AND.
If more than one CONDITION= option is defined in a series of CRITERIA statements, then a logical OR is applied across all CONDITION= option values.
For example, in a table of customer information, one condition can assign cluster numbers based on a match of the customer name AND the home address, while a second condition assigns cluster numbers based on a match of the customer name AND the organization address.
CRITERIA statements that lack a CONDITION= option are grouped under the default condition, and cluster numbers are assigned based on a logical AND of all such CRITERIA statements.
Default: | 1 |
Restriction: | If you specify a value for the MATCHCODE= option in the DQMATCH procedure, and you specify more than one CONDITION= value, SAS generates an error. To prevent the error, specify the MATCHCODE= option in CRITERIA statements only. |
Note: | If you have not assigned a value to the CLUSTER= option in the DQMATCH procedure, cluster numbers are assigned to a variable named CLUSTER by default. |
See: | The DQMATCHINFOGET Function |
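The AND/OR grouping described for the CONDITION= option can be sketched as follows. This is a minimal illustration, not a tested program: the data set WORK.CUSTOMERS and the variables NAME, HOME_ADDR, and ORG_ADDR are hypothetical, and the ENUSA locale (with its Name and Address match definitions) is assumed to be loaded already.

```sas
/* Hypothetical input: WORK.CUSTOMERS with character variables
   NAME, HOME_ADDR, and ORG_ADDR. Locale ENUSA is assumed to be loaded. */
proc dqmatch data=work.customers out=work.customers_clustered
             cluster=cluster_id locale='ENUSA';
   /* Condition 1: name AND home address must both match. */
   criteria condition=1 matchdef='Name'    var=name      sensitivity=85;
   criteria condition=1 matchdef='Address' var=home_addr sensitivity=85;
   /* Condition 2: name AND organization address must both match. */
   criteria condition=2 matchdef='Name'    var=name      sensitivity=85;
   criteria condition=2 matchdef='Address' var=org_addr  sensitivity=85;
run;
```

In this sketch an observation joins a cluster if it satisfies condition 1 OR condition 2. The MATCHCODE= option is deliberately left off the PROC DQMATCH statement because more than one CONDITION= value is used.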
DELIMSTR=<variable-name> | VAR=<variable-name>
specifies the name of a variable.
Restriction: | You cannot specify the DELIMSTR= option and the VAR= option in the same CRITERIA statement. |
See: | The DQPARSE Function and the DQPARSETOKENPUT Function. |
DELIMSTR=<variable-name>
specifies the name of a variable that has been parsed by the DQPARSE function, or that contains tokens added with the DQPARSETOKENPUT function.
VAR=<variable-name>
specifies the name of the character variable that is used to create match-codes. If the variable contains delimited values, use the DELIMSTR= option instead.
Restriction: | The values of this variable cannot contain delimiters added with the DQPARSE function or the DQPARSETOKENPUT function. |
EXACT | MATCHDEF=<match-definition>
assigns a cluster number.
Default: | If the CLUSTER= option has not been assigned a variable in the DQMATCH procedure, then cluster numbers are assigned to the variable named CLUSTER. |
Restriction: | If you specify the MATCHCODE= option in the DQMATCH procedure, the match-code is a composite of the exact character-value and the match-code that is generated by the match-definition. |
EXACT
assigns a cluster number based on an exact character match between values.
Restriction: | If you specify the EXACT option, you cannot specify the MATCHDEF= option, the MATCHCODE= option, or the SENSITIVITY= option. |
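As a hedged sketch of how the EXACT option might sit next to a match-definition criterion, the step below clusters on a fuzzy name match AND an exact postal-code match. The data set WORK.CUSTOMERS and the variables NAME and POSTAL_CODE are hypothetical, and the ENUSA locale is assumed to be loaded.

```sas
/* Hypothetical input: WORK.CUSTOMERS with character variables
   NAME and POSTAL_CODE. Locale ENUSA is assumed to be loaded. */
proc dqmatch data=work.customers out=work.customers_exact
             cluster=cluster_id locale='ENUSA';
   /* Fuzzy match on the name through the Name match definition. */
   criteria matchdef='Name' var=name sensitivity=85;
   /* Exact character match on the postal code:
      no MATCHDEF=, MATCHCODE=, or SENSITIVITY= on this statement. */
   criteria exact var=postal_code;
run;
```

Neither statement specifies CONDITION=, so both fall under the default condition and are combined with a logical AND.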
MATCHDEF=<match-definition>
specifies the match-definition that is used to create the match-code for the specified variable.
Restriction: | The match-definition must exist in the locale that is specified in the LOCALE= option of the DQMATCH procedure. |
Restriction: | If you specify the MATCHDEF= option, you cannot specify the EXACT option. |
MATCHCODE=<output-character-variable>
specifies the name of the variable that receives the match-codes for the character variable that is specified in the VAR= option or the DELIMSTR= option.
Restriction: | The MATCHCODE= option is not valid if you also specify the MATCHCODE= option in the DQMATCH procedure. |
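Restriction: | If you are using multiple CRITERIA statements in a single procedure step, either specify the MATCHCODE= option in each CRITERIA statement, or specify the MATCHCODE= option only in the DQMATCH procedure to generate composite match-codes. |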
SENSITIVITY=<sensitivity-level>
determines the amount of information in the resulting match-codes. Higher sensitivity values create match-codes that contain more information about the input values, so they result in a greater number of clusters with fewer values in each cluster.
Default: | The default value is 85. |
Valid values: | Valid values range from 50 to 95. |
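To see the effect of the SENSITIVITY= option, one approach is to run the same criterion at both ends of the valid range and compare the cluster assignments. The sketch below assumes a hypothetical data set WORK.CUSTOMERS with a character variable NAME and an already loaded ENUSA locale.

```sas
/* Low sensitivity: match codes carry less information, so more values
   share a match code and clusters tend to be fewer and larger. */
proc dqmatch data=work.customers out=work.clusters_loose
             cluster=cluster_id locale='ENUSA';
   criteria matchdef='Name' var=name sensitivity=50;
run;

/* High sensitivity: match codes carry more information, so clusters
   tend to be more numerous and smaller. */
proc dqmatch data=work.customers out=work.clusters_strict
             cluster=cluster_id locale='ENUSA';
   criteria matchdef='Name' var=name sensitivity=95;
run;
```

Comparing the number of distinct CLUSTER_ID values in the two output data sets gives a rough sense of how much the sensitivity level changes the grouping for a given table.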
Details |
Match-codes are created for the input variables that are specified in each CRITERIA statement. The resulting match-codes are stored in the output variables that are named in the MATCHCODE= option. The MATCHCODE= option can be specified in the DQMATCH procedure or in the CRITERIA statement, but not in both.
Simple match-codes are created when the CRITERIA statements specify different values for their respective MATCHCODE= options. Composite match-codes are created when two or more CRITERIA statements specify the same value for their respective MATCHCODE= options.
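The difference between simple and composite match-codes can be sketched with two steps over the same input. The data set WORK.CUSTOMERS, its variables NAME and HOME_ADDR, and the output variable names are hypothetical, and the ENUSA locale is assumed to be loaded. The second step takes the other documented route to a composite match-code, naming the output variable once on the PROC DQMATCH statement.

```sas
/* Simple match codes: each CRITERIA statement names its own output variable. */
proc dqmatch data=work.customers out=work.mc_simple locale='ENUSA';
   criteria matchdef='Name'    var=name      matchcode=mc_name sensitivity=85;
   criteria matchdef='Address' var=home_addr matchcode=mc_addr sensitivity=85;
run;

/* Composite match code: MATCHCODE= is given once on the procedure statement,
   so the CRITERIA statements do not specify MATCHCODE= themselves. */
proc dqmatch data=work.customers out=work.mc_composite
             matchcode=mc_all locale='ENUSA';
   criteria matchdef='Name'    var=name      sensitivity=85;
   criteria matchdef='Address' var=home_addr sensitivity=85;
run;
```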
To create match-codes for a parsed character variable, specify the DELIMSTR= option instead of the VAR= option. In the MATCHDEF= option, be sure to specify the match-definition that is associated with the parse definition that was used to add delimiters to the character variable. To determine the parse definition that is associated with a match-definition, use the DQMATCHINFOGET function.
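A possible end-to-end flow for a parsed variable is sketched below. It assumes a hypothetical data set WORK.CUSTOMERS with a character variable NAME, an already loaded ENUSA locale, and that the Name match definition is associated with the Name parse definition in that locale (the first step is one way to check this).

```sas
/* Write the parse definition associated with the Name match definition
   to the log. */
data _null_;
   length parse_def $ 64;
   parse_def = dqMatchInfoGet('Name', 'ENUSA');
   put parse_def=;
run;

/* Parse the names; DQPARSE returns the value with delimiters between tokens. */
data work.customers_parsed;
   set work.customers;
   length parsed_name $ 200;
   parsed_name = dqParse(name, 'Name', 'ENUSA');
run;

/* Create match codes from the delimited variable: DELIMSTR= replaces VAR=. */
proc dqmatch data=work.customers_parsed out=work.mc_parsed locale='ENUSA';
   criteria matchdef='Name' delimstr=parsed_name
            matchcode=mc_name sensitivity=85;
run;
```

If the first step reports a different parse definition, that name would be the one to pass to DQPARSE in the second step.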
Copyright © 2010 by SAS Institute Inc., Cary, NC, USA. All rights reserved.