The DQMATCH Procedure
What Does the DQMATCH Procedure Do?
PROC DQMATCH creates match-codes as a basis for standardization or transformation. The match-codes reflect the relative similarity of data values. Match-codes are created based on a specified match definition in a specified locale. The match-codes are written to an output SAS data set. Values that generate the same match-codes are candidates for transformation or standardization.
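As a minimal sketch of match-code creation, the following program writes one match-code for each value of a single character variable. The data set names (WORK.CUSTOMERS, WORK.CUSTOMERS_MC), the variable names, and the choice of the 'Name' match definition in the ENUSA locale are assumptions made for illustration. The sketch also assumes that the locale has already been loaded into memory at your site (for example, with the %DQLOAD autocall macro).

```sas
/* Hypothetical input data: one name per row. */
data work.customers;
   length name $ 30;
   input name $char30.;
   datalines;
Bob Beckett
Robert E. Beckett
Rob Beckett
Paul Becker
;
run;

/* Create a match-code for each value of NAME.               */
/* MATCHCODE= names the output variable that receives it.    */
/* Assumes the ENUSA locale is already loaded.               */
proc dqmatch data=work.customers
             out=work.customers_mc
             matchcode=match_cd
             locale='ENUSA';
   criteria matchdef='Name' var=name;
run;

/* Rows that share a MATCH_CD value are candidates for standardization. */
proc print data=work.customers_mc;
run;
```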
The DQMATCH procedure can generate cluster numbers for input values that generate identical match-codes. Cluster numbers are not assigned to input values that generate unique match-codes. Input values that generate a unique match-code (and therefore have no cluster number) can be excluded from the output data set. Blank values can be retained in the output data set, and they can receive a cluster number.
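The cluster behavior described above might be requested as in the following sketch. The data set and variable names are hypothetical, and the 'Name' match definition and ENUSA locale are assumptions. The sketch uses the CLUSTER=, CLUSTERS_ONLY, and CLUSTER_BLANKS options of the PROC DQMATCH statement and assumes that the locale is already loaded.

```sas
/* Hypothetical input data: one name per row. */
data work.contacts;
   length name $ 30;
   input name $char30.;
   datalines;
Bob Beckett
Robert E. Beckett
Paul Becker
Paul Becker Jr.
Maria Lopez
;
run;

/* CLUSTER=       names the numeric output variable for cluster numbers.  */
/* CLUSTERS_ONLY  excludes values whose match-code is unique.             */
/* CLUSTER_BLANKS retains blank values in the output data set             */
/*                (NO_CLUSTER_BLANKS is the alternative).                 */
proc dqmatch data=work.contacts
             out=work.contacts_clustered
             matchcode=match_cd
             cluster=cluster_id
             clusters_only
             cluster_blanks
             locale='ENUSA';
   criteria matchdef='Name' var=name;
run;

/* List the output so that members of each cluster appear together. */
proc sort data=work.contacts_clustered;
   by cluster_id;
run;

proc print data=work.contacts_clustered;
run;
```

Omitting CLUSTERS_ONLY keeps the unclustered values in the output, which is useful when you want to review the values that did not match anything.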
A specified sensitivity level determines the amount of information in the match-codes. The amount of information in the match-codes determines the number of clusters and the number of entries in each cluster. Higher sensitivity levels produce fewer clusters, with fewer entries per cluster, because values must be more alike to generate the same match-code. Use higher sensitivity levels when you need matches that are more exact. Use lower sensitivity levels to sort data into general categories or to capture all values that use different spellings to convey the same information.
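One way to see the effect of sensitivity is to run the procedure twice on the same input and compare the clusters. The sketch below uses the SENSITIVITY= option of the CRITERIA statement with the values 95 and 50. The data set names, the variable names, and the 'Name' match definition in the ENUSA locale are assumptions for illustration, and the locale is assumed to be loaded already. No output is shown because the resulting clusters depend on the match definition installed at your site.

```sas
/* Hypothetical input data with spelling variations. */
data work.names;
   length name $ 30;
   input name $char30.;
   datalines;
Jon Smith
John Smith
John Smyth
Jonathan Smith
J. Smith
;
run;

/* High sensitivity: match-codes carry more information, */
/* so only very similar values share a match-code.       */
proc dqmatch data=work.names
             out=work.names_strict
             matchcode=match_cd
             cluster=cluster_id
             locale='ENUSA';
   criteria matchdef='Name' var=name sensitivity=95;
run;

/* Low sensitivity: match-codes carry less information,  */
/* so more spelling variants share a match-code.         */
proc dqmatch data=work.names
             out=work.names_loose
             matchcode=match_cd
             cluster=cluster_id
             locale='ENUSA';
   criteria matchdef='Name' var=name sensitivity=50;
run;

/* Compare the two results. */
proc print data=work.names_strict;
   title 'Sensitivity 95';
run;

proc print data=work.names_loose;
   title 'Sensitivity 50';
run;

title;
```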
Copyright © 2010 by SAS Institute Inc., Cary, NC, USA. All rights reserved.