DQMATCH Procedure

Overview: DQMATCH Procedure

What Does the DQMATCH Procedure Do?

PROC DQMATCH creates match codes as a basis for standardization or transformation. The match codes reflect the relative similarity of data values. Match codes are created based on a specified match definition in a specified locale. The match codes are written to an output SAS data set. Values that generate the same match codes are candidates for transformation or standardization.
The DQMATCH procedure can generate cluster numbers for input values that generate identical match codes. Cluster numbers are not assigned to input values that generate unique match codes. Input values that generate a unique match code (no cluster number) can be excluded from the output data set. Blank values can be retained in the output data set, and they can receive a cluster number.
A specified sensitivity level determines the amount of information in the match codes. The amount of information in the match code determines the number of clusters and the number of entries in each cluster. Higher sensitivity levels produce fewer clusters, with fewer entries per cluster. Use higher sensitivity levels when you need matches that are more exact. Use lower sensitivity levels to sort data into general categories or to capture all values that use different spellings to convey the same information.
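The following sketch shows one way that the behavior described above might be explored. It is an illustration under stated assumptions, not a reference program: the data set names (WORK.CUSTOMERS, WORK.MATCHED_HIGH, WORK.MATCHED_LOW), the variable names (NAME, MATCH_CD, CLUSTER_NO), the sample values, and the sensitivity levels 85 and 50 are all hypothetical. The sketch also assumes that the ENUSA locale is installed and that it provides a match definition called Name. The setup path in the %DQLOAD call is a placeholder that must be replaced with the location used at your site.

```sas
/* Load the ENUSA locale into memory.                           */
/* ASSUMPTION: the path below is a placeholder, not a real      */
/* location. Substitute the setup location for your site.       */
%dqload(dqlocale=(ENUSA), dqsetuploc='/path/to/your/setup/location');

/* Hypothetical input data: several spellings of similar names. */
data work.customers;
   length name $ 40;
   infile datalines truncover;
   input name $char40.;
   datalines;
Robert Smith
Bob Smith
Mr. Robert Smith
Roberta Smythe
Alice Jones
;
run;

/* Higher sensitivity: match codes carry more information, so   */
/* only closely matching values share a code and a cluster.     */
proc dqmatch data=work.customers out=work.matched_high
             matchcode=match_cd cluster=cluster_no
             locale='ENUSA';
   criteria matchdef='Name' var=name sensitivity=85;
run;

/* Lower sensitivity: match codes carry less information, so    */
/* looser spelling variants can share a code. CLUSTERS_ONLY     */
/* excludes values that generate a unique match code.           */
proc dqmatch data=work.customers out=work.matched_low
             matchcode=match_cd cluster=cluster_no
             clusters_only
             locale='ENUSA';
   criteria matchdef='Name' var=name sensitivity=50;
run;

/* Print both results to compare the match codes and clusters.  */
proc print data=work.matched_high; run;
proc print data=work.matched_low;  run;
```

Comparing the two output data sets side by side is a practical way to choose a sensitivity level for a given column. The actual match codes and cluster assignments depend on the installed locale and match definition, so no expected output is shown here.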