SAS Quality Knowledge Base for Contact Information

Address

Standardization Definition

Address
Description: The Address standardization definition standardizes addresses.
Examples:
Input: B/8 Kapol Soc Marve Rd Malad Mumbai
Output: B/8, Kapol Society, Marve Road, Malad, Mumbai
Input: B/8 Jimit Apt Marve Road Bangur Ngr Goregaon East
Output: B/8, Jimit Apartment, Marve Road, Bangur Nagar, Goregaon East
Remarks: This standardization definition performs context-specific transformations based on an internal parse of the input data string, so accurate standardization depends on a successful parse. However, the complexity of Indian addresses makes parsing difficult in many cases. Therefore, we recommend the following method for standardizing Indian addresses:

1. Parse address strings using the Address parse definition. Store the result code for the parse along with the parse output.

2. Separate the parsed output into two branches. One branch should contain the output for successful parses (result code OK). The other branch should contain the output for unsuccessful parses (result codes NO SOLUTION and ABANDONED).

3. Standardize the successfully parsed records using the Address standardization definition. (We recommend that you pass the pre-parsed token values to the standardization definition, which avoids a second, time-consuming internal parse.)

4. Standardize the records in the second branch using the Address (Generic) standardization definition.

5. Merge the standardized results from each branch.
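
The five steps above can be sketched as a small branching routine. This is an illustration of the control flow only, not SAS code: the three callables (parse, standardize_parsed, standardize_generic) are hypothetical stand-ins for the Address parse definition, the Address standardization definition, and the Address (Generic) standardization definition. The shape of the parse result, a (result code, token dictionary) pair, is also an assumption made for the sketch.

```python
from typing import Callable, Dict, List, Tuple

# Assumed shape of a parse result: (result code, token name -> token value).
ParseResult = Tuple[str, Dict[str, str]]


def standardize_addresses(
    addresses: List[str],
    parse: Callable[[str], ParseResult],                   # hypothetical: Address parse
    standardize_parsed: Callable[[Dict[str, str]], str],   # hypothetical: Address
    standardize_generic: Callable[[str], str],             # hypothetical: Address (Generic)
) -> List[str]:
    """Standardize addresses with the two-branch method, preserving input order."""
    # Step 1: parse every address and keep the result code with the tokens.
    parsed = [(index, text, parse(text)) for index, text in enumerate(addresses)]

    # Step 2: split into a successful branch (result code OK) and an
    # unsuccessful branch (NO SOLUTION, ABANDONED, or any other code).
    ok_branch = [(i, tokens) for i, _, (code, tokens) in parsed if code == "OK"]
    failed_branch = [(i, text) for i, text, (code, _) in parsed if code != "OK"]

    # Step 3: standardize successful parses from their pre-parsed tokens,
    # so no second internal parse is needed.
    results = {i: standardize_parsed(tokens) for i, tokens in ok_branch}

    # Step 4: standardize unsuccessful parses from the original string.
    results.update({i: standardize_generic(text) for i, text in failed_branch})

    # Step 5: merge both branches back into the original record order.
    return [results[i] for i in range(len(addresses))]
```

Keying each record by its input position is one way to make the final merge deterministic; in a data job the same role would typically be played by a record identifier carried through both branches.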