SAS Quality Knowledge Base for Contact Information 26
Definitions for the English, India locale are described below.
Case Definitions
Extraction Definitions
Gender Analysis Definitions
Identification Analysis Definitions
Match Definitions
Parse Definitions
Pattern Analysis Definitions
Standardization Definitions
Inherited Definitions
Proper (Address) | ||
---|---|---|
Description | The Proper (Address) case definition propercases addresses. | |
Examples | Input | Output |
405 SAMEDH TOWER SHAM NAGAR NR ADARASH PETROL PUMP | 405 Samedh Tower Sham Nagar nr Adarash Petrol Pump | |
30 OMKAR HOUSE C G ROAD NAVRANGPURA | 30 Omkar House C G Road Navrangpura | |
Remarks |
Proper (Name) | ||
---|---|---|
Description | The Proper (Name) case definition propercases names of individuals. | |
Examples | Input | Output |
AMRITLAL M DHAKA | Amritlal M Dhaka | |
pravin champalal bokadia | Pravin Champalal Bokadia | |
Remarks |
Extraction Definitions: None.
Name | ||
---|---|---|
Description | The Name gender analysis definition determines the gender of a name. | |
Possible Outputs | M, F, U | |
Examples | Input | Output |
Miss. Neha Patel | F | |
Vora Praveen | M | |
Dr. Pandey | U | |
Remarks |
Identification Analysis Definitions: None.
Address | ||
---|---|---|
Description | The Address match definition generates match codes which can be used to cluster records containing addresses. | |
Max Length of Match Code | 86 characters | |
Examples | Input | Cluster ID |
B 8 Kapol Society Marve Road Malad Mumbai | 0 | |
A 310 Kapol Nivas Marve Road Near Green Park Malad | 0 | |
JIMIT APARTMENT 2ND FLOOR NEAR KAPOL SOC MARVE RD ROHINI | 1 | |
IBM LTD JIMIT APARTMENT 4TH FLOOR MARVE RD NEAR KASTURI NAGAR ROHINI | 1 | |
Remarks | Execution time for matching Indian addresses is driven mainly by the parsing that this match definition performs internally. If your input is already parsed, you can save significant execution time by using the Match Codes (Parsed) node in DataFlux Data Management Studio. |
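The clustering that match codes make possible can be sketched as follows. This is a minimal illustration, not the QKB implementation: the match code strings are made-up placeholders (real codes are produced by the Address match definition), and `assign_cluster_ids` is a hypothetical helper. Records that share a match code receive the same cluster ID, as in the examples above.

```python
from typing import Dict, List, Tuple


def assign_cluster_ids(records: List[Tuple[str, str]]) -> List[Tuple[str, int]]:
    """Give records with identical match codes the same cluster ID.

    Each record is an (address, match_code) pair. Cluster IDs are
    assigned in order of first appearance, starting at 0.
    """
    ids: Dict[str, int] = {}
    clustered = []
    for address, match_code in records:
        if match_code not in ids:
            ids[match_code] = len(ids)
        clustered.append((address, ids[match_code]))
    return clustered


# "CODE_A" and "CODE_B" are placeholders, not real QKB match codes.
records = [
    ("B 8 Kapol Society Marve Road Malad Mumbai", "CODE_A"),
    ("A 310 Kapol Nivas Marve Road Near Green Park Malad", "CODE_A"),
    ("JIMIT APARTMENT 2ND FLOOR NEAR KAPOL SOC MARVE RD ROHINI", "CODE_B"),
]
clustered = assign_cluster_ids(records)
```

Because grouping only compares match codes, the expensive work is generating the codes, which is consistent with the remark about parsing time above.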
City | ||
---|---|---|
Description | The City match definition generates match codes which can be used to cluster records containing city names. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
Mumbai | 0 | |
Bombay | 0 | |
Ambala Sadar | 1 | |
Remarks |
|
City - State/Province - Postal Code | ||
---|---|---|
Description | The City - State/Province - Postal Code match definition generates match codes which can be used to cluster records containing last line address information. | |
Max Length of Match Code | 22 characters | |
Examples | Input | Cluster ID |
Buria Pin 110025 Mizoram | 0 | |
Mizoram Buria - 110025 | 0 | |
Charkhi Dadri Nagaland - 110026 | 1 | |
Remarks |
|
Name | ||
---|---|---|
Description | The Name match definition generates match codes which can be used to cluster records containing names of individuals. | |
Max Length of Match Code | 20 characters | |
Examples | Input | Cluster ID |
Navneet Nischal | 0 | |
Mr. Navnit Nischal, Dir | 0 | |
Deepak Parekh | 1 | |
Remarks |
|
Name (with Extensions) | ||
---|---|---|
Description | The Name (with Extensions) match definition generates match codes which can be used to cluster records containing names of individuals. | |
Max Length of Match Code | 50 characters | |
Examples | Input | Cluster ID |
V V S Laxman | 0 | |
Mr. V V P D Laxxman ,CEO | 0 | |
Mr. A V D Rajendra | 1 | |
Remarks |
|
Phone | ||
---|---|---|
Description | The Phone match definition generates match codes which can be used to cluster records containing phone numbers. | |
Max Length of Match Code | 16 characters | |
Examples | Input | Cluster ID |
MOB-9869000595 | 0 | |
98690-00595 | 0 | |
2871565 | 1 | |
Remarks |
|
PinCode | ||
---|---|---|
Description | The PinCode match definition generates match codes which can be used to cluster records containing postal codes. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
PIN 110035 | 0 | |
110035 | 0 | |
110018 | 1 | |
Remarks |
|
Postal Code | ||
---|---|---|
Description | The Postal Code match definition generates match codes which can be used to cluster records containing postal codes. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
PIN 110035 | 0 | |
PIN 110035 | 0 | |
110018 | 1 | |
Remarks | This definition behaves the same as PinCode. Because a definition named Postal Code is implemented in every locale, you can invoke postal code matching across locales with a single definition call. |
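The reason for keeping both names can be illustrated with a small lookup. This is a sketch under stated assumptions: the registry below is hypothetical, only the `ENIND` entries come from this page (the locale label is taken from the Doc ID), and `OTHER` is a placeholder for any other locale.

```python
# Hypothetical registry of match definition names per locale.
# Only the ENIND entries are taken from this page; "OTHER" is a placeholder.
MATCH_DEFINITIONS = {
    "ENIND": {"PinCode", "Postal Code"},
    "OTHER": {"Postal Code"},
}


def supported_everywhere(definition: str) -> bool:
    """Return True if every locale in the registry exposes the definition."""
    return all(definition in names for names in MATCH_DEFINITIONS.values())


postal_ok = supported_everywhere("Postal Code")
pincode_ok = supported_everywhere("PinCode")
```

In this sketch a job that spans locales can safely request `Postal Code`, whereas `PinCode` resolves only for the English, India locale.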
State | ||
---|---|---|
Description | The State match definition generates match codes which can be used to cluster records containing names of states. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
J & K | 0 | |
Jammu and Kashir | 0 | |
Andaman | 1 | |
Andaman & Nicobar | 1 | |
Remarks |
|
Address | |||
---|---|---|---|
Description | The Address parse definition parses addresses into a set of tokens. | ||
Output Tokens | Extension, Building/Wing Number, Building Name, Plot, Phase, Street, Sector, Part, Neighborhood, At Post, Locality, Landmark, Scheme, Layout, Survey, Cross, Main, Stage, Block, Tehsil, Additional Info | ||
Example 1 | Input | Output Token | Output |
402 Shalimar Manzil Layout No 2 Survey No 3 Tehsil Manas Thane | Extension | 402 | |
Building/Wing Number | |||
Building Name | Shalimar Manzil | ||
Plot | |||
Phase | |||
Street | |||
Sector | |||
Part | |||
Neighborhood | |||
At Post | |||
Locality | |||
Landmark | |||
Scheme | |||
Layout | Layout No 2 | ||
Survey | Survey No 3 | ||
Cross | |||
Main | |||
Stage | |||
Block | |||
Tehsil | Tehsil Manas | ||
Additional Info | Thane | ||
Example 2 | Input | Output Token | Output |
Shah Associates Ltd B/8/1 A Wing Anand Dham Soc Sector 4 Part No 2 12 A Main Survey No 1 Layout No 12 Shankar Rd Malad West Mumbai Maharashtra 400064 India | Extension | B/8/1 | |
Building/Wing Number | A Wing | ||
Building Name | Anand Dham Soc | ||
Plot | |||
Phase | |||
Street | Shankar Rd | ||
Sector | Sector 4 | ||
Part | Part No 2 | ||
Neighborhood | |||
At Post | |||
Locality | Malad West | ||
Landmark | |||
Scheme | |||
Layout | Layout No 12 | ||
Survey | Survey No 1 | ||
Cross | |||
Main | 12 A Main | ||
Stage | |||
Block | |||
Tehsil | |||
Additional Info | Shah Associates Ltd Mumbai Maharashtra 400064 India | ||
Example 3 | Input | Output Token | Output |
TCS Ltd Third Floor Amardeep Plaza 12 Stage Scheme No - 12 Pocket Z Block 12 12 A Cross Opp Sai Temple Kala Nagar Bangalore Karnataka India | Extension | Third Floor | |
Building/Wing Number | |||
Building Name | Amardeep Plaza | ||
Plot | |||
Phase | |||
Street | |||
Sector | |||
Part | |||
Neighborhood | Kala Nagar | ||
At Post | |||
Locality | |||
Landmark | Opp Sai Temple | ||
Scheme | Scheme No - 12 Pocket Z | ||
Layout | |||
Survey | |||
Cross | 12 A Cross | ||
Main | |||
Stage | 12 Stage | ||
Block | Block 12 | ||
Tehsil | |||
Additional Info | TCS Ltd Bangalore Karnataka India | ||
Remarks |
|
Address (Full) | |||
---|---|---|---|
Description | The Address (Full) parse definition parses addresses containing complete two-line addresses into a set of tokens. | ||
Output Tokens | Extension, Building/Wing Number, Building Name, Plot, Phase, Street, Sector, Part, Neighborhood, At Post, Locality, Landmark, Scheme, Layout, Survey, Cross, Main, Stage, Block, Tehsil, City, Pincode, State, Additional Info | ||
Example 1 | Input | Output Token | Output |
B-8 Kapol Soc Marve Road Malad West Mumbai Maharashtra 400060 | Extension | B-8 | |
Building/Wing Number | |||
Building Name | Kapol Soc | ||
Plot | |||
Phase | |||
Street | Marve Road | ||
Sector | |||
Part | |||
Neighborhood | |||
At Post | |||
Locality | Malad West | ||
Landmark | |||
Scheme | |||
Layout | |||
Survey | |||
Cross | |||
Main | |||
Stage | |||
Block | |||
Tehsil | |||
City | Mumbai | ||
Pincode | 400060 | ||
State | Maharashtra | ||
Additional Info | |||
Example 2 | Input | Output Token | Output |
A/1212 Saraswati CHS Plot No 1 23 rd Phase Sector No 12 Mahakali Road Govind Nagar Kandivali Mumbai 400064 Maharashtra India | Extension | A/1212 | |
Building/Wing Number | |||
Building Name | Saraswati CHS | ||
Plot | Plot No 1 | ||
Phase | 23 rd Phase | ||
Street | Mahakali Road | ||
Sector | Sector No 12 | ||
Part | |||
Neighborhood | Govind Nagar | ||
At Post | |||
Locality | Kandivali | ||
Landmark | |||
Scheme | |||
Layout | |||
Survey | |||
Cross | |||
Main | |||
Stage | |||
Block | |||
Tehsil | |||
City | Mumbai | ||
Pincode | 400064 | ||
State | MAHARASHTRA | ||
Additional Info | India | ||
Example 3 | Input | Output Token | Output |
C/1 Appejay House Plot No 12 Jyoti Scheme Karvaya Layout Pocket 12 Survey No 1 Nr K C College Mumbai 400064 | Extension | C/1 | |
Building/Wing Number | |||
Building Name | Appejay House | ||
Plot | Plot No 12 | ||
Phase | |||
Street | |||
Sector | |||
Part | |||
Neighborhood | |||
At Post | |||
Locality | |||
Landmark | Nr K C College | ||
Scheme | Jyoti Scheme Pocket 12 | ||
Layout | Karvaya Layout | ||
Survey | Survey No 1 | ||
Cross | |||
Main | |||
Stage | |||
Block | |||
Tehsil | |||
City | Mumbai | ||
Pincode | 400064 | ||
State | |||
Additional Info | |||
Remarks |
|
Address (Global) | |||
---|---|---|---|
Description | The Address (Global) parse definition parses addresses into a globally recognized set of tokens. | ||
Output Tokens | Recipient, Building/Site, Street, Extension, PO Box, Additional Info | ||
Example 1 | Input | Output Token | Output |
1051 3RD FLOOR MAHAVIR BHAWAN,SITA RAM NAGAR,NEW DELHI 200021 | Recipient | ||
Building/Site | MAHAVIR BHAWAN | ||
Street | SITA RAM NAGAR | ||
Extension | 1051 3RD FLOOR | ||
PO Box | |||
Additional Info | NEW DELHI 200021 | ||
Example 2 | Input | Output Token | Output |
61 SAWYAM SIDHA COLONY WEST AVEVUE ROAD ,NEW DELHI | Recipient | ||
Building/Site | SAWYAM SIDHA COLONY | ||
Street | WEST AVEVUE ROAD | ||
Extension | 61 | ||
PO Box | |||
Additional Info | NEW DELHI | ||
Example 3 | Input | Output Token | Output |
D-54 IST FLOOR, BATLA HOUSE, OKHLA ,NEW DELHI | Recipient | ||
Building/Site | BATLA HOUSE | ||
Street | OKHLA | ||
Extension | D-54 IST FLOOR | ||
PO Box | |||
Additional Info | NEW DELHI | ||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
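Because the Global token set is fixed, one table layout can hold results from any locale. The following is a minimal sketch of that idea, assuming an in-memory SQLite table; the table name `parsed_address`, the `store` helper, and the `ENIND` locale label (taken from the Doc ID) are illustrative choices, not part of the QKB. The token values are copied from Example 1 and Example 3 above.

```python
import sqlite3

# Column names mirror the Address (Global) output tokens listed above.
GLOBAL_ADDRESS_TOKENS = [
    "Recipient", "Building/Site", "Street",
    "Extension", "PO Box", "Additional Info",
]

conn = sqlite3.connect(":memory:")
column_sql = ", ".join(f'"{token}" TEXT' for token in GLOBAL_ADDRESS_TOKENS)
conn.execute(f"CREATE TABLE parsed_address (locale TEXT, {column_sql})")


def store(locale, tokens):
    """Insert one parse result; tokens absent from the dict are stored empty."""
    values = [locale] + [tokens.get(token, "") for token in GLOBAL_ADDRESS_TOKENS]
    placeholders = ", ".join("?" for _ in values)
    conn.execute(f"INSERT INTO parsed_address VALUES ({placeholders})", values)


store("ENIND", {
    "Building/Site": "MAHAVIR BHAWAN",
    "Street": "SITA RAM NAGAR",
    "Extension": "1051 3RD FLOOR",
    "Additional Info": "NEW DELHI 200021",
})
store("ENIND", {
    "Building/Site": "BATLA HOUSE",
    "Street": "OKHLA",
    "Extension": "D-54 IST FLOOR",
    "Additional Info": "NEW DELHI",
})

rows = conn.execute(
    'SELECT "Building/Site", "Street" FROM parsed_address ORDER BY rowid'
).fetchall()
```

Results from another locale's Address (Global) definition could be stored through the same `store` call, since the column set would not change.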
City - State/Province - Postal Code | |||
---|---|---|---|
Description | The City - State/Province - Postal Code parse definition parses last line address information into a set of tokens. | ||
Output Tokens | City, PinCode, State/Union Territory | ||
Example | Input | Output Token | Output |
Buria Mizoram - 110025 | City | Buria | |
PinCode | 110025 | ||
State/Union Territory | Mizoram | ||
Remarks |
City - State/Province - Postal Code (Global) | |||
---|---|---|---|
Description | The City - State/Province - Postal Code (Global) parse definition parses last line address information into a globally recognized set of tokens. | ||
Output Tokens | City, State/Province, Postal Code, Additional Info | ||
Example | Input | Output Token | Output |
Buria Mizoram - 110025 | City | Buria | |
State/Province | Mizoram | ||
Postal Code | 110025 | ||
Additional Info | |||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Name | |||
---|---|---|---|
Description | The Name parse definition parses names of individuals into a set of tokens. | ||
Output Tokens | Prefix, Given Name, Middle Name, Family Name, Suffix, Title/Additional Info | ||
Example 1 | Input | Output Token | Output |
Mrs. Asha A Ghelani | Prefix | Mrs. | |
Given Name | Asha | ||
Middle Name | A | ||
Family Name | Ghelani | ||
Suffix | |||
Title/Additional Info | |||
Example 2 | Input | Output Token | Output |
Jason Santosh Rebello,CTO Accenture India | Prefix | ||
Given Name | Jason | ||
Middle Name | Santosh | ||
Family Name | Rebello | ||
Suffix | |||
Title/Additional Info | CTO Accenture India | ||
Example 3 | Input | Output Token | Output |
Mrs. Tina Peter D'mello Jr. | Prefix | Mrs. | |
Given Name | Tina | ||
Middle Name | Peter | ||
Family Name | D'mello | ||
Suffix | Jr. | ||
Title/Additional Info | |||
Remarks |
Name (Matching) | |||
---|---|---|---|
Description | The Name (Matching) parse definition parses names of individuals into a set of tokens. | ||
Output Tokens | Prefix, Given Name, Middle Name, Family Name, Suffix, Title/Additional Info | ||
Example 1 | Input | Output Token | Output |
Mrs. Asha A Ghelani | Prefix | Mrs. | |
Given Name | Asha A | ||
Middle Name | |||
Family Name | Ghelani | ||
Suffix | |||
Title/Additional Info | |||
Example 2 | Input | Output Token | Output |
Jason Santosh Rebello,CTO Accenture India | Prefix | ||
Given Name | Jason Santosh | ||
Middle Name | |||
Family Name | Rebello | ||
Suffix | |||
Title/Additional Info | CTO Accenture India | ||
Example 3 | Input | Output Token | Output |
Mrs. Tina Peter D'mello Jr. | Prefix | Mrs. | |
Given Name | Tina Peter | ||
Middle Name | |||
Family Name | D'mello | ||
Suffix | Jr. | ||
Title/Additional Info | |||
Remarks | This definition is currently used by the Name match definition. All given names are parsed into the Given Name token; the Middle Name token is never used. The Name (Matching) parse definition is deprecated and no longer supported, and it will be removed in a future release of the QKB. | ||
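The difference between the Name and Name (Matching) token layouts can be shown by folding one into the other. This is only an illustration of how the two example outputs relate, not a description of how the QKB parses names; `to_matching_layout` is a hypothetical helper, and the input tokens are copied from Example 1 of the Name parse definition.

```python
from typing import Dict

NAME_TOKENS = [
    "Prefix", "Given Name", "Middle Name",
    "Family Name", "Suffix", "Title/Additional Info",
]


def to_matching_layout(name_tokens: Dict[str, str]) -> Dict[str, str]:
    """Fold Middle Name into Given Name, leaving Middle Name empty."""
    result = {token: name_tokens.get(token, "") for token in NAME_TOKENS}
    given = " ".join(
        part for part in (result["Given Name"], result["Middle Name"]) if part
    )
    result["Given Name"] = given
    result["Middle Name"] = ""
    return result


# Tokens from Example 1 of the Name parse definition.
parsed = {
    "Prefix": "Mrs.",
    "Given Name": "Asha",
    "Middle Name": "A",
    "Family Name": "Ghelani",
}
matching = to_matching_layout(parsed)
```

The result has "Asha A" in Given Name and an empty Middle Name, which agrees with Example 1 of the Name (Matching) table.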
Name (with Extensions) | |||
---|---|---|---|
Description | The Name (with Extensions) parse definition parses names of individuals into a set of tokens. | ||
Output Tokens | Prefix, Extension Initial 1, Extension Initial 2, Extension Initial 3, Extension Initial 4, Given Name, Given Name Extension, Middle Name, Middle Name Extension, Family Name, Suffix, Title/Additional Info | ||
Example 1 | Input | Output Token | Output |
Mr. Yogesh Kumar Mahesh Kumar Chabbaria, President | Prefix | Mr. | |
Extension Initial 1 | |||
Extension Initial 2 | |||
Extension Initial 3 | |||
Extension Initial 4 | |||
Given Name | Yogesh | ||
Given Name Extension | Kumar | ||
Middle Name | Mahesh | ||
Middle Name Extension | Kumar | ||
Family Name | Chabbaria | ||
Suffix | |||
Title/Additional Info | President | ||
Example 2 | Input | Output Token | Output |
Mrs. F D Devi Pratap Rajput | Prefix | Mrs. | |
Extension Initial 1 | F | ||
Extension Initial 2 | D | ||
Extension Initial 3 | |||
Extension Initial 4 | |||
Given Name | Devi | ||
Given Name Extension | |||
Middle Name | Pratap | ||
Middle Name Extension | |||
Family Name | Rajput | ||
Suffix | |||
Title/Additional Info | |||
Example 3 | Input | Output Token | Output |
Mr. V V S Laxman Narayan Nayar, CEO ICICI | Prefix | Mr. | |
Extension Initial 1 | V | ||
Extension Initial 2 | V | ||
Extension Initial 3 | S | ||
Extension Initial 4 | |||
Given Name | Laxman | ||
Given Name Extension | |||
Middle Name | Narayan | ||
Middle Name Extension | |||
Family Name | Nayar | ||
Suffix | |||
Title/Additional Info | CEO ICICI | ||
Remarks |
|
Name (Global) | |||
---|---|---|---|
Description | The Name (Global) parse definition parses names of individuals into a globally recognized set of tokens. | ||
Output Tokens | Prefix, Given Name, Middle Name, Family Name, Suffix, Title/Additional Info | ||
Example 1 | Input | Output Token | Output |
Mr. Dinkar Sathe | Prefix | Mr. | |
Given Name | Dinkar | ||
Middle Name | |||
Family Name | Sathe | ||
Suffix | |||
Title/Additional Info | |||
Example 2 | Input | Output Token | Output |
Mr. Deepak Keshavji Chheda | Prefix | Mr. | |
Given Name | Deepak | ||
Middle Name | Keshavji | ||
Family Name | Chheda | ||
Suffix | |||
Title/Additional Info | |||
Example 3 | Input | Output Token | Output |
Mrs. Dyna Peter D'souza, Vice President | Prefix | Mrs. | |
Given Name | Dyna | ||
Middle Name | Peter | ||
Family Name | D'souza | ||
Suffix | |||
Title/Additional Info | Vice President | ||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Phone | |||
---|---|---|---|
Description | The Phone parse definition parses phone numbers into a set of tokens. | ||
Output Tokens | Prefix, Country Code, Area Code, Base Number/Cell Number, Extension Word, Extension, Suffix, Additional Info | ||
Example 1 | Input | Output Token | Output |
+91-2027290597 x 211 | Prefix | ||
Country Code | +91 | ||
Area Code | 20 | ||
Base Number/Cell Number | 27290597 | ||
Extension Word | x | ||
Extension | 211 | ||
Suffix | |||
Additional Info | |||
Example 2 | Input | Output Token | Output |
6422381/6431746 | Prefix | ||
Country Code | |||
Area Code | |||
Base Number/Cell Number | 6422381 | ||
Extension Word | |||
Extension | |||
Suffix | |||
Additional Info | 6431746 | ||
Remarks |
Phone (Global) | |||
---|---|---|---|
Description | The Phone (Global) parse definition parses phone numbers into a globally recognized set of tokens. | ||
Output Tokens | Country Code, Area Code, Base Number, Extension, Line Type, Additional Info | ||
Example | Input | Output Token | Output |
+91-2027290597 x 211 | Country Code | +91 | |
Area Code | 20 | ||
Base Number | 27290597 | ||
Extension | 211 | ||
Line Type | |||
Additional Info | |||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Pattern Analysis Definitions: None.
Address (Generic) | ||
---|---|---|
Description | The Address (Generic) standardization definition standardizes addresses. | |
Examples | Input | Output |
B/8 Kapol Soc Marve Rd Malad Mumbai | B/8 Kapol Society Marve Road Malad Mumbai | |
B/8 Jimit Apt Marve Road Bangur Ngr Goregaon East | B/8 Jimit Apartment Marve Road Bangur Nagar Goregaon East | |
Remarks | The Address (Generic) standardization definition performs only simple transformations; it does not attempt complex transformations based on internal parsing. It is recommended for standardizing addresses that cannot be parsed successfully. For more information, refer to the Remarks section of the Address standardization definition. If changing the data before parsing is appropriate, you may get better parse results by standardizing the data with this definition before parsing. |
City | ||
---|---|---|
Description | The City standardization definition standardizes city names. | |
Examples | Input | Output |
Bombay | Mumbai | |
Calcutta | Kolkatta | |
Remarks | Common city abbreviations are expanded into full names. |
City - State/Province - Postal Code | ||
---|---|---|
Description | The City - State/Province - Postal Code standardization definition standardizes last line address information. | |
Examples | Input | Output |
Faridabad Uttaranchal - 110034 | Faridabad, 110034, Uttaranchal | |
Remarks |
Name | ||
---|---|---|
Description | The Name standardization definition standardizes names of individuals. | |
Examples | Input | Output |
CEO Nishit V Shah | Nishit V Shah, CEO | |
Professor Ranjan Verma | Prof Ranjan Verma | |
Remarks |
Phone | ||
---|---|---|
Description | The Phone standardization definition standardizes phone numbers for domestic use. | |
Examples | Input | Output |
91 22-6983000 | +91-22-6983000 | |
911452572 ext. 45 | +91-14-52572 X 45 | |
Remarks |
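The output layout seen in the two Phone examples above can be sketched as a simple join of tokens. This is an assumption-laden illustration: `format_phone` is a hypothetical helper, and the split of each input into country code, area code, base number, and extension is inferred from the example outputs rather than documented behavior.

```python
def format_phone(country_code: str, area_code: str, base_number: str,
                 extension: str = "") -> str:
    """Join non-empty phone tokens with hyphens and append any extension.

    The "X" extension marker follows the second example output above.
    """
    parts = [part for part in (country_code, area_code, base_number) if part]
    formatted = "-".join(parts)
    if extension:
        formatted += f" X {extension}"
    return formatted


# Token splits below are inferred from the example outputs, not from the QKB.
first = format_phone("+91", "22", "6983000")
second = format_phone("+91", "14", "52572", "45")
```

With these inferred tokens, `first` and `second` reproduce the two documented outputs.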
PinCode | ||
---|---|---|
Description | The PinCode standardization definition standardizes postal codes. | |
Example | Input | Output |
PO411006 | PO 411006 | |
Remarks |
Postal Code | ||
---|---|---|
Description | The Postal Code standardization definition standardizes postal codes. | |
Example | Input | Output |
PO411006 | PO 411006 | |
Remarks | This definition behaves the same as PinCode. Because a definition named Postal Code is implemented in every locale, you can invoke postal code standardization across locales with a single definition call. |
State | ||
---|---|---|
Description | The State standardization definition standardizes state and union territory names. | |
Examples | Input | Output |
A.P. | Andhra Pradesh | |
nicobar | Andaman and Nicobar Islands | |
Remarks |
In addition to the definitions listed on this page, the English, India locale also inherits all definitions for the English language and all Global definitions.
Doc ID: QKBCI_ENIND_defs.html |