SAS Quality Knowledge Base for Contact Information 25
Definitions for the English, United States locale are described below.
Case Definitions
Extraction Definitions
Gender Analysis Definitions
Identification Analysis Definitions
Match Definitions
Parse Definitions
Pattern Analysis Definitions
Standardization Definitions
Inherited Definitions
| Proper (Address) | | |
|---|---|---|
| Description | The Proper (Address) case definition propercases addresses. | |
| Examples | Input | Output |
| | 11TH FLOOR | 11th Floor |
| | po box 125 | PO Box 125 |
| Remarks | | |

| Proper (City - State/Province - Postal Code) | | |
|---|---|---|
| Description | The Proper (City - State/Province - Postal Code) case definition propercases last line address information. | |
| Example | Input | Output |
| | cary, nc 27513 | Cary, NC 27513 |
| Remarks | | |

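
For illustration only, the behavior shown in the two case definitions above can be approximated with a short Python sketch. The function name `proper_case` and the `UPPERCASE_TOKENS` set are hypothetical and are not part of the QKB; the actual definitions presumably rely on far richer vocabularies and rules.

```python
# Illustrative sketch only; this is not the QKB implementation.
# Tokens that stay fully uppercase (hypothetical, deliberately tiny list).
UPPERCASE_TOKENS = {"po", "nc"}


def proper_case(text: str) -> str:
    """Propercase each whitespace-separated token of an address string."""
    words = []
    for token in text.split():
        key = token.strip(",.").lower()
        if key in UPPERCASE_TOKENS:
            words.append(token.upper())
        else:
            # capitalize() lowercases the remainder, so "11TH" becomes "11th".
            words.append(token.capitalize())
    return " ".join(words)


print(proper_case("11TH FLOOR"))
print(proper_case("po box 125"))
print(proper_case("cary, nc 27513"))
```

The sketch reproduces the three documented examples, but any token outside the small uppercase list is simply capitalized.
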
| Contact Info | | | |
|---|---|---|---|
| Description | The Contact Info extraction definition extracts the Name, Organization, Address, E-mail, and Phone from a string. | | |
| Possible Outputs | Name, Organization, Address, Phone, Additional Info | | |
| Example | Input | Output | |
| | Mr John Smith, 100 SAS Campus Dr, PO Box 12345, Cary, NC 27513, 919-531-0000 | NAME | Mr John Smith |
| | | ORGANIZATION | |
| | | ADDRESS | 100 SAS Campus Dr, PO Box 12345, Cary, NC 27513 |
| | | PHONE | 919-531-0000 |
| | | ADDITIONAL INFO | ,; , |
| Remarks | | | |

None.
| Contact Info | | |
|---|---|---|
| Description | The Contact Info identification analysis identifies the contact information that is represented by a string. | |
| Possible Outputs | NAME, ORGANIZATION, PHONE, ADDRESS, BLANK, MIXED, UNKNOWN | |
| Examples | Input | Output |
| | SAS Institute | ORGANIZATION |
| | John Smith | NAME |
| | Joe Smith DBA Some Company Name | MIXED |
| | john.smith@sas.com | |
| | 919-531-0000 | PHONE |
| | 100 SAS Campus Dr, Cary, NC 27513 | ADDRESS |
| | | BLANK |
| | John Smith, 100 SAS Campus Dr, Cary, NC 27513, john.smith@sas.com | MIXED |
| | Fisher | UNKNOWN |
| Remarks | | |

| Phone (Validation) | | |
|---|---|---|
| Description | The Phone (Validation) identification analysis determines whether a string represents a valid phone number. | |
| Possible Outputs | VALID, INVALID | |
| Examples | Input | Output |
| | 919-447-3000 | VALID |
| | 888 8888 | VALID |
| | 888 888 8888 | INVALID |
| | 2468 | INVALID |
| Remarks | | |

| Address | | |
|---|---|---|
| Description | The Address match definition generates match codes which can be used to cluster records containing addresses. | |
| Max Length of Match Code | 44 characters | |
| Examples | Input | Cluster ID |
| | 52 Commerce Street | 0 |
| | 52 Commerce St | 0 |
| | 52 Comerce St | 0 |
| | 52 Commerce Street, PO Box 1234 | 1 |
| Remarks | Note: The results listed above reflect the default match sensitivity (85). | |

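
The Cluster ID column in the match definition examples groups inputs that receive the same match code. As a minimal sketch of that grouping step, the Python function below assigns cluster IDs in order of first appearance. The name `assign_cluster_ids` and the match code strings are hypothetical placeholders, not real QKB output; generating the codes themselves is the job of the match definition.

```python
# Illustrative sketch only: records with equal match codes share a cluster ID.
from typing import Dict, List


def assign_cluster_ids(match_codes: List[str]) -> List[int]:
    """Return one cluster ID per record, numbered by first appearance."""
    seen: Dict[str, int] = {}
    cluster_ids = []
    for code in match_codes:
        if code not in seen:
            seen[code] = len(seen)  # next unused cluster ID
        cluster_ids.append(seen[code])
    return cluster_ids


# Placeholder codes standing in for what a match definition might return
# for the four Address examples; the values are invented for illustration.
codes = ["CODE_A", "CODE_A", "CODE_A", "CODE_B"]
print(assign_cluster_ids(codes))
```

With these placeholder codes, the first three records fall into cluster 0 and the fourth into cluster 1, mirroring the layout of the example table.
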
| Address (Long) | | |
|---|---|---|
| Description | The Address (Long) match definition generates match codes which can be used to cluster records containing addresses. | |
| Max Length of Match Code | 20 characters | |
| Examples | Input | Cluster ID |
| | 420 Park Royale Rd | 0 |
| | 420 Park Royale Road | 0 |
| Remarks | Note: The results listed above reflect the default match sensitivity (85). The Address (Long) match definition is deprecated and will be removed in a future release of the QKB. Change your jobs to use the Address match definition instead. | |

| City | | |
|---|---|---|
| Description | The City match definition generates match codes which can be used to cluster records containing city names. | |
| Max Length of Match Code | 15 characters | |
| Examples | Input | Cluster ID |
| | Cary | 0 |
| | Carie | 0 |
| | Durham | 1 |
| Remarks | Note: The results listed above reflect the default match sensitivity (85). | |

| City - State/Province - Postal Code | | |
|---|---|---|
| Description | The City - State/Province - Postal Code match definition generates match codes which can be used to cluster records containing last line address information. | |
| Max Length of Match Code | 15 characters | |
| Examples | Input | Cluster ID |
| | Cary, NC 27653 | 0 |
| | Durham, NC 27713 | 1 |
| | Durham, North Carolina 27713 | 1 |
| Remarks | Note: The results listed above reflect the default match sensitivity (85). | |

| Phone | | |
|---|---|---|
| Description | The Phone match definition generates match codes which can be used to cluster records containing phone numbers. | |
| Max Length of Match Code | 22 characters | |
| Examples | Input | Cluster ID |
| | 1-800-DATAFLUX | 0 |
| | 1 (800) 328-2358 | 0 |
| | 345-6789 | 1 |
| | 345-6780 | 1 |
| | 345-6700 | 2 |
| | 447-3000 ext 1234 | 3 |
| | 447-3000 ext 1266 | 3 |
| Remarks | Note: The results listed above reflect the default match sensitivity (85). | |

| Postal Code | | |
|---|---|---|
| Description | The Postal Code match definition generates match codes which can be used to cluster records containing postal codes. | |
| Max Length of Match Code | 15 characters | |
| Examples | Input | Cluster ID |
| | 27653 | 0 |
| | 27653-0001 | 0 |
| | 27713 | 1 |
| Remarks | Note: The results listed above reflect the default match sensitivity (85). | |

| State/Province | | |
|---|---|---|
| Description | The State/Province match definition generates match codes which can be used to cluster records containing states and provinces. | |
| Max Length of Match Code | 15 characters | |
| Examples | Input | Cluster ID |
| | Florida | 0 |
| | FL | 0 |
| | North Carolina | 1 |
| | N. Carolina | 1 |
| Remarks | Note: The results listed above reflect the default match sensitivity (85). | |

| Text | | |
|---|---|---|
| Description | The Text match definition generates match codes which can be used to cluster records containing general text strings. | |
| Max Length of Match Code | 15 characters | |
| Examples | Input | Cluster ID |
| | they went | 0 |
| | you are | 1 |
| | you're | 1 |
| Remarks | Note: The results listed above reflect the default match sensitivity (85). | |

| Text (Long) | | |
|---|---|---|
| Description | The Text (Long) match definition generates match codes which can be used to cluster records containing longer general text strings. | |
| Max Length of Match Code | 30 characters | |
| Examples | Input | Cluster ID |
| | they went | 0 |
| | you are | 1 |
| | you're | 1 |
| Remarks | Note: The results listed above reflect the default match sensitivity (85). | |

| Address | | | |
|---|---|---|---|
| Description | The Address parse definition parses addresses into a set of tokens. | | |
| Output Tokens | Recipient, Building/Site, Street, Extension, PO Box, Additional Info | | |
| Example | Input | Output | |
| | Mr John Smith, Building T, Unit 102, 100 SAS Campus Dr, PO Box 12345 | Recipient | Mr John Smith |
| | | Building/Site | Building T |
| | | Street | 100 SAS Campus Dr |
| | | Extension | Unit 102 |
| | | PO Box | PO Box 12345 |
| | | Additional Info | |
| Remarks | | | |

| Address (Full) | | | |
|---|---|---|---|
| Description | The Address (Full) parse definition parses addresses containing complete two-line addresses into a set of tokens. | | |
| Output Tokens | Recipient, Building/Site, Street, Extension, PO Box, City, State/Province, Postal Code, Country, Additional Info | | |
| Example | Input | Output | |
| | Mr John Smith, Building T, Unit 102, 100 SAS Campus Dr, PO Box 12345, Cary, NC 27513 USA | Recipient | Mr John Smith |
| | | Building/Site | Building T |
| | | Street | 100 SAS Campus Dr |
| | | Extension | Unit 102 |
| | | PO Box | PO Box 12345 |
| | | City | Cary |
| | | State/Province | NC |
| | | Postal Code | 27513 |
| | | Country | USA |
| | | Additional Info | |
| Remarks | | | |

| Address (Global) | | | |
|---|---|---|---|
| Description | The Address (Global) parse definition parses addresses into a globally recognized set of tokens. | | |
| Output Tokens | Recipient, Building/Site, Street, Extension, PO Box, Additional Info | | |
| Example 1 | Input | Output | |
| | Mr John Smith, Building T, Unit 102, 100 SAS Campus Dr, PO Box 12345 | Recipient | Mr John Smith |
| | | Building/Site | Building T |
| | | Street | 100 SAS Campus Dr |
| | | Extension | Unit 102 |
| | | PO Box | PO Box 12345 |
| | | Additional Info | |
| Example 2 | Input | Output | |
| | 420 Park Ridge Rd | Recipient | |
| | | Building/Site | |
| | | Street | 420 Park Ridge Rd |
| | | Extension | |
| | | PO Box | |
| | | Additional Info | |
| Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. The Address (Global) (v23) parse definition is deprecated and will be removed in a future release of the QKB. The Address (Global) parse definition has been replaced with a copy of the Address (Global) (v23) definition, which takes advantage of the new tokens and updated processing. If you changed your jobs to use Address (Global) (v23), change them back to Address (Global). | | |

| Address (Global) (v23) | | | |
|---|---|---|---|
| Description | The Address (Global) (v23) parse definition parses addresses into a globally recognized set of tokens. | | |
| Output Tokens | Recipient, Building/Site, Street, Extension, PO Box, Additional Info | | |
| Example 1 | Input | Output | |
| | Mr John Smith, Building T, Unit 102, 100 SAS Campus Dr, PO Box 12345 | Recipient | Mr John Smith |
| | | Building/Site | Building T |
| | | Street | 100 SAS Campus Dr |
| | | Extension | Unit 102 |
| | | PO Box | PO Box 12345 |
| | | Additional Info | |
| Example 2 | Input | Output | |
| | 420 Park Ridge Rd | Recipient | |
| | | Building/Site | |
| | | Street | 420 Park Ridge Rd |
| | | Extension | |
| | | PO Box | |
| | | Additional Info | |
| Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. The Address (Global) (v23) parse definition is deprecated and will be removed in a future release of the QKB. The Address (Global) parse definition has been replaced with a copy of the Address (Global) (v23) definition, which takes advantage of the new tokens and updated processing. If you changed your jobs to use Address (Global) (v23), change them back to Address (Global). | | |

| City - State/Province - Postal Code | | | |
|---|---|---|---|
| Description | The City - State/Province - Postal Code parse definition parses last line address information into a set of tokens. | | |
| Output Tokens | City, State, ZIP | | |
| Example | Input | Output | |
| | Cary, NC 27653 | City | Cary |
| | | State | NC |
| | | ZIP | 27653 |
| Remarks | | | |

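
To make the token output of the parse definition above concrete, here is a rough Python sketch that splits a well-formed US last line into the City, State, and ZIP tokens. The function `parse_last_line` and its regular expression are assumptions made for illustration; the QKB parse definition is certainly more tolerant of variation than a single pattern.

```python
# Illustrative sketch only; handles just the "City, ST 12345" shape.
import re
from typing import Dict, Optional

# City up to the comma, a two-letter state, then a ZIP or ZIP+4.
_LAST_LINE = re.compile(
    r"^\s*(?P<city>[^,]+?)\s*,\s*(?P<state>[A-Za-z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)\s*$"
)


def parse_last_line(text: str) -> Optional[Dict[str, str]]:
    """Return the City/State/ZIP tokens, or None if the shape is not recognized."""
    match = _LAST_LINE.match(text)
    if match is None:
        return None
    return {
        "City": match.group("city"),
        "State": match.group("state"),
        "ZIP": match.group("zip"),
    }


print(parse_last_line("Cary, NC 27653"))
```

Inputs that do not follow this shape, such as a spelled-out state name, return `None` in the sketch rather than a best-effort parse.
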
| City - State/Province - Postal Code (Global) | | | |
|---|---|---|---|
| Description | The City - State/Province - Postal Code (Global) parse definition parses last line address information into a globally recognized set of tokens. | | |
| Output Tokens | City, State/Province, Postal Code, Additional Info | | |
| Example | Input | Output | |
| | Cary, NC 27653 | City | Cary |
| | | State/Province | NC |
| | | Postal Code | 27653 |
| | | Additional Info | |
| Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. | | |

| Name (Address Update) | | | |
|---|---|---|---|
| Description | The Name (Address Update) parse definition parses names of individuals into a set of tokens. | | |
| Output Tokens | Prefix, Given Name, Middle Name, Family Name, Suffix, Title/Additional Info | | |
| Example 1 | Input | Output | |
| | ANDY JEFFERSON | Prefix | |
| | | Given Name | ANDY |
| | | Middle Name | |
| | | Family Name | JEFFERSON |
| | | Suffix | |
| | | Title/Additional Info | |
| Example 2 | Input | Output | |
| | MR CHIP BALTZER | Prefix | MR |
| | | Given Name | CHIP |
| | | Middle Name | |
| | | Family Name | BALTZER |
| | | Suffix | |
| | | Title/Additional Info | |
| Example 3 | Input | Output | |
| | ANTONIO FULLEN I | Prefix | |
| | | Given Name | ANTONIO |
| | | Middle Name | |
| | | Family Name | FULLEN I |
| | | Suffix | |
| | | Title/Additional Info | |
| Remarks | This definition is intended for use only with the Address Update Lookup feature in the DataFlux Data Management Platform. | | |

| Name/Organization | | | |
|---|---|---|---|
| Description | The Name/Organization parse definition parses strings that contain the names of individuals and organizations into a set of tokens. | | |
| Output Tokens | Name, Organization | | |
| Example | Input | Output | |
| | Bob Brauer, DataFlux Corporation | Name | Bob Brauer |
| | | Organization | DataFlux Corporation |
| Remarks | | | |

| Phone | | | |
|---|---|---|---|
| Description | The Phone parse definition parses phone numbers into a set of tokens. | | |
| Output Tokens | Country Code, Area Code, Base Number, Extension, Line Type, Additional Info | | |
| Example 1 | Input | Output | |
| | Work: +1 (919) 447-3000 Ext 3118 (ask for John) | Country Code | +1 |
| | | Area Code | 919 |
| | | Base Number | 447-3000 |
| | | Extension | 3118 |
| | | Line Type | Work: |
| | | Additional Info | (ask for John) |
| Example 2 | Input | Output | |
| | 011 44 2012345000 | Country Code | 011 44 |
| | | Area Code | |
| | | Base Number | 2012345000 |
| | | Extension | |
| | | Line Type | |
| | | Additional Info | |
| Example 3 | Input | Output | |
| | Toll Free: (800) 888-4257 | Country Code | |
| | | Area Code | 800 |
| | | Base Number | 888-4257 |
| | | Extension | |
| | | Line Type | Toll Free: |
| | | Additional Info | |
| Example 4 | Input | Output | |
| | 310 424-1442 3118 (w) | Country Code | |
| | | Area Code | 310 |
| | | Base Number | 424-1442 |
| | | Extension | 3118 |
| | | Line Type | (w) |
| | | Additional Info | |
| Remarks | | | |

| Phone (Global) | | | |
|---|---|---|---|
| Description | The Phone (Global) parse definition parses phone numbers into a globally recognized set of tokens. | | |
| Output Tokens | Country Code, Area Code, Base Number, Extension, Line Type, Additional Info | | |
| Example 1 | Input | Output | |
| | Work: +1 (919) 447-3000 Ext 3118 (ask for John) | Country Code | +1 |
| | | Area Code | 919 |
| | | Base Number | 447-3000 |
| | | Extension | 3118 |
| | | Line Type | Work: |
| | | Additional Info | (ask for John) |
| Example 2 | Input | Output | |
| | 011 44 2012345000 | Country Code | 011 44 |
| | | Area Code | |
| | | Base Number | 2012345000 |
| | | Extension | |
| | | Line Type | |
| | | Additional Info | |
| Example 3 | Input | Output | |
| | Toll Free: (800) 888-4257 | Country Code | |
| | | Area Code | 800 |
| | | Base Number | 888-4257 |
| | | Extension | |
| | | Line Type | Toll Free: |
| | | Additional Info | |
| Example 4 | Input | Output | |
| | 310 424-1442 3118 (w) | Country Code | |
| | | Area Code | 310 |
| | | Base Number | 424-1442 |
| | | Extension | 3118 |
| | | Line Type | (w) |
| | | Additional Info | |
| Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. | | |

| Postal Code | | | |
|---|---|---|---|
| Description | The Postal Code parse definition parses ZIP codes into a set of tokens. | | |
| Output Tokens | ZIP, ZIP Add-on | | |
| Example | Input | Output | |
| | 27653-0001 | ZIP | 27653 |
| | | ZIP Add-on | 0001 |
| Remarks | | | |

None.
| Address | | |
|---|---|---|
| Description | The Address standardization definition standardizes addresses. | |
| Examples | Input | Output |
| | 52 Commerce Street | 52 Commerce St |
| | 5TH avenue | 5th Ave |
| Remarks | | |

| City | | |
|---|---|---|
| Description | The City standardization definition standardizes city names. | |
| Examples | Input | Output |
| | cary | Cary |
| | NY | New York |
| Remarks | Common city abbreviations are expanded into full names. | |

| City - State/Province - Postal Code | | |
|---|---|---|
| Description | The City - State/Province - Postal Code standardization definition standardizes last line address information. | |
| Example | Input | Output |
| | cary nc 27513 | Cary, NC 27513 |
| Remarks | | |

| Phone | | |
|---|---|---|
| Description | The Phone standardization definition standardizes phone numbers for domestic use. | |
| Examples | Input | Output |
| | +1 602 961-1317 | (602) 961 1317 |
| | 0044 (0)20 12345000 | +44 2012345000 |
| | 4159745060 | (415) 974 5060 |
| | (212) 987-7654 EXT 456 | (212) 987 7654 x456 |
| Remarks | | |

| Phone (Electronic) | | |
|---|---|---|
| Description | The Phone (Electronic) standardization definition standardizes phone numbers for automated calling systems. | |
| Examples | Input | Output |
| | Work: (301) 889-9200 (ask for Mary) | +13018899200 |
| | Fax: 602 9611317 | +16029611317 |
| | 0044 (0)20 12345000 | +442012345000 |
| | 612.736.4436 | +16127364436 |
| Remarks | | |

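
The Phone (Electronic) examples above suggest a digits-only output with a leading `+` and country code. The Python sketch below is one simplified way to arrive at the same four outputs. The function `to_electronic` and its rules (dropping a `(0)` trunk prefix, treating a leading `00` as an international prefix, and assuming country code 1 for 10-digit numbers) are assumptions for illustration, not the QKB's actual logic.

```python
# Illustrative sketch only; the rules below are simplifying assumptions.
import re


def to_electronic(phone: str) -> str:
    """Reduce a phone string to '+<country code><number>' where possible."""
    # Drop an optional trunk prefix written as "(0)", as in "0044 (0)20 ...".
    cleaned = phone.replace("(0)", "")
    # Keep digits only; labels such as "Work:" and comments are discarded.
    digits = re.sub(r"\D", "", cleaned)
    if digits.startswith("00"):
        # "00" international prefix: the country code follows it.
        return "+" + digits[2:]
    if len(digits) == 11 and digits.startswith("1"):
        return "+" + digits
    if len(digits) == 10:
        # Assume a US/Canada number without an explicit country code.
        return "+1" + digits
    return digits  # unrecognized shape: return the bare digits


for sample in [
    "Work: (301) 889-9200 (ask for Mary)",
    "Fax: 602 9611317",
    "0044 (0)20 12345000",
    "612.736.4436",
]:
    print(sample, "->", to_electronic(sample))
```

The sketch discards any digits that appear in free-text comments along with everything else non-numeric, so it is only reliable for inputs shaped like the documented examples.
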
| Phone (with Country Code) | | |
|---|---|---|
| Description | The Phone (with Country Code) standardization definition standardizes phone numbers for international use. | |
| Examples | Input | Output |
| | (212) 987-7654 | +1 212 987 7654 |
| | 1(303)5466306 | +1 303 546 6306 |
| | (414) 242-8202 - after 4pm | +1 414 242 8202, After 4PM |
| | 0044 (0)20 12345000 | +44 2012345000 |
| | 011-81-53-460-2871 | +81 534602871 |
| Remarks | | |

| Postal Code | | |
|---|---|---|
| Description | The Postal Code standardization definition standardizes postal codes. | |
| Examples | Input | Output |
| | 27653 - 0001 | 27653-0001 |
| | 27653 0001 | 27653-0001 |
| Remarks | | |

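
As a small illustration of the Postal Code standardization examples above, the following Python sketch normalizes the separator between a five-digit ZIP and its four-digit add-on. The function `standardize_postal_code` and its patterns are hypothetical and cover only US ZIP and ZIP+4 shapes.

```python
# Illustrative sketch only; covers US ZIP and ZIP+4 inputs.
import re

# Five digits, any run of spaces or hyphens, then four digits.
_ZIP_PLUS4 = re.compile(r"^\s*(\d{5})[\s-]*(\d{4})\s*$")
_ZIP5 = re.compile(r"^\s*(\d{5})\s*$")


def standardize_postal_code(value: str) -> str:
    """Return 'NNNNN-NNNN' or 'NNNNN'; other input is returned trimmed."""
    match = _ZIP_PLUS4.match(value)
    if match:
        return f"{match.group(1)}-{match.group(2)}"
    match = _ZIP5.match(value)
    if match:
        return match.group(1)
    return value.strip()


print(standardize_postal_code("27653 - 0001"))
print(standardize_postal_code("27653 0001"))
```
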
| State/Province (Abbreviation) | | |
|---|---|---|
| Description | The State/Province (Abbreviation) standardization definition standardizes state names. | |
| Examples | Input | Output |
| | nc | NC |
| | fla | FL |
| Remarks | | |

| State/Province (Full Name) | | |
|---|---|---|
| Description | The State/Province (Full Name) standardization definition standardizes complete state names. | |
| Examples | Input | Output |
| | n carolina | North Carolina |
| | fla | Florida |
| Remarks | | |

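
The two State/Province standardization definitions above map variant spellings to one canonical form. A lookup table is one simple way to picture this, sketched below in Python. The alias table, the helper names, and the two-state coverage are all hypothetical and exist only to reproduce the documented examples.

```python
# Illustrative sketch only; the alias table is a tiny hypothetical sample.
from typing import Optional

_FULL_NAMES = {"NC": "North Carolina", "FL": "Florida"}
_ALIASES = {
    "nc": "NC", "n carolina": "NC", "north carolina": "NC",
    "fl": "FL", "fla": "FL", "florida": "FL",
}


def _lookup(value: str) -> Optional[str]:
    """Normalize case, periods, and spacing, then look up the state code."""
    key = " ".join(value.replace(".", " ").lower().split())
    return _ALIASES.get(key)


def to_abbreviation(value: str) -> str:
    code = _lookup(value)
    return code if code is not None else value.strip()


def to_full_name(value: str) -> str:
    code = _lookup(value)
    return _FULL_NAMES[code] if code is not None else value.strip()


print(to_abbreviation("nc"), to_abbreviation("fla"))
print(to_full_name("n carolina"), to_full_name("fla"))
```

Unrecognized input is returned trimmed but otherwise unchanged, which is a design choice of the sketch rather than documented QKB behavior.
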
In addition to the definitions listed on this page, the English, United States locale also inherits all definitions for the English language and all Global definitions.
Doc ID: QKBCI_ENUSA_defs.html