SAS Quality Knowledge Base for Contact Information 25

English, United Kingdom Definitions

Definitions for the English, United Kingdom locale are described below.

Case Definitions

Gender Analysis Definitions

Identification Analysis Definitions

Match Definitions

Parse Definitions

Pattern Analysis Definitions

Standardization Definitions

Inherited Definitions

Case Definitions

None.

Gender Analysis Definitions

None.

Identification Analysis Definitions

Contact Info
Description The Contact Info identification analysis definition identifies the type of contact information that is represented by a string.
Possible Outputs ADDR
CSZ
UNK
ATTN
ADDR2
BLANK
ORG
IND
ACCT
SW
PTW
STW
MW
POB
Examples Input Output
420 Park Ridge Rd ADDR
Guttermoth UNK
c/o Blanch ATTN
#900-A5 ADDR2
  BLANK
DataFlux Corp ORG
James Douglas Morrison IND
PO Box 15 POB
Remarks  

 

Individual/Organization
Description The Individual/Organization identification analysis definition determines whether a string represents the name of an individual or an organization.
Possible Outputs INDIVIDUAL
ORGANIZATION
UNKNOWN
Examples Input Output
DataFlux Corporation ORGANIZATION
Tony Fisher INDIVIDUAL
Guttermoth UNKNOWN
Remarks  

 

Phone
Description The Phone identification analysis definition determines the type of phone number.
Possible Outputs LANDLINE (GEOGRAPHIC)
LANDLINE (NON-GEOGRAPHIC)
MOBILE
LEGACY
FOREIGN
NO AREA CODE
INVALID
Examples Input Output
(020) 1234 5678 LANDLINE (GEOGRAPHIC)
0845 123 4567 LANDLINE (NON-GEOGRAPHIC)
1234 5678 NO AREA CODE
07770 123456 MOBILE
(0171) 123 4567 LEGACY
0370 123456 LEGACY
+1 (919) 457-7000 FOREIGN
(020) 1234 567 INVALID
07770 12345 INVALID
Remarks The final two examples are missing a digit in the base number.

 

Postal Code (Validation)
Description The Identification Analysis definition for Postal Code (Validation) determines whether the input string is a formally correct domestic postal code.
Possible Outputs OK (formally correct domestic postal code)
INVALID (not a formally correct domestic postal code)
Examples Input Output
DE11NN OK
DE1 1NN OK
B23 6EN OK
023 6EN INVALID
8236 EN INVALID
12345 INVALID
SE-123 45 INVALID
No Postal Code INVALID
Remarks  
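
As an illustration only, a formal check of this kind can be approximated with a regular expression. The pattern below is a simplified assumption about the general shape of a UK postcode (one or two letters, a one- or two-character district, an optional space, then a digit and two letters). It is not the rule set used by the QKB definition, and the function name is hypothetical.

```python
import re

# Simplified assumption about the formal shape of a UK postcode.
# This is NOT the QKB rule set; it only reproduces the examples above.
_UK_POSTCODE = re.compile(r"^[A-Z]{1,2}[0-9][0-9A-Z]?\s?[0-9][A-Z]{2}$")


def check_postal_code(value):
    """Return 'OK' if value looks like a UK postcode, else 'INVALID'."""
    candidate = value.strip().upper()
    return "OK" if _UK_POSTCODE.match(candidate) else "INVALID"


if __name__ == "__main__":
    for sample in ["DE11NN", "DE1 1NN", "B23 6EN", "023 6EN", "12345"]:
        print(sample, check_postal_code(sample))
```

A pattern like this checks form only. It does not confirm that a postcode has actually been assigned.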

Match Definitions

Address
Description The Address match definition generates match codes which can be used to cluster records containing addresses.
Max Length of Match Code 15 characters
Examples Input Cluster ID
The Chapter House, 32 Upton Rd, Covent Garden 1
Chapter House, 32 Upton Road 1
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

Address (Full)
Description

The Address (Full) match definition generates match codes which can be used to cluster records containing complete two-line addresses.

Max Length of Match Code 64 characters
Examples Input Cluster ID
12 Arthur Street, London, EC4R 9BJ 0
Apt B, 12 Arthur Street, London, EC4R 9BJ 0
12 Arthur Road London EC4R9BJ 0
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

Address (Full) (with Combinations)
Description The Address (Full) (with Combinations) match definition generates match codes which can be used to cluster records containing complete two-line addresses.
Max Length of Match Code 138 characters
Examples Input Cluster ID Sensitivity Score
Flat 1, 9, Rockleaze, Bristol, BS9 1NE 0 85 72.25
Moorlands 9 Rockleeze Smeed Park Bristol BS9 1NE 0 85 72.25
Philips Centre 420-430 London Road Surrey CR9 3QR 1 85 72.25
The Phillips Centre 420 London Road Croydon Surrey CR9 3QR 1 85 72.25
25 Kings Hill Avenue, Kings Hill, West Malling, Kent, ME19 4TA 2 85 72.25
c/o CAF, 25 Tingshill Avenue, Kingshill, West Malling, Kent, ME19 4TA 2 85 72.25
Highway House 171 Kings Rd Brentwood Essex CM14 4EJ 3 85 80.75
Pegasus House 171 Kings St Brentwood Essex CM14 4EJ 3 85 80.75
69 Morrison Street, Lothian EH3 8YF 4 85 80.75
69 Morrison, Lothian EH3 8YF 4 85 80.75
11th Floor SJMB 100 Old Hall Street Liverpool L70 1AB 5 85 85.00
14th Floor, Sir John Moores Building, 100 Old Hall Street, Liverpool, L70 1AB 5 85 85.00
97 Haymarket Terrace, Edinburgh, Mid Lothian EH12 5HD 6 85 85.00
Donaldson House, 97 Haymarket Terrace, Edinburgh, Lothian EH12 5HD 6 85 85.00
1 Hagley Road Birmingham West Midlands B16 8SS 7 85 85.00
Metropolitan House 1 Hagley Road Edgbaston Birmingham B16 8SS 7 85 85.00
CISD, Saughton House, Broomhouse Drive, Edinburgh, EH11 3XD 8 85 68.00
Saughton House, EH11 3XD 8 85 68.00
Civic Centre High Street Uxbrudge London UB8 1UW 9 85 68.00
Head Office, Civic Centre High Street, London, UB8 1UW 9 85 68.00
171 Kings Rd Brentwood Essex CM14 4EJ 10 80 48.00
171 Kings Rd Brentwood Essex 10 80 48.00
69 Morrison Street, Lothian EH3 8YF 11 80 48.00
69 Morrison, Lothian 11 80 48.00
9 Buckstone Oval, Alwoodley Leeds, W Yorkshire 12 75 30.00
9 Buckstone Oval, Alwoodley Leeds LS175HF 12 75 30.00
PO Box 3 CLS6 Gloucester Road Filton Bristol BS12 7QE 13 85 76.50
PO Box 3 WH-5 Bristol BS12 7QE 13 85 76.50
Dept DDSUPP Marlborough House PO BOX 1810 Bristol BS99 5SN 14 85 76.50
Sun Life Centre P O Box 1810 Bristol BS99 5SN 14 85 76.50
Austalasa House (HBBG), Waterside, PO Box 365, West Drayton, Middlesex UB7 0GB 15 85 76.50
Waterside P O Box 365 Harmonsworth West Drayton Middlesex UB7 0GB 15 85 76.50
PO Box 3 CLS6 Gloucester Road Filton Bristol BS127QE 16 80 63.75
PO Box 3 WH-5 Bristol 16 80 63.75
PO Box 365, West Drayton, Middlesex UB7 0GB 17 80 63.75
Box 365 Harmonsworth West Drayton Middlesex 17 80 63.75
6 Paddent Court, London NW71GY 18 85 80.75
6 Paddent Ct 12 Stockford Avenue London NW71GY 18 85 55.25
6 Paddent Court, Stockford, London NW71GY 18 85 55.25
4 Howes Place Cambridge Cambridgeshire CB30LD 19 85 80.75
4 Howes Place Huntingdon Rd Cambridge Cambridgeshire CB30LD 19 85 55.25
4 Howes Place, 33 Huntingdon Rd Cambridge Cambridgeshire CB30LD 19 85 55.25
Flat G 5 Bank Street Aberdeen AB117ST 20 65 45.50
5A, Bank St, Aberdeen Aberdeenshire AB117ST 20 65 45.50
Bank Street Aberdeen AB117ST 20 65 45.50
32 West Avenue Handsworth Birmingham West Midlands B202LS 21 65 45.50
9 West Boulevard Birmingham B202LS 21 65 45.50
West Rd B202LS 21 65 45.50
Dunsford Exeter Devon EX67AX 22 85 17.00
Dunsford, Exeter, Devon, EX6 7AX 22 85 17.00
E London EC3R7NE 23 85 17.00
East London, EC3R 7NE 23 85 17.00
Remarks

This Address (Full) (with Combinations) match definition generates one or more match codes for each input string. The number of match codes generated for an input string depends on the content of the string. Each match code represents a combination of different portions of the input string; this enables two strings to be matched even when some portions of one or both of the strings differ. See the examples above for an illustration of clusters that may be produced using match codes generated by this definition.

Note that a consequence of generating multiple match codes is that a record can be placed in more than one cluster by a subsequent clustering operation. Therefore, special attention should be given to the entity resolution process when using this definition.

A score is assigned to each match code produced by the definition, and might be used as a factor when resolving conflicts between clusters. The score is dependent on the set of rules that produced the match code and the sensitivity used when the job was executed.
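
The following is a minimal sketch of one way the score might be used during entity resolution. It is not the product's clustering logic. The match codes, record IDs, and function name are hypothetical. The sketch assumes that each record carries a list of (match code, score) pairs, that records sharing a match code form a candidate cluster, and that a record appearing in several candidate clusters is kept in the one where its score is highest.

```python
from collections import defaultdict


def resolve_clusters(codes_by_record):
    """Assign each record to the shared match code with its highest score.

    codes_by_record maps a record ID to a list of (match_code, score)
    pairs. Match codes here are placeholders, not real QKB output.
    """
    members = defaultdict(list)
    for record_id, codes in codes_by_record.items():
        for code, score in codes:
            members[code].append((record_id, score))

    # Only codes shared by at least two records form a candidate cluster.
    candidates = {c: m for c, m in members.items() if len(m) > 1}

    best = {}
    for code, shared in candidates.items():
        for record_id, score in shared:
            # Keep the first-seen cluster on ties; replace only on a higher score.
            if record_id not in best or score > best[record_id][1]:
                best[record_id] = (code, score)
    return {record_id: code for record_id, (code, _) in best.items()}


if __name__ == "__main__":
    sample = {
        "A": [("X1", 80.75), ("Y1", 55.25)],
        "B": [("X1", 80.75)],
        "C": [("Y1", 55.25)],
    }
    print(resolve_clusters(sample))
```

In this sample, record A appears in two candidate clusters and is kept with B. Record C is left alone in its cluster, which is the kind of outcome that calls for the special attention mentioned above.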

 

City
Description The City match definition generates match codes which can be used to cluster records containing city names.
Max Length of Match Code 15 characters
Examples Input Cluster ID
Birmingham 10
bham 10
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

City - State/Province - Postal Code
Description The City - State/Province - Postal Code match definition generates match codes which can be used to cluster records containing last line address information.
Max Length of Match Code 16 characters
Examples Input Cluster ID
Sheffield, South Yorkshire, S8 8TJ 2
Shefield South Yorks. S8 8TJ 2
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

Name (with Suggestions)
Description The Name (with Suggestions) match definition generates match codes which can be used to cluster records containing names of individuals.
Max Length of Match Code 21 characters
Examples Input Cluster ID
PRAIS HILTON 1
PARIS HILTON 1
HENRY NICKELSON 2
HENRY NICKERSON 2
NIKI WONG 3
ANIKI WONG 3
NIKI WONG 4
NICLOE WONG 4
Remarks

This match definition generates one or more match codes for each input string. Each match code represents a suggestion for what might be the true value of the input string; this enables two strings to be matched even when one or both strings contain a spelling mistake. For example, the name PRAIS might match the name PARIS, or the name NICLOE might match the name NIKI.

Note that a consequence of generating multiple match codes is that a record can be placed in more than one cluster by a subsequent clustering operation. Therefore, special attention should be given to the entity resolution process when using this definition.

Another consequence of generating multiple match codes is that more processing time is required than when generating a single match code. Generating match codes using this definition might take up to five times as long as generating match codes using a traditional match definition.

For more information on suggestion-based matching, refer to the Suggestion-Based Matching section of the DataFlux Data Management Studio Online Help.

Note: The results listed above reflect the default match sensitivity (85).

 

Phone
Description The Phone match definition generates match codes which can be used to cluster records containing phone numbers.
Max Length of Match Code 22 characters
Examples Input Cluster ID
01628 486933 0
(0)1628 486933 0
1628-486933 0
01628-486933 (Work) 0
+44 1628 486933 0
(020) 2345 6789 1
(020) 2345 6780 1
(020) 2345 6700 2
(0113) 234 5678 3
(0113) 234 5670 3
(0113) 234 5600 4
020 69999999 ext. 1234 5
020 69999999 ext. 8266 5
020 69999999 5
+1-226-823-5678 6
001 (226)-823-5678 6
+1 226-823-5678 (after 4pm) 6
Remarks

Note that the number of digits retained in the match codes is a function of the number of digits in the base number and the match sensitivity.

Note: The results listed above reflect the default match sensitivity (85).

 

Postal Code
Description The Postal Code match definition generates match codes which can be used to cluster records containing postal codes.
Max Length of Match Code 15 characters
Examples Input Cluster ID
s88tj 1
58 8tj 1
S8 8TJ 1
A8 8TJ 2
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

Text
Description The Text match definition generates match codes which can be used to cluster records containing general text strings.
Max Length of Match Code 15 characters
Examples Input Cluster ID
they went 1
you are 2
you're 2
Remarks

Note: The results listed above reflect the default match sensitivity (85).

Parse Definitions

Address
Description The Parse definition for Address parses address first line data.
Output Tokens Extension
Building/Sub-Street
Street
District/Village
Additional Info
Example 1 Input Output
1st Floor Rennie House Stamford Street Extension 1st Floor
Building/Sub-Street Rennie House
Street Stamford Street
District/Village  
Additional Info  
Example 2 Input Output
5 Chilton Cottages, Queen's Road Princes Risborough Buckinghamshire Extension  
Building/Sub-Street 5 Chilton Cottages
Street Queen's Road
District/Village Princes Risborough
Additional Info  
Remarks  

 

Address (Full)
Description The Parse definition for Address (Full) parses full multi-line addresses.
Output Tokens Extension
Building/Sub-Street
Street
District/Village
Town/City
County
Country
Postcode
Additional Info
Example 1 Input Output
1st Floor Rennie House Stamford Street London SE1 9LL Extension 1st Floor
Building/Sub-Street Rennie House
Street Stamford Street
District/Village  
Town/City London
County  
Country  
Postcode SE1 9LL
Additional Info  
Example 2 Input Output
5 Chilton Cottages, Queen's Road Princes Risborough Buckinghamshire Extension  
Building/Sub-Street 5 Chilton Cottages
Street Queen's Road
District/Village  
Town/City Princes Risborough
County Buckinghamshire
Country  
Postcode  
Additional Info  
Remarks  

 

Address (Global)
Description

The Address (Global) parse definition parses addresses into a globally recognized set of tokens.

Output Tokens Recipient
Building/Site
Street
Extension
PO Box
Additional Info
  Input Output
Example 1 The Chapter House, 32 Upton Rd Recipient  
Building/Site The Chapter House
Street 32 Upton Rd
Extension  
PO Box  
Additional Info  
  Input Output
Example 2 38 Newport Road Recipient  
Building/Site  
Street 38 Newport Road
Extension  
PO Box  
Additional Info  
Remarks Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

The Address (Global) (v23) parse definition is now deprecated and will be removed in a future release of the QKB.

The Address (Global) parse definition has been replaced with a copy of the Address (Global) (v23) definition, which takes advantage of the new tokens and updated processing. If you changed your jobs to use Address (Global) (v23), it is suggested that you change them back to Address (Global).

 

Address (Global) (v23)
Description

The Address (Global) (v23) parse definition parses addresses into a globally recognized set of tokens.

Output Tokens Recipient
Building/Site
Street
Extension
PO Box
Additional Info
  Input Output
Example 1 The Chapter House, 32 Upton Rd Recipient  
Building/Site The Chapter House
Street 32 Upton Rd
Extension  
PO Box  
Additional Info  
  Input Output
Example 2 38 Newport Road Recipient  
Building/Site  
Street 38 Newport Road
Extension  
PO Box  
Additional Info  
Remarks Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

The Address (Global) (v23) parse definition is now deprecated and will be removed in a future release of the QKB.

The Address (Global) parse definition has been replaced with a copy of the Address (Global) (v23) definition, which takes advantage of the new tokens and updated processing. If you changed your jobs to use Address (Global) (v23), it is suggested that you change them back to Address (Global).

 

City - State/Province - Postal Code
Description The Parse definition for City - State/Province - Postal Code parses address last line data.
Output Tokens District/Village
Town/City
County
Country
Postcode
Additional Info
Example 1 Input Output
London SE1 9LL District/Village  
Town/City London
County  
Country  
Postcode SE1 9LL
Additional Info  
Example 2 Input Output
Princes Risborough Buckinghamshire District/Village  
Town/City Princes Risborough
County Buckinghamshire
Country  
Postcode  
Additional Info  
Remarks  

 

City - State/Province - Postal Code (Global)
Description The Parse definition for City - State/Province - Postal Code (Global) parses address "last line" data into a globally recognized set of tokens.
Output Tokens City
State/Province
Postal Code
Additional Info
Example Input Output
Sheffield, South Yorkshire, S8 8TJ City Sheffield
State/Province South Yorkshire
Postal Code S8 8TJ
Additional Info  
Remarks Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

 

Phone
Description The Parse definition for Phone parses phone numbers into a set of tokens.
Output Tokens Country Code
Area Code
Base Number
Extension
Line Type
Additional Info
Example 1 Input Output
Work: 00442012345000 Ext 678 (ask for Mary) Country Code 0044
Area Code 20
Base Number 12345000
Extension 678
Line Type Work:
Additional Info (ask for Mary)
Example 2 Input Output
(0)20 87817200 Country Code  
Area Code (0)20
Base Number 87817200
Extension  
Line Type  
Additional Info  
Example 3 Input Output
+1 (919) 447-3000 Country Code +1
Area Code  
Base Number (919) 447-3000
Extension  
Line Type  
Additional Info  
Example 4 Input Output
07412 123456 (mobile) Country Code  
Area Code 07412
Base Number 123456
Extension  
Line Type (mobile)
Additional Info  
Example 5 Input Output
01628 486933 (home) Country Code  
Area Code 01628
Base Number 486933
Extension  
Line Type (home)
Additional Info  
Remarks  

 

Phone (Global)
Description The Parse definition for Phone (Global) parses phone numbers into a globally recognized set of tokens.
Output Tokens Country Code
Area Code
Base Number
Extension
Line Type
Additional Info
Example 1 Input Output
Work: 00442012345000 Ext 678 (ask for Mary) Country Code 0044
Area Code 20
Base Number 12345000
Extension 678
Line Type Work:
Additional Info (ask for Mary)
Example 2 Input Output
(0)20 87817200 Country Code  
Area Code (0)20
Base Number 87817200
Extension  
Line Type  
Additional Info  
Example 3 Input Output
+1 (919) 447-3000 Country Code +1
Area Code  
Base Number (919) 447-3000
Extension  
Line Type  
Additional Info  
Example 4 Input Output
07412 123456 (mobile) Country Code  
Area Code 07412
Base Number 123456
Extension  
Line Type (mobile)
Additional Info  
Example 5 Input Output
01628 486933 (home) Country Code  
Area Code 01628
Base Number 486933
Extension  
Line Type (home)
Additional Info  
Remarks Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

Pattern Analysis Definitions

None.

Standardization Definitions

Address
Description The Standardization definition for Address standardizes address "first line" data.
Example Input Output
47 Marlow Road 47 Marlow Rd
Remarks  

 

City
Description The Standardization definition for City standardizes city names.
Example Input Output
nhampton Northhampton
Remarks  

 

City - State/Province - Postal Code
Description The Standardization definition for City - State/Province - Postal Code standardizes address "last line" data.
Example Input Output
Sheffield, South Yorks., S8 8TJ SHEFFIELD South Yorkshire S8 8TJ
Remarks  

 

Phone
Description The Standardization definition for Phone standardizes phone numbers for domestic use.
Examples Input Output
+44 (0) 1753 272 020 Ext 12 (01753) 272 020 x12
01628-486933 (Work) (01628) 486 933, Work
+33 (0)5-201-23-456 +33 520123456
0033 520123456 +33 520123456
020 26828673 (020) 2682 8673
(07412) 123456 07412 123 456
Remarks Optional geographic area codes are surrounded by parentheses, and non-optional geographic area codes are not surrounded by parentheses, as shown in the final two examples.

 

Phone (with Country Code)
Description The Standardization definition for Phone (with Country Code) standardizes phone numbers for international use.
Examples Input Output
01628-486933 (Work) +44 1628 486 933, Work
+44 (0) 1753 272 020 Ext 12 +44 1753 272 020 x12
00330520123456 +33 520123456
+33 (0)5-201-23-456 +33 520123456
(020) 1234 5678 +44 20 1234 5678
Remarks  

 

Phone (Electronic)
Description The Standardization definition for Phone (Electronic) standardizes phone numbers for automated calling systems.
Examples Input Output
Work: 0044 (0)20-1234-5000 Ext 678 (ask for Mary) +442012345000
Fax: 0141 332 7226 +441413327226
0141 332 7226 (after 8pm) +441413327226
+44 191-*4296738 +441914296738
+33 (0)5-201-23-456 +33520123456
(020) 1234 5678 +442012345678
+1-800-DATAFLUX (USA number) +180032823589
Remarks  
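
The sketch below illustrates the kind of transformation shown in the examples above for input that is already in international format. It is an assumption-laden approximation, not the QKB definition: the function name is hypothetical, it accepts only strings that begin with "+", it discards any parenthesized text, and it maps letters to digits using the standard telephone keypad. Labels such as "Work:", extensions, and the 00 international prefix are not handled.

```python
import re

# Standard telephone keypad letter groups.
_KEYPAD_GROUPS = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
_LETTER_TO_DIGIT = {
    letter: digit
    for digit, letters in _KEYPAD_GROUPS.items()
    for letter in letters
}


def to_electronic(number):
    """Reduce a '+'-prefixed phone string to '+' followed by digits only."""
    text = number.strip()
    if not text.startswith("+"):
        raise ValueError("this sketch only handles numbers starting with '+'")
    # Drop parenthesized text such as "(0)" or "(USA number)".
    text = re.sub(r"\([^)]*\)", "", text)
    digits = []
    for ch in text.upper():
        if ch in "0123456789":
            digits.append(ch)
        elif ch in _LETTER_TO_DIGIT:
            digits.append(_LETTER_TO_DIGIT[ch])
        # Any other character (spaces, hyphens, '*', '+') is skipped.
    return "+" + "".join(digits)


if __name__ == "__main__":
    print(to_electronic("+1-800-DATAFLUX (USA number)"))
```

Because every remaining letter is mapped to a digit, free text outside parentheses (for example "Ext") would be converted as well, which is one reason this is only a sketch.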

 

Postal Code
Description The Standardization definition for Postal Code standardizes postal codes.
Example Input Output
s88tj S8 8TJ
Remarks The Standardization definition for Postal Code strips domestic Postal Country Codes.

 

Postal Code (with Country Code)
Description The Standardization definition for Postal Code (with Country Code) standardizes postal codes for international use.
Example Input Output
s88tj GB-S8 8TJ
Remarks The Standardization definition for Postal Code (with Country Code) adds domestic Postal Country Codes.
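
The behavior of the two Postal Code standardization definitions on the example input can be approximated as follows. This is a sketch under stated assumptions, not the QKB implementation: the function name and the with_country_code parameter are hypothetical, the "GB-" prefix is taken from the example output above, and the input is assumed to be a formally correct postcode whose final three characters form the inward part.

```python
def standardize_postal_code(raw, with_country_code=False):
    """Uppercase a UK postcode and put one space before its last three characters."""
    # Remove all whitespace and normalize case.
    compact = "".join(raw.upper().split())
    # Strip a leading domestic country code if one is present (assumed "GB-").
    if compact.startswith("GB-"):
        compact = compact[3:]
    formatted = compact[:-3] + " " + compact[-3:]
    return "GB-" + formatted if with_country_code else formatted


if __name__ == "__main__":
    print(standardize_postal_code("s88tj"))
    print(standardize_postal_code("s88tj", with_country_code=True))
```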

Inherited Definitions

In addition to the definitions listed on this page, the English, United Kingdom locale also inherits all definitions for the English language and all Global definitions.