SAS Quality Knowledge Base for Contact Information 25

English, South Africa Definitions

Definitions for the English, South Africa locale are described below.

Case Definitions

Gender Analysis Definitions

Identification Analysis Definitions

Match Definitions

Parse Definitions

Pattern Analysis Definitions

Standardization Definitions

Inherited Definitions

Case Definitions

Proper (PostBox)
Description The Case definition for Proper (PostBox) propercases PostBox information in an address.
Example
Input: PO BOX 18940
Output: PO Box 18940
Remarks  
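The proper-casing behavior shown in the example can be approximated with a minimal Python sketch. The function name `propercase_postbox` and the `KEEP_UPPER` set are hypothetical and are not part of the QKB; the sketch only assumes that the abbreviation "PO" stays uppercase while other words are capitalized.

```python
# Assumption: abbreviations listed here stay uppercase; the set is illustrative only.
KEEP_UPPER = {"PO"}

def propercase_postbox(text: str) -> str:
    """Capitalize each word, keeping known abbreviations in uppercase."""
    words = []
    for word in text.split():
        if word.upper() in KEEP_UPPER:
            words.append(word.upper())
        else:
            words.append(word.capitalize())
    return " ".join(words)

print(propercase_postbox("PO BOX 18940"))  # PO Box 18940
```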

Gender Analysis Definitions

ID Number
Description The Gender Analysis definition for ID Number determines an individual's gender based on a personal ID Number.
Possible Outputs: M, F, U
Examples
Input: 7604210011088
Output: F
Input: 7604206143085
Output: M
Remarks  
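As a rough illustration of how gender can be read from an ID number, the following Python sketch inspects the seventh digit. It assumes the commonly cited convention for South African ID numbers (a gender sequence below 5000 indicates female, 5000 and above indicates male), which is consistent with the two examples above but is not confirmed here as the QKB's internal logic. The function name `gender_from_id` is hypothetical.

```python
def gender_from_id(id_number: str) -> str:
    """Return 'F', 'M', or 'U' (unknown) for a 13-digit ID number."""
    digits = id_number.strip()
    # Anything that is not exactly 13 ASCII digits is treated as unknown.
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return "U"
    # Assumption: seventh digit 0-4 means female, 5-9 means male.
    return "F" if digits[6] in "01234" else "M"

print(gender_from_id("7604210011088"))  # F
print(gender_from_id("7604206143085"))  # M
```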

Identification Analysis Definitions

Organization Registration
Description The Identification Analysis definition for Organization Registration determines whether a string represents an organization registration number.
Possible Outputs: ORG_REG, ID, UNKNOWN
Examples
Input: CC/94/00199
Output: ORG_REG
Input: 7604210011088
Output: ID
Input: ABC
Output: UNKNOWN
Remarks The result ID means that the string represents a personal ID number.
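The three outcomes can be sketched with simple pattern checks in Python. The patterns below are assumptions inferred only from the examples on this page (the `CC/94/00199` and `1997/123456/07` shapes, and a 13-digit personal ID); the real definition very likely recognizes more variants. The name `identify` is hypothetical.

```python
import re

# Assumed shapes, taken from the examples on this page only.
ORG_REG_PATTERNS = [
    re.compile(r"\d{4}/\d{6}/\d{2}"),  # e.g. 1997/123456/07
    re.compile(r"CC/\d{2}/\d{5}"),     # e.g. CC/94/00199
]
PERSONAL_ID = re.compile(r"\d{13}")

def identify(value: str) -> str:
    """Classify a string as ORG_REG, ID, or UNKNOWN."""
    compact = re.sub(r"\s+", "", value).upper()
    if any(p.fullmatch(compact) for p in ORG_REG_PATTERNS):
        return "ORG_REG"
    if PERSONAL_ID.fullmatch(compact):
        return "ID"
    return "UNKNOWN"
```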

Match Definitions

Address
Description The Address match definition generates match codes which can be used to cluster records containing addresses.
Max Length of Match Code 77 characters
Examples Input Cluster ID
173 BLOUWILDEBEES STRAAT 0
173 Blouwildebees St 0
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

Address (Full)
Description The Address (Full) match definition generates match codes which can be used to cluster records containing complete two-line addresses.
Max Length of Match Code 110 characters
Examples Input Cluster ID
SANLAM CENTRE, C/O JEPPE & VON WIELLIGH STS, JOHANNESBURG, 2001 1
4TH FLOOR SANLAM CENTRE, CNR JEPPE & VON WIELLIGH STR, JOBURG, 2001 1
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

City
Description The City match definition generates match codes which can be used to cluster records containing city names.
Max Length of Match Code 15 characters
Examples Input Cluster ID
BANTRYBAAI 0
BANTRY BAY 0
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

City - State/Province - Postal Code
Description The City - State/Province - Postal Code match definition generates match codes which can be used to cluster records containing last line address information.
Max Length of Match Code 41 characters
Examples Input Cluster ID
Tygervalley 7536 1
Tyger Valley, 7536 1
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

ID Number
Description The ID Number match definition generates match codes which can be used to cluster records containing ID card numbers.
Max Length of Match Code 15 characters
Examples Input Cluster ID
5705025087080 0
5705025087056 0
Remarks

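This definition builds in fuzziness by ignoring digits near the end of the number, so ID numbers that differ only in their final digits can receive the same match code.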

Note: The results listed above reflect the default match sensitivity (85).
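The effect of ignoring trailing digits can be illustrated with a small Python sketch that groups ID numbers by a truncated key. The number of ignored digits (`IGNORED_TRAILING_DIGITS = 2`) is an assumption chosen so that the two example inputs above fall into one cluster; the QKB does not document the exact count here, and its real match codes are not plain truncated strings. The names `id_match_key` and `cluster_ids` are hypothetical.

```python
from collections import defaultdict

IGNORED_TRAILING_DIGITS = 2  # assumption, not a documented QKB value

def id_match_key(id_number: str) -> str:
    """Build a comparison key that drops the last few digits."""
    digits = "".join(ch for ch in id_number if ch.isdigit())
    return digits[:-IGNORED_TRAILING_DIGITS]

def cluster_ids(values):
    """Group ID numbers whose truncated keys are equal."""
    clusters = defaultdict(list)
    for value in values:
        clusters[id_match_key(value)].append(value)
    return dict(clusters)

groups = cluster_ids(["5705025087080", "5705025087056"])
print(len(groups))  # 1
```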

 

Organization
Description The Organization match definition generates match codes which can be used to cluster records containing organization names.
Max Length of Match Code 40 characters
Examples Input Cluster ID
DataFlux Corporation 0
DataFlex LLC 0
SAS Institute 0
SAS Institute, Canada 0
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

Organization Registration
Description The Organization Registration match definition generates match codes which can be used to cluster records containing organization registration numbers.
Max Length of Match Code 27 characters
Examples Input Cluster ID
1989 / 008114/23 1
1989/008114/23 1
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

Organization (with Site)
Description The Organization (with Site) match definition generates match codes that include site information and can be used to cluster records containing organization names.
Max Length of Match Code 30 characters
Examples Input Cluster ID
MOMENTUM GROUP LTD M.C.B 1
The MOMENTUM GROUP 1
Momentum Group Limited 1
NABUILD (PTY) LTD - KENILWORTH 2
Nabuild (Pty) Ltd - Empangeni 3
Remarks

Organization names at different sites will not match if this definition is used.

Note: The results listed above reflect the default match sensitivity (85).

 

Phone
Description The Phone match definition generates match codes which can be used to cluster records containing phone numbers.
Max Length of Match Code 15 characters
Examples Input Cluster ID
011 4890292 0
+27 (011) 4890292 0
Remarks

Note: The results listed above reflect the default match sensitivity (85).
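One way to picture why the two example inputs share a cluster is to reduce each number to a comparable key. The Python sketch below strips punctuation, drops a leading `+27` country code, and removes the leading trunk zero. These normalization steps are assumptions made for illustration and are not the QKB's actual match-code algorithm; `phone_match_key` is a hypothetical name.

```python
def phone_match_key(phone: str) -> str:
    """Reduce a phone number to its significant national digits."""
    digits = "".join(ch for ch in phone if ch.isdigit())
    # Assumption: a leading "+" followed by 27 marks the country code.
    if phone.strip().startswith("+") and digits.startswith("27"):
        digits = digits[2:]
    # Assumption: leading zeros (the trunk prefix) carry no matching value.
    return digits.lstrip("0")

print(phone_match_key("011 4890292"))        # 114890292
print(phone_match_key("+27 (011) 4890292"))  # 114890292
```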

 

Postal Code
Description The Postal Code match definition generates match codes which can be used to cluster records containing postal codes.
Max Length of Match Code 15 characters
Examples Input Cluster ID
3370 1
'3370' 1
Remarks

Note: The results listed above reflect the default match sensitivity (85).

Parse Definitions

Address
Description The Parse definition for Address parses addresses into a set of tokens.
Output Tokens: Prefix, Building Group, Building Name, Street/Erf/Postbox, Additional Info

Example 1
Input: 173 Blouwildebees Straat Unit 2
Output:
Prefix: (empty)
Building Group: (empty)
Building Name: (empty)
Street/Erf/Postbox: 173 Blouwildebees Straat
Additional Info: Unit 2

Example 2
Input: JEPPE MENS HOSTEL L ROOM 136 BLOCK 2 100 IMPALA STREET
Output:
Prefix: L ROOM 136
Building Group: JEPPE MENS HOSTEL
Building Name: BLOCK 2
Street/Erf/Postbox: 100 IMPALA STREET
Additional Info: (empty)

Remarks  

 

Address (Full)
Description The Parse definition for Address (Full) parses a full two-line address into a set of tokens.
Output Tokens: Prefix, Building Group, Building Name, Street/Erf/Postbox, Suburb/Township, Town, Area/Metropol, Province, Postcode, Additional Info

Example 1
Input: 4TH FLOOR SANLAM CENTRE, CNR JEPPE & VON WIELLIGH STR, JOHANNESBURG, 2001
Output:
Prefix: 4TH FLOOR
Building Group: (empty)
Building Name: SANLAM CENTRE
Street/Erf/Postbox: CNR JEPPE & VON WIELLIGH STR
Suburb/Township: (empty)
Town: (empty)
Area/Metropol: JOHANNESBURG
Province: (empty)
Postcode: 2001
Additional Info: (empty)

Example 2
Input: JEPPE MENS HOSTEL, L ROOM 136, BLOCK 2 EAST, 100 IMPALA STREET, JEPPESTOWN, JOHANNESBURG, 2001
Output:
Prefix: L ROOM 136
Building Group: JEPPE MENS HOSTEL
Building Name: BLOCK 2 EAST
Street/Erf/Postbox: 100 IMPALA STREET
Suburb/Township: JEPPESTOWN
Town: (empty)
Area/Metropol: JOHANNESBURG
Province: (empty)
Postcode: 2001
Additional Info: (empty)

Remarks  

 

Address (Global)
Description

The Address (Global) parse definition parses addresses into a globally recognized set of tokens.

Output Tokens: Recipient, Building/Site, Street, Extension, PO Box, Additional Info

Example 1
Input: JEPPE MENS HOSTEL L ROOM 136 BLOCK 2 100 IMPALA STREET
Output:
Recipient: (empty)
Building/Site: BLOCK 2 JEPPE MENS HOSTEL
Street: 100 IMPALA STREET
Extension: L ROOM 136
PO Box: (empty)
Additional Info: (empty)

Example 2
Input: 173 Blouwildebees Straat, P.bus 123
Output:
Recipient: (empty)
Building/Site: (empty)
Street: 173 Blouwildebees Straat
Extension: (empty)
PO Box: P.bus 123
Additional Info: (empty)

Remarks

Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

The Address (Global) (v23) parse definition is now deprecated and will be removed in a future release of the QKB.

The Address (Global) parse definition has been replaced with a copy of the Address (Global) (v23) definition, which takes advantage of the new tokens and updated processing. If you changed your jobs to use Address (Global) (v23), change them back to use Address (Global).

 

Address (Global) (v23)
Description The Address (Global) (v23) parse definition parses addresses into a globally recognized set of tokens.
Output Tokens: Recipient, Building/Site, Street, Extension, PO Box, Additional Info

Example 1
Input: JEPPE MENS HOSTEL L ROOM 136 BLOCK 2 100 IMPALA STREET
Output:
Recipient: (empty)
Building/Site: BLOCK 2 JEPPE MENS HOSTEL
Street: 100 IMPALA STREET
Extension: L ROOM 136
PO Box: (empty)
Additional Info: (empty)

Example 2
Input: 173 Blouwildebees Straat, P.bus 123
Output:
Recipient: (empty)
Building/Site: (empty)
Street: 173 Blouwildebees Straat
Extension: (empty)
PO Box: P.bus 123
Additional Info: (empty)

Remarks

Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

The Address (Global) (v23) parse definition is now deprecated and will be removed in a future release of the QKB.

The Address (Global) parse definition has been replaced with a copy of the Address (Global) (v23) definition, which takes advantage of the new tokens and updated processing. If you changed your jobs to use Address (Global) (v23), change them back to use Address (Global).

 

City - State/Province - Postal Code
Description The Parse definition for City - State/Province - Postal Code parses address "last line" data into a set of tokens.
Output Tokens: Suburb/Township, Town, Area/Metropol, Province, Postcode, Additional Info

Example
Input: LOWER HOUGHTON, JOHANNESBURG 2198
Output:
Suburb/Township: LOWER HOUGHTON
Town: (empty)
Area/Metropol: JOHANNESBURG
Province: (empty)
Postcode: 2198
Additional Info: (empty)

Remarks  

 

City - State/Province - Postal Code (Global)
Description The Parse definition for City - State/Province - Postal Code (Global) parses address "last line" data into a globally recognized set of tokens.
Output Tokens: City, State/Province, Postal Code, Additional Info

Example
Input: LOWER HOUGHTON, JOHANNESBURG 2198
Output:
City: LOWER HOUGHTON JOHANNESBURG
State/Province: (empty)
Postal Code: 2198
Additional Info: (empty)

Remarks Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

 

ID Number
Description The Parse definition for ID Number parses personal identification numbers into a set of tokens.
Output Tokens: Year, Month, Day, Gender, DOB/Gender Sequence Number, Citizenship, Check Digit

Example
Input: 7604206147185
Output:
Year: 76
Month: 04
Day: 20
Gender: 6
DOB/Gender Sequence Number: 147
Citizenship: 18
Check Digit: 5

Remarks  
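The token boundaries in the example correspond to fixed positions in the 13-digit number. The following Python sketch reproduces that split by slicing. The offsets are read directly from the example above; the names `ID_LAYOUT` and `parse_id_number` are hypothetical, and the sketch does not validate dates or the check digit.

```python
# (token name, start offset, end offset), derived from the example above.
ID_LAYOUT = [
    ("Year", 0, 2),
    ("Month", 2, 4),
    ("Day", 4, 6),
    ("Gender", 6, 7),
    ("DOB/Gender Sequence Number", 7, 10),
    ("Citizenship", 10, 12),
    ("Check Digit", 12, 13),
]

def parse_id_number(id_number: str) -> dict:
    """Split a 13-digit ID number into named tokens by position."""
    digits = id_number.strip()
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected exactly 13 digits")
    return {name: digits[start:end] for name, start, end in ID_LAYOUT}

tokens = parse_id_number("7604206147185")
print(tokens["Year"], tokens["Month"], tokens["Day"])  # 76 04 20
```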

 

Organization Registration
Description The Parse definition for Organization Registration parses organization registration numbers into a set of tokens.
Output Tokens: Year, Sequence, Type, Alphabetic Info, Additional Info

Example 1
Input: 1997/123456/07
Output:
Year: 1997
Sequence: 123456
Type: 07
Alphabetic Info: (empty)
Additional Info: (empty)

Example 2
Input: 7604206147185
Output:
Year: (empty)
Sequence: (empty)
Type: (empty)
Alphabetic Info: (empty)
Additional Info: 7604206147185

Remarks The Additional Info token is used to store incorrect or extra information in the string.
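The fallback behavior described in the remark (unrecognized input lands in Additional Info) can be sketched in Python as follows. The regular expression covers only the `YYYY/NNNNNN/NN` shape from Example 1 and is an assumption; the Alphabetic Info token is left empty because this page does not show an input that populates it. The name `parse_org_registration` is hypothetical.

```python
import re

# Assumed shape from Example 1: four-digit year / six-digit sequence / two-digit type.
ORG_REG = re.compile(r"(\d{4})\s*/\s*(\d{6})\s*/\s*(\d{2})")

TOKEN_NAMES = ("Year", "Sequence", "Type", "Alphabetic Info", "Additional Info")

def parse_org_registration(value: str) -> dict:
    """Parse a registration number; unparsed input goes to Additional Info."""
    tokens = {name: "" for name in TOKEN_NAMES}
    text = value.strip()
    match = ORG_REG.fullmatch(text)
    if match:
        tokens["Year"], tokens["Sequence"], tokens["Type"] = match.groups()
    else:
        tokens["Additional Info"] = text
    return tokens
```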

 

Phone
Description The Parse definition for Phone parses South African phone numbers into a set of tokens.
Output Tokens: Prefix, Country Code, Area Code, Exchange, Station, Extension ID, Extension, Suffix

Example 1
Input: Mobile 27 (11) 123-4567 ext 23
Output:
Prefix: Mobile
Country Code: 27
Area Code: 11
Exchange: 123
Station: 4567
Extension ID: ext
Extension: 23
Suffix: (empty)

Example 2
Input: 011 555-9987 Home
Output:
Prefix: (empty)
Country Code: (empty)
Area Code: 011
Exchange: 555
Station: 9987
Extension ID: (empty)
Extension: (empty)
Suffix: Home

Remarks  

 

Phone (Global)
Description The Parse definition for Phone (Global) parses phone numbers into a globally recognized set of tokens.
Output Tokens: Country Code, Area Code, Base Number, Extension, Line Type, Additional Info

Example 1
Input: Mobile 27 (11) 123-4567 ext 23
Output:
Country Code: 27
Area Code: 11
Base Number: 123-4567
Extension: ext 23
Line Type: Mobile
Additional Info: (empty)

Example 2
Input: 011 555-9987 Home
Output:
Country Code: (empty)
Area Code: 011
Base Number: 555-9987
Extension: (empty)
Line Type: Home
Additional Info: (empty)

Remarks Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

 

Phone (Multiple Number)
Description The Parse definition for Phone (Multiple Number) parses strings that contain one, two, or three South African phone numbers into a set of tokens.
Output Tokens: Phone 1, Phone 2, Phone 3

Example 1
Input: (17) 610-8555 / (17) 611-9772
Output:
Phone 1: 17 610 8555
Phone 2: 17 611 9772
Phone 3: (empty)

Example 2
Input: +27 (11) 392-1646/7/8
Output:
Phone 1: 27 11 392 164 6
Phone 2: 27 11 392 164 7
Phone 3: 27 11 392 164 8

Example 3
Input: +27(22) 972-1600
Output:
Phone 1: 27 22 972 1600
Phone 2: (empty)
Phone 3: (empty)

Remarks Some loss of punctuation may occur when this definition is used.

Pattern Analysis Definitions

None.

Standardization Definitions

Address
Description The Standardization definition for Address standardizes address data.
Examples
Input: 14 NAPIER RD SETTLERS HEIGHTS
Output: Settlers Heights, 14 Napier Road
Input: Impalastraat 36
Output: 36 Impala Street
Remarks  

 

Address (Full)
Description The Standardization definition for Address (Full) standardizes full two-line address data.
Example
Input: IMPALASTRAAT 36 SEC J MAMELODI WEST,PRET 2941
Output: 36 Impala Street, Section J, Mamelodi, Pretoria, 2941
Remarks  

 

City
Description The Standardization definition for City standardizes city names.
Examples
Input: JHB
Output: Johannesburg
Input: Capetown
Output: Cape Town
Remarks  

 

City - State/Province - Postal Code
Description The Standardization definition for City - State/Province - Postal Code standardizes address "last line" data.
Examples
Input: ST HELENABAAI 7390
Output: St Helena Bay, 7390
Input: MOUNT FRERE,5090
Output: Mt Frere, 5090
Remarks  

 

Phone
Description The Standardization definition for Phone standardizes South African phone numbers.
Example
Input: 27119785895
Output: +27 (011) 978-5895
Remarks  
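The output layout in the example, `+27 (0AA) EEE-SSSS`, can be reproduced with a short Python sketch. It handles only the single input shape shown above (11 digits beginning with the country code 27, followed by a two-digit area code) and returns any other input unchanged. That restriction and the name `standardize_phone` are assumptions for illustration; the actual definition handles more formats.

```python
def standardize_phone(raw: str) -> str:
    """Format an 11-digit number starting with 27 as +27 (0AA) EEE-SSSS."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    # Assumption: only the 27 + 9-digit national number shape is handled.
    if len(digits) != 11 or not digits.startswith("27"):
        return raw
    national = digits[2:]
    area, exchange, station = national[:2], national[2:5], national[5:]
    return f"+27 (0{area}) {exchange}-{station}"

print(standardize_phone("27119785895"))  # +27 (011) 978-5895
```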

 

Postal Code
Description The Standardization definition for Postal Code standardizes postal codes.
Example
Input: '3370'
Output: 3370
Remarks  

Inherited Definitions

In addition to the definitions listed on this page, the English, South Africa locale also inherits all definitions for the English language and all Global definitions.