SAS Quality Knowledge Base for Contact Information 26
Definitions for the Malay, Malaysia locale are described below.
Case Definitions
Extraction Definitions
Gender Analysis Definitions
Identification Analysis Definitions
Match Definitions
Parse Definitions
Pattern Analysis Definitions
Standardization Definitions
Inherited Definitions
Proper (Address) | ||
---|---|---|
Description | The Proper (Address) case definition propercases addresses. | |
Input | Output | |
Example | NO 2, JALAN SS2/14 po box 125 | No 2, Jalan SS2/14 PO Box 125 |
Remarks |
Proper (Name) | ||
---|---|---|
Description | The Proper (Name) case definition propercases names of individuals. | |
Input | Output | |
Examples | M. KUPPUSAMY A/L MUTHUSAMY | M. Kuppusamy a/l Muthusamy |
nurul binti normawati | Nurul binti Normawati | |
YB DATIN URVID | YB Datin Urvid | |
Remarks |
Proper (Organization) | ||
---|---|---|
Description | The Proper (Organization) case definition propercases organization names. | |
Input | Output | |
Examples | A-tech institute | A-Tech Institute |
BSN MERCHANT BANK BHD | BSN Merchant Bank Bhd | |
Remarks | This definition uses a list of known organization names to handle exceptions to propercasing rules. |
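The exception handling described in the remark can be illustrated with a minimal sketch. This is not the QKB implementation: the `KNOWN_FORMS` table and the `proper_case_org` function are hypothetical, and the sketch keys exceptions by word for simplicity, whereas the definition is described as using a list of known organization names.

```python
# Hypothetical exception table; the actual QKB list is not shown on this page.
KNOWN_FORMS = {"bsn": "BSN"}


def proper_case_org(name: str) -> str:
    """Proper-case each word, preferring a known exception when one exists."""
    words = []
    for word in name.split():
        key = word.lower()
        if key in KNOWN_FORMS:
            words.append(KNOWN_FORMS[key])
        else:
            # Capitalize each hyphen-separated part: "a-tech" -> "A-Tech".
            words.append("-".join(part.capitalize() for part in key.split("-")))
    return " ".join(words)


print(proper_case_org("A-tech institute"))
print(proper_case_org("BSN MERCHANT BANK BHD"))
```

Without the exception entry, plain capitalization would turn "BSN" into "Bsn", which is why a lookup of known forms is consulted first.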
Extraction Definitions: None.
Name | ||
---|---|---|
Description | The Name gender analysis definition determines the gender of a name. | |
Possible Outputs | M, F, U | |
Input | Output | |
Examples | Prema Punita | F |
Kamal Lee bin Abdullah | M | |
S. Sothinathan | U | |
Remarks |
Name (Ethnicity) | ||
---|---|---|
Description | The Name (Ethnicity) identification analysis definition identifies the ethnic background of an individual based on the individual's name. | |
Possible Outputs | M, C, E, I, P, O | |
Input | Output | |
Examples | Azwar Baharudin | M |
Nang Ching Teck | C | |
John Smith | E | |
Anupam Arjun | I | |
Vinu Singh | P | |
Janusz Wojdecki | O | |
Remarks | M = Malay; C = Chinese; E = Eurasian; I = Indian; P = Punjabi; O = Other |
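When the single-letter output is consumed downstream, it may help to translate it into the label listed in the remarks. The following is a minimal sketch; the names `ETHNICITY_LABELS` and `describe_ethnicity` are hypothetical and not part of the QKB.

```python
# Code-to-label mapping copied from the Remarks row of the table above.
ETHNICITY_LABELS = {
    "M": "Malay",
    "C": "Chinese",
    "E": "Eurasian",
    "I": "Indian",
    "P": "Punjabi",
    "O": "Other",
}


def describe_ethnicity(code: str) -> str:
    """Return the label for an output code; raises KeyError for unknown codes."""
    return ETHNICITY_LABELS[code.strip().upper()]


print(describe_ethnicity("C"))  # Chinese
```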
Address | ||
---|---|---|
Description | The Address match definition generates match codes which can be used to cluster records containing addresses. | |
Max Length of Match Code | 20 characters | |
Input | Cluster ID | |
Examples | WISMA KLN, 10 JLN WONG AH FOOK | 0 |
10 Jalan Wong Ah Fook | 0 | |
2 JLN KASKAS | 1 | |
Remarks |
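The Cluster ID column in the match examples reflects the usual way match codes are consumed: records whose match codes are identical fall into the same cluster. A minimal sketch of that grouping step follows. The function `assign_cluster_ids` is hypothetical, and the code strings are placeholders rather than real QKB match codes.

```python
def assign_cluster_ids(match_codes):
    """Give records with identical match codes the same cluster ID.

    Clusters are numbered in order of first appearance.
    """
    cluster_of = {}
    ids = []
    for code in match_codes:
        if code not in cluster_of:
            cluster_of[code] = len(cluster_of)
        ids.append(cluster_of[code])
    return ids


# Placeholder strings standing in for generated match codes.
codes = ["CODE_A", "CODE_A", "CODE_B"]
print(assign_cluster_ids(codes))  # [0, 0, 1]
```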
City | ||
---|---|---|
Description | The City match definition generates match codes which can be used to cluster records containing city names. | |
Max Length of Match Code | 15 characters | |
Input | Cluster ID | |
Examples | KL | 0 |
K. Lumpur | 0 | |
Kota Kinabalu | 1 | |
Remarks |
City - State/Province - Postal Code | ||
---|---|---|
Description | The City - State/Province - Postal Code match definition generates match codes which can be used to cluster records containing last line address information. | |
Max Length of Match Code | 32 characters | |
Input | Cluster ID | |
Examples | 46200 Petaling Jaya Selangor Darul Ehsan | 0 |
46200,PJ,Sel. D.E. | 0 | |
50450, K.LUMPUR | 1 | |
Remarks |
Name | ||
---|---|---|
Description | The Name match definition generates match codes which can be used to cluster records containing names of individuals. | |
Max Length of Match Code | 20 characters | |
Input | Cluster ID | |
Examples | Caroline Yong Mei-Lin | 0 |
Miss Mei-Lin Yong | 0 | |
Yong Mei-Lin | 0 | |
Remarks |
Organization | ||
---|---|---|
Description | The Organization match definition generates match codes which can be used to cluster records containing organization names. | |
Max Length of Match Code | 15 characters | |
Input | Cluster ID | |
Examples | SRIMANISA SDN BHD | 0 |
Agensi Pekerjaan Srimanisa Sdn Bhd | 0 | |
MCSB Systems Bhd | 1 | |
Remarks |
Phone | ||
---|---|---|
Description | The Phone match definition generates match codes which can be used to cluster records containing phone numbers. | |
Max Length of Match Code | 15 characters | |
Input | Cluster ID | |
Examples | +603-64219754 | 0 |
60364219754 | 0 | |
03-79901655 | 1 | |
Remarks |
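To show why the first two Phone examples can share a cluster while the third does not, here is a toy comparison key. It is an illustration only and not the QKB match code algorithm: `toy_phone_key` is a hypothetical function, and it assumes that a leading 0 is a domestic trunk prefix that corresponds to Malaysia's country code 60.

```python
import re


def toy_phone_key(phone: str) -> str:
    """Reduce a phone number to digits in a single international form."""
    digits = re.sub(r"\D", "", phone)  # drop "+", "-", spaces, parentheses
    # Assumption: a leading 0 is the domestic trunk prefix; swap in country code 60.
    if digits.startswith("0"):
        digits = "60" + digits[1:]
    return digits


for number in ["+603-64219754", "60364219754", "03-79901655"]:
    print(number, "->", toy_phone_key(number))
```

The first two inputs reduce to the same key, and the third reduces to a different one, which mirrors the cluster IDs in the table.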
Postal Code | ||
---|---|---|
Description | The Postal Code match definition generates match codes which can be used to cluster records containing postal codes. | |
Max Length of Match Code | 15 characters | |
Input | Cluster ID | |
Examples | -46200 | 0 |
46200 | 0 | |
50450 | 1 | |
Remarks |
Address | |||
---|---|---|---|
Description | The Address parse definition parses addresses into a set of tokens. | ||
Output Tokens | Unit Number, Building Name, Lot Number, Street Type, Street Name, Additional Street Name, Primary Neighborhood, Secondary Neighborhood | ||
Input | Output Token | Output | |
Example | 5A Wisma Maria, 2, Jalan Kuchai 3, Taman Lian Hoe | Unit Number | 5A |
Building Name | Wisma Maria | ||
Lot Number | 2 | ||
Street Type | Jalan | ||
Street Name | Kuchai 3 | ||
Additional Street Name | |||
Primary Neighborhood | Taman Lian Hoe | ||
Secondary Neighborhood | |||
Remarks |
Address (Global) | |||
---|---|---|---|
Description | The Address (Global) parse definition parses addresses into a globally recognized set of tokens. | ||
Output Tokens | Recipient, Building/Site, Street, Extension, PO Box, Additional Info | ||
Input | Output Token | Output | |
Example | 5A Wisma Maria, 2, Jalan Kuchai 3, Taman Lian Hoe | Recipient | |
Building/Site | Wisma Maria | ||
Street | 2, Jalan Kuchai 3 | ||
Extension | 5A | ||
PO Box | |||
Additional Info | Taman Lian Hoe | ||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
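Because the Global token set is the same in every locale, parsed results can share one record layout. The sketch below stores the example above in such a layout. The `GlobalAddress` class and its field names are hypothetical, and the `locale` value is an illustrative label taken from this page's document ID.

```python
from dataclasses import dataclass, asdict


@dataclass
class GlobalAddress:
    """One row of a shared table; fields mirror the Address (Global) tokens."""
    locale: str  # illustrative label, not a QKB output token
    recipient: str = ""
    building_site: str = ""
    street: str = ""
    extension: str = ""
    po_box: str = ""
    additional_info: str = ""


# Token values copied from the Address (Global) example table.
row = GlobalAddress(
    locale="MSMYS",
    building_site="Wisma Maria",
    street="2, Jalan Kuchai 3",
    extension="5A",
    additional_info="Taman Lian Hoe",
)
print(asdict(row))
```

Rows parsed with the Address (Global) definition of another locale would fill the same fields, so no locale-specific columns are needed.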
City - State/Province - Postal Code | |||
---|---|---|---|
Description | The City - State/Province - Postal Code parse definition parses last line address information into a set of tokens. | ||
Output Tokens | Postal Code, Neighborhood, City, State | ||
Input | Output Token | Output | |
Example | 47400 Petaling Jaya Selangor Darul Ehsan | Postal Code | 47400 |
Neighborhood | |||
City | Petaling Jaya | ||
State | Selangor Darul Ehsan | ||
Remarks |
City - State/Province - Postal Code (Global) | |||
---|---|---|---|
Description | The City - State/Province - Postal Code (Global) parse definition parses last line address information into a globally recognized set of tokens. | ||
Output Tokens | City, State/Province, Postal Code, Additional Info | ||
Input | Output Token | Output | |
Example | 47400 Petaling Jaya Selangor Darul Ehsan | City | Petaling Jaya |
State/Province | Selangor Darul Ehsan | ||
Postal Code | 47400 | ||
Additional Info | |||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Name | |||
---|---|---|---|
Description | The Name parse definition parses names of individuals into a set of tokens. | ||
Output Tokens | Prefix, Given Name, Middle Name, Family Name, Suffix, Title/Additional Info | ||
Input | Output Token | Output | |
Example 1 | Miss Yong Mei-Lin | Prefix | Miss |
Given Name | Mei-Lin | ||
Middle Name | |||
Family Name | Yong | ||
Suffix | |||
Title/Additional Info | |||
Input | Output Token | Output | |
Example 2 | En Burhan Basir a/l Asmi Basir | Prefix | En |
Given Name | Burhan Basir | ||
Middle Name | |||
Family Name | a/l Asmi Basir | ||
Suffix | |||
Title/Additional Info | |||
Remarks |
Name (Global) | |||
---|---|---|---|
Description | The Name (Global) parse definition parses names of individuals into a globally recognized set of tokens. | ||
Output Tokens | Prefix, Given Name, Middle Name, Family Name, Suffix, Title/Additional Info | ||
Input | Output Token | Output | |
Example 1 | Miss Yong Mei-Lin | Prefix | Miss |
Given Name | Mei-Lin | ||
Middle Name | |||
Family Name | Yong | ||
Suffix | |||
Title/Additional Info | |||
Input | Output Token | Output | |
Example 2 | En Burhan Basir a/l Asmi Basir | Prefix | En |
Given Name | Burhan Basir | ||
Middle Name | |||
Family Name | a/l Asmi Basir | ||
Suffix | |||
Title/Additional Info | |||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Organization | |||
---|---|---|---|
Description | The Organization parse definition parses organization names into a set of tokens. | ||
Output Tokens | Name, Legal Form, Registration Number, Site, Additional Info | ||
Input | Output Token | Output | |
Example | SAS Institute Sdn Bhd | Name | SAS Institute |
Legal Form | Sdn Bhd | ||
Registration Number | |||
Site | |||
Additional Info | |||
Remarks |
Phone | |||
---|---|---|---|
Description | The Phone parse definition parses phone numbers into a set of tokens. | ||
Output Tokens | Prefix, Country Code, Area Code, Base Number, Extension | ||
Input | Output Token | Output | |
Example | (603) 7981-4655 | Prefix | |
Country Code | 60 | ||
Area Code | 3 | ||
Base Number | 7981-4655 | ||
Extension | |||
Remarks |
Phone (Global) | |||
---|---|---|---|
Description | The Phone (Global) parse definition parses phone numbers into a globally recognized set of tokens. | ||
Output Tokens | Country Code, Area Code, Base Number, Extension, Line Type, Additional Info | ||
Input | Output Token | Output | |
Example | (603) 7981-4655 | Country Code | 60 |
Area Code | 3 | ||
Base Number | 7981 4655 | ||
Extension | |||
Line Type | |||
Additional Info | |||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Pattern Analysis Definitions: None.
Address | ||
---|---|---|
Description | The Address standardization definition standardizes addresses. | |
Input | Output | |
Examples | 2, Jln Kuchai 3 Tmn Lian Hoe | 2, Jalan Kuchai 3, Taman Lian Hoe |
jln ss15/5a | Jalan SS15/5a | |
Remarks |
City | ||
---|---|---|
Description | The City standardization definition standardizes city names. | |
Input | Output | |
Examples | kl | Kuala Lumpur |
Jhr. | Johor | |
Remarks | Common city abbreviations are expanded into full names. |
City - State/Province - Postal Code | ||
---|---|---|
Description | The City - State/Province - Postal Code standardization definition standardizes last line address information. | |
Input | Output | |
Example | KL 54000, KL | 54000 Kuala Lumpur, Wilayah Persekutuan |
Remarks |
Name | ||
---|---|---|
Description | The Name standardization definition standardizes names of individuals. | |
Input | Output | |
Examples | Miss Mei-Lin Yong | Cik Mei-Lin Yong |
EN DIVYENDU EKNATH | Encik Divyendu Eknath | |
Remarks |
Organization | ||
---|---|---|
Description | The Organization standardization definition standardizes organization names. | |
Input | Output | |
Examples | SAS Institute Sdn Berhad | SAS Institute Sdn Bhd |
B.I.M.I.T. COLLEGE | BIMIT College | |
Remarks |
Phone | ||
---|---|---|
Description | The Phone standardization definition standardizes phone numbers for domestic use. | |
Input | Output | |
Examples | (603) 7981-4655 | +603 79814655 |
+6 019-8222123 | +6019 8222123 | |
Remarks |
Postal Code | ||
---|---|---|
Description | The Postal Code standardization definition standardizes postal codes. | |
Input | Output | |
Examples | 50450. | 50450 |
-50450 | 50450 | |
Remarks |
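The two Postal Code examples above both strip stray punctuation around a five-digit code. A toy approximation of that cleanup is sketched below. It is not the QKB definition: `toy_standardize_postal_code` is a hypothetical function, and it assumes the input contains a standalone run of exactly five digits.

```python
import re


def toy_standardize_postal_code(text: str) -> str:
    """Return the first standalone five-digit run, or the trimmed input if none."""
    match = re.search(r"(?<!\d)\d{5}(?!\d)", text)
    return match.group(0) if match else text.strip()


print(toy_standardize_postal_code("50450."))
print(toy_standardize_postal_code("-50450"))
```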
In addition to the definitions listed on this page, the Malay, Malaysia locale also inherits all definitions for the Malay language and all Global definitions.
Doc ID: QKBCI_MSMYS_defs.html |