SAS Quality Knowledge Base for Contact Information 26
Definitions for the Korean, South Korea locale are described below.
Case Definitions
Extraction Definitions
Gender Analysis Definitions
Identification Analysis Definitions
Match Definitions
Parse Definitions
Pattern Analysis Definitions
Standardization Definitions
Inherited Definitions
Proper (Name) | ||
---|---|---|
Description | The Proper (Name) case definition applies proper casing to names of individuals written in Latin script. | |
Examples | Input | Output |
ALFRED VON KNORRING | Alfred von Knorring | |
james mcdonald | James McDonald | |
Remarks |
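The two documented examples can be reproduced with a minimal Python sketch. This is not the QKB implementation: the particle list and the "Mc" rule are assumptions chosen only to match the examples above, and `proper_name` is a hypothetical helper name.

```python
# Assumed list of name particles that stay lowercase; the real definition
# may use a different (and much larger) vocabulary.
_LOWER_PARTICLES = {"von", "van", "de"}


def proper_name(name: str) -> str:
    """Apply proper casing to a Latin-script personal name (illustrative only)."""
    words = []
    for word in name.split():
        lower = word.lower()
        if lower in _LOWER_PARTICLES:
            # Particles such as "von" are kept in lowercase.
            words.append(lower)
        elif lower.startswith("mc") and len(lower) > 2:
            # Assumed rule: capitalize the letter that follows "Mc".
            words.append("Mc" + lower[2:].capitalize())
        else:
            words.append(lower.capitalize())
    return " ".join(words)


print(proper_name("ALFRED VON KNORRING"))
print(proper_name("james mcdonald"))
```

A lookup-based approach like this is easy to extend, but it will miscase names that are not covered by the assumed rules.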
Proper (Organization) | ||
---|---|---|
Description | The Proper (Organization) case definition applies proper casing to organization names written in Latin script. | |
Examples | Input | Output |
JOHNSON and JOHNSON | Johnson and Johnson | |
sas institute korea | SAS Institute Korea | |
Remarks |
Extraction Definitions: None.
Social Security Number | ||
---|---|---|
Description | The Social Security Number gender analysis definition determines the gender of an individual based on their social security number. | |
Possible Outputs | M, F, U | |
Examples | Input | Output |
070614-3234563 | M | |
771114-2123458 | F | |
800101-5123456 | U | |
Remarks |
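As a rough illustration of how such an analysis could work, the sketch below reads the first digit after the six-digit birthdate. Only the mappings 3 to M, 2 to F, and 5 to U are confirmed by the examples above; treating 1 as M and 4 as F is an assumption, and `gender_from_ssn` is a hypothetical name, not part of the QKB.

```python
import re

# Assumed digit-to-gender mapping. Only 3 -> M and 2 -> F appear in the
# documented examples; 1 -> M and 4 -> F are assumptions.
_GENDER_BY_DIGIT = {"1": "M", "3": "M", "2": "F", "4": "F"}


def gender_from_ssn(ssn: str) -> str:
    """Return 'M', 'F', or 'U' for a 13-digit number, hyphenated or not."""
    match = re.fullmatch(r"([0-9]{6})-?([0-9]{7})", ssn.strip())
    if match is None:
        return "U"  # malformed input is treated as unknown
    # Any digit outside the assumed mapping (for example 5) yields 'U'.
    return _GENDER_BY_DIGIT.get(match.group(2)[0], "U")


for value in ("070614-3234563", "771114-2123458", "800101-5123456"):
    print(value, gender_from_ssn(value))
```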
E-mail | ||
---|---|---|
Description | The E-mail identification analysis definition determines whether an e-mail address is valid. | |
Possible Outputs | VALID, INVALID | |
Examples | Input | Output |
jqqa5112@hanmail.com | INVALID | |
jes7657@korea.com | VALID | |
Remarks |
Social Security Number | ||
---|---|---|
Description | The Social Security Number identification analysis definition identifies the nationality of an individual based on their social security number. | |
Possible Outputs | FOREIGN, KOREAN, UNKNOWN | |
Examples | Input | Output |
070614-3234563 | KOREAN | |
790220-6106122 | FOREIGN | |
680914-5088712 | UNKNOWN | |
Remarks |
Address | ||
---|---|---|
Description | The Address match definition generates match codes which can be used to cluster records containing addresses. | |
Max Length of Match Code | 47 characters | |
Examples | Input | Cluster ID |
1120번지 우미파크빌 505동802호 | 0 | |
1120 우미파크빌 505-802 | 0 | |
1135번지 옥빛마을@ 1712-304호 | 1 | |
1135 옥빛마을아 1712-304 | 1 | |
Remarks |
|
Address (Apartment Only) | ||
---|---|---|
Description | The Address (Apartment Only) match definition generates match codes which can be used to cluster records containing apartment information. | |
Max Length of Match Code | 47 characters | |
Examples | Input | Cluster ID |
107-44 우민아파트 608호 | 0 | |
우민아.608호 | 0 | |
Remarks | The Address (Apartment Only) match definition ignores street and PO Box information. | |
Address (Full) | ||
---|---|---|
Description | The Address (Full) match definition generates match codes which can be used to cluster records containing complete two-line addresses. | |
Max Length of Match Code | 60 characters | |
Examples | Input | Cluster ID |
서울특별시 강남구 역삼동 701-2번지 삼정개발빌딩 17층 | 0 | |
(135-080) 서울 강남구 역삼동 701-2 삼정개발빌딩 17층 | 0 | |
인천 남동구 남촌동 610-9 남동 공단 36블록 10롯트 | 1 | |
(405-846) 인천 남동구 남촌동 610-9 36블럭 10로트 | 1 | |
Remarks |
|
Address (Full) (Apartment Only) | ||
---|---|---|
Description | The Address (Full) (Apartment Only) match definition generates match codes which can be used to cluster records containing complete two-line addresses that include apartment information. | |
Max Length of Match Code | 60 characters | |
Examples | Input | Cluster ID |
서울 중구 태평로2가 250 삼성본관 18층 | 0 | |
서울중구태평로2가삼성본관빌딩 18층 | 0 | |
Remarks | The Address (Full) (Apartment Only) match definition ignores street and PO Box information. | |
Address (Full) (PO Box Only) | ||
---|---|---|
Description | The Address (Full) (PO Box Only) match definition generates match codes which can be used to cluster records containing complete two-line addresses that include PO Box information. | |
Max Length of Match Code | 35 characters | |
Examples | Input | Cluster ID |
경기도 성남시 수정구 창곡동 사서함 122-11호 | 0 | |
경기도 성남시 수정구 창곡동 우체국 사서함 122-11호 | 0 | |
Remarks | This match definition ignores street and building information. | |
Address (Full) (Street Only) | ||
---|---|---|
Description | The Address (Full) (Street Only) match definition generates match codes which can be used to cluster records containing complete two-line addresses that include street information. | |
Max Length of Match Code | 60 characters | |
Examples | Input | Cluster ID |
서울특별시 강남구 역삼동 701-2번지 삼정개발빌딩 17층 | 0 | |
(135-080) 서울 강남구 역삼동 701-2 삼정개발빌딩 17층 | 0 | |
Remarks | This match definition ignores building and PO Box information. | |
Address (PO Box Only) | ||
---|---|---|
Description | The Address (PO Box Only) match definition generates match codes which can be used to cluster records containing the PO Box portion of an address. | |
Max Length of Match Code | 22 characters | |
Examples | Input | Cluster ID |
사서함 122-11호 | 0 | |
유성우체국 사서함 35호 | 1 | |
사서함 35호 | 1 | |
Remarks | This match definition ignores street and building information. | |
Address (Street Only) | ||
---|---|---|
Description | The Address (Street Only) match definition generates match codes which can be used to cluster records containing the street portion of an address. | |
Max Length of Match Code | 47 characters | |
Examples | Input | Cluster ID |
101-1 우성아파트 12-1002 | 0 | |
101-1번지 우성아 12-1002 | 0 | |
Remarks | This match definition ignores building and PO Box information. | |
City | ||
---|---|---|
Description | The City match definition generates match codes which can be used to cluster records containing city names. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
고양시 덕양구 | 0 | |
광주 | 1 | |
광주시 | 1 | |
Remarks |
|
City - State/Province - Postal Code | ||
---|---|---|
Description | The City - State/Province - Postal Code match definition generates match codes which can be used to cluster records containing last line address information. | |
Max Length of Match Code | 28 characters | |
Examples | Input | Cluster ID |
서울특별시 송파구 신천동 | 0 | |
(138-728) 서울 송파구 신천동 | 0 | |
138-240 서울시 송파구 신천동 | 0 | |
Remarks |
|
Name | ||
---|---|---|
Description | The Name match definition generates match codes which can be used to cluster records containing names of individuals. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
조재용 | 0 | |
조제용 | 0 | |
조재용 귀하 | 0 | |
Remarks |
|
Organization | ||
---|---|---|
Description | The Organization match definition generates match codes which can be used to cluster records containing organization names. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
주식회사현대백화점 | 0 | |
현대백화점(HDSI) | 0 | |
LGT | 1 | |
(주)LG텔레콤 | 1 | |
Remarks |
|
Phone | ||
---|---|---|
Description | The Phone match definition generates match codes which can be used to cluster records containing phone numbers. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
011-9143-0614 | 0 | |
01191430614 | 0 | |
(02)26136711 | 1 | |
Remarks |
|
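The Cluster ID column in the match examples can be read as "records with the same match code land in the same cluster". The sketch below illustrates that idea for phone numbers under a strong simplifying assumption: the match code is taken to be the digits of the input. Real QKB match codes are generated differently, and `phone_match_key` and `cluster_ids` are hypothetical helper names.

```python
from typing import Dict, List


def phone_match_key(phone: str) -> str:
    """Hypothetical stand-in for a match code: keep ASCII digits only."""
    return "".join(ch for ch in phone if "0" <= ch <= "9")


def cluster_ids(values: List[str]) -> List[int]:
    """Give equal keys the same cluster ID, numbered by first appearance."""
    seen: Dict[str, int] = {}
    ids: List[int] = []
    for value in values:
        key = phone_match_key(value)
        if key not in seen:
            seen[key] = len(seen)  # next unused cluster ID
        ids.append(seen[key])
    return ids


phones = ["011-9143-0614", "01191430614", "(02)26136711"]
print(cluster_ids(phones))
```

The same grouping step applies to any of the match definitions in this section; only the function that produces the key would change.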
Postal Code | ||
---|---|---|
Description | The Postal Code match definition generates match codes which can be used to cluster records containing postal codes. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
(135-940) | 0 | |
135940 | 0 | |
Remarks |
|
Social Security Number | ||
---|---|---|
Description | The Social Security Number match definition generates match codes which can be used to cluster records containing Social Security Numbers. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
070614-3234563 | 0 | |
0706143234563 | 0 | |
771114-2123458 | 1 | |
Remarks |
|
Address | |||
---|---|---|---|
Description | The Address parse definition parses addresses into a set of tokens. | ||
Output Tokens | Neighborhood, Village, Island, Street Number, Block/Lot, Building/Apartment Name, Building/Apartment Number, Extension, Additional Info | ||
Example 1 | Input | Output Token | Output |
40 갤러리아 팰리스 B동 2503 | Neighborhood | ||
Village | |||
Island | |||
Street Number | 40 | ||
Block/Lot | |||
Building/Apartment Name | 갤러리아 팰리스 | ||
Building/Apartment Number | B동 2503 | ||
Extension | |||
Additional Info | |||
Example 2 | Input | Output Token | Output |
158번지 두진빌딩4층 (주)P&M | Neighborhood | ||
Village | |||
Island | |||
Street Number | 158번지 | ||
Block/Lot | |||
Building/Apartment Name | 두진빌딩 | ||
Building/Apartment Number | |||
Extension | 4층 | ||
Additional Info | (주)P&M | ||
Remarks |
Address (Full) | |||
---|---|---|---|
Description | The Address (Full) parse definition parses complete two-line addresses into a set of tokens. | ||
Output Tokens | Postal Code, Metropol, City/County/District, Neighborhood, Village, Island, Street Number, Block/Lot, Building/Apartment Name, Building/Apartment Number, Extension, Additional Info | ||
Example 1 | Input | Output Token | Output |
(135-940) 서울 강남구 개포동 대청아파트 301-810 | Postal Code | (135-940) | |
Metropol | 서울 | ||
City/County/District | 강남구 | ||
Neighborhood | 개포동 | ||
Village | |||
Island | |||
Street Number | |||
Block/Lot | |||
Building/Apartment Name | 대청아파트 | ||
Building/Apartment Number | 301-810 | ||
Extension | |||
Additional Info | |||
Example 2 | Input | Output Token | Output |
경기 성남시 분당 정자동29번지 경남빌라 104-201 | Postal Code | ||
Metropol | 경기 | ||
City/County/District | 성남시 분당 | ||
Neighborhood | 정자동 | ||
Village | |||
Island | |||
Street Number | 29번지 | ||
Block/Lot | |||
Building/Apartment Name | 경남빌라 | ||
Building/Apartment Number | 104-201 | ||
Extension | |||
Additional Info | |||
Remarks |
Address (Global) | |||
---|---|---|---|
Description | The Address (Global) parse definition parses addresses into a globally recognized set of tokens. | ||
Output Tokens | Recipient, Building/Site, Street, Extension, PO Box, Additional Info | ||
Example | Input | Output Token | Output |
731번지 한솔아파트 301-201 | Recipient | ||
Building/Site | 한솔아파트 301-201 | ||
Street | 731번지 | ||
Extension | |||
PO Box | |||
Additional Info | |||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
City - State/Province - Postal Code | |||
---|---|---|---|
Description | The City - State/Province - Postal Code parse definition parses last line address information into a set of tokens. | ||
Output Tokens | Postal Code, Metropol, City/County/District, Neighborhood, Village, Island | ||
Example | Input | Output Token | Output |
(138-240) 서울 송파구 신천동 | Postal Code | (138-240) | |
Metropol | 서울 | ||
City/County/District | 송파구 | ||
Neighborhood | 신천동 | ||
Village | |||
Island | |||
Remarks |
E-mail (Multiple Address) | |||
---|---|---|---|
Description | The E-mail (Multiple Address) parse definition parses e-mail addresses that contain one or two addresses into a set of tokens. | ||
Output Tokens | E-mail 1, E-mail 2 | ||
Example 1 | Input | Output Token | Output |
hur_iee@hanmail.net/hur_iee@nhimc.or.kr | E-mail 1 | hur_iee@hanmail.net | |
E-mail 2 | hur_iee@nhimc.or.kr | ||
Example 2 | Input | Output Token | Output |
hush1215@dreamwiz.com | E-mail 1 | hush1215@dreamwiz.com | |
E-mail 2 | |||
Remarks |
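A minimal sketch of the two-token split shown in the examples follows. It assumes that "/" is the only separator, which is all the documented example demonstrates; the actual definition may recognize other delimiters. `parse_two_emails` is a hypothetical name.

```python
from typing import Dict


def parse_two_emails(text: str) -> Dict[str, str]:
    """Split a string holding one or two e-mail addresses into two tokens."""
    # Assumption: "/" separates the addresses, as in the documented example.
    parts = [part.strip() for part in text.split("/", 1)]
    first = parts[0]
    second = parts[1] if len(parts) > 1 else ""
    return {"E-mail 1": first, "E-mail 2": second}


print(parse_two_emails("hur_iee@hanmail.net/hur_iee@nhimc.or.kr"))
print(parse_two_emails("hush1215@dreamwiz.com"))
```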
Name | |||
---|---|---|---|
Description | The Name parse definition parses names of individuals into a set of tokens. | ||
Output Tokens | Family Name, Given Name, Suffix, Title/Additional Info | ||
Example 1 | Input | Output Token | Output |
홍길동 | Family Name | 홍 | |
Given Name | 길동 | ||
Suffix | |||
Title/Additional Info | |||
Example 2 | Input | Output Token | Output |
홍길동귀하 | Family Name | 홍 | |
Given Name | 길동 | ||
Suffix | 귀하 | ||
Title/Additional Info | |||
Remarks |
Name (Global) | |||
---|---|---|---|
Description | The Name (Global) parse definition parses names of individuals into a globally recognized set of tokens. | ||
Output Tokens | Prefix, Given Name, Middle Name, Family Name, Suffix, Title/Additional Info | ||
Example 1 | Input | Output Token | Output |
홍길동귀하 | Prefix | ||
Given Name | 길동 | ||
Middle Name | |||
Family Name | 홍 | ||
Suffix | 귀하 | ||
Title/Additional Info | |||
Example 2 | Input | Output Token | Output |
홍길동대리 | Prefix | ||
Given Name | 길동 | ||
Middle Name | |||
Family Name | 홍 | ||
Suffix | |||
Title/Additional Info | 대리 | ||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Organization | |||
---|---|---|---|
Description | The Organization parse definition parses organization names into a set of tokens. | ||
Output Tokens | Legal Form, Name, Site, Additional Info | ||
Example | Input | Output Token | Output |
(주)한진중공업(건설) | Legal Form | (주) | |
Name | 한진중공업 | ||
Site | |||
Additional Info | 건설 | ||
Remarks |
Organization (Two Organization) | |||
---|---|---|---|
Description | The Organization (Two Organization) parse definition parses strings that contain the names of one or two organizations. | ||
Output Tokens | Org1, Org2 | ||
Example 1 | Input | Output Token | Output |
롯데쇼핑(주)롯데마트 | Org1 | 롯데쇼핑 | |
Org2 | (주)롯데마트 | ||
Example 2 | Input | Output Token | Output |
(주)중소기업은행 | Org1 | (주)중소기업은행 | |
Org2 | |||
Remarks |
Phone | |||
---|---|---|---|
Description | The Phone parse definition parses phone numbers into a set of tokens. | ||
Output Tokens | Prefix, Country Code, Area Code, Exchange, Station, Extension ID, Extension | ||
Example 1 | Input | Output Token | Output |
0234895691 | Prefix | ||
Country Code | |||
Area Code | 02 | ||
Exchange | 3489 | ||
Station | 5691 | ||
Extension ID | |||
Extension | |||
Example 2 | Input | Output Token | Output |
사031-781-4523 | Prefix | 사 | |
Country Code | |||
Area Code | 031 | ||
Exchange | 781 | ||
Station | 4523 | ||
Extension ID | |||
Extension | |||
Remarks |
Phone (Global) | |||
---|---|---|---|
Description | The Phone (Global) parse definition parses phone numbers into a globally recognized set of tokens. | ||
Output Tokens | Country Code, Area Code, Base Number, Extension, Line Type, Additional Info | ||
Example 1 | Input | Output Token | Output |
(02) 3459-8077 | Country Code | ||
Area Code | 02 | ||
Base Number | 3459-8077 | ||
Extension | |||
Line Type | |||
Additional Info | |||
Example 2 | Input | Output Token | Output |
043838 8343 | Country Code | ||
Area Code | 043 | ||
Base Number | 838 8343 | ||
Extension | |||
Line Type | |||
Additional Info | |||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Social Security Number | |||
---|---|---|---|
Description | The Social Security Number parse definition parses Social Security Numbers (SSN) into a set of tokens. | ||
Output Tokens | Birthdate, ID Number | ||
Example 1 | Input | Output Token | Output |
070614-3234563 | Birthdate | 070614 | |
ID Number | 3234563 | ||
Example 2 | Input | Output Token | Output |
7711142123458 | Birthdate | 771114 | |
ID Number | 2123458 | ||
Remarks |
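The split into Birthdate and ID Number shown above can be sketched with a regular expression. This assumes the input is exactly six digits, an optional hyphen, and seven digits; how the real definition handles other input is not documented here. `parse_ssn` is a hypothetical name.

```python
import re
from typing import Dict


def parse_ssn(ssn: str) -> Dict[str, str]:
    """Split a 13-digit number into Birthdate and ID Number tokens."""
    match = re.fullmatch(r"([0-9]{6})-?([0-9]{7})", ssn.strip())
    if match is None:
        # Assumption: unparseable input yields empty tokens.
        return {"Birthdate": "", "ID Number": ""}
    return {"Birthdate": match.group(1), "ID Number": match.group(2)}


print(parse_ssn("070614-3234563"))
print(parse_ssn("7711142123458"))
```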
Pattern Analysis Definitions: None.
Address | ||
---|---|---|
Description | The Address standardization definition standardizes addresses. | |
Examples | Input | Output |
1145 우장산롯데@ 306/1103 | 1145번지 우장산롯데아파트 306-1103 | |
1446-11현대슈퍼빌 아 a동 705호 | 1446-11번지 현대슈퍼빌아파트 A-705 | |
Remarks |
Address (Full) | ||
---|---|---|
Description | The Address (Full) standardization definition standardizes complete two-line addresses. | |
Example | Input | Output |
서울 강남구 대치동1014-3 삼성 아 301동 402호 | 서울시 강남구 대치동 1014-3번지 삼성아파트 301-402 | |
Remarks |
City | ||
---|---|---|
Description | The City standardization definition standardizes city names. | |
Examples | Input | Output |
서울특별시 | 서울시 | |
광주 | 광주시 | |
Remarks |
City - State/Province - Postal Code | ||
---|---|---|
Description | The City - State/Province - Postal Code standardization definition standardizes last line address information. | |
Examples | Input | Output |
서울 송파구 방이동 | 서울시 송파구 방이동 | |
463-070경기 성남 분당구 야탑동 | (463-070) 경기도 성남시 분당구 야탑동 | |
Remarks |
Name | ||
---|---|---|
Description | The Name standardization definition standardizes names of individuals. | |
Examples | Input | Output |
한 국 | 한국 | |
홍길동대리 | 홍길동 | |
Remarks |
Organization | ||
---|---|---|
Description | The Organization standardization definition standardizes organization names. | |
Examples | Input | Output |
주)스타벅스코리아 | 스타벅스코리아(주) | |
엘지텔레콤 | LG텔레콤 | |
Remarks |
Phone | ||
---|---|---|
Description | The Phone standardization definition standardizes phone numbers for domestic use. | |
Examples | Input | Output |
(02) 3459-8077 | 02-3459-8077 | |
011214 0075 | 011-214-0075 | |
Remarks |
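The documented outputs follow an area-exchange-station layout joined by hyphens. The sketch below reproduces the two examples under stated assumptions: "02" is treated as the only two-digit area code, every other area or carrier code is taken to be three digits, and the station is always the last four digits. `standardize_phone` is a hypothetical name, not a QKB function.

```python
def standardize_phone(phone: str) -> str:
    """Format a domestic phone number as area-exchange-station (sketch)."""
    digits = "".join(ch for ch in phone if "0" <= ch <= "9")
    # Assumption: "02" is the only two-digit area code.
    area_len = 2 if digits.startswith("02") else 3
    area, rest = digits[:area_len], digits[area_len:]
    if not 7 <= len(rest) <= 8:
        return phone  # leave unrecognized input unchanged
    # Assumption: the station is always the final four digits.
    return f"{area}-{rest[:-4]}-{rest[-4:]}"


print(standardize_phone("(02) 3459-8077"))
print(standardize_phone("011214 0075"))
```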
Postal Code | ||
---|---|---|
Description | The Postal Code standardization definition standardizes postal codes. | |
Example | Input | Output |
139205 | 139-205 | |
Remarks |
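For the six-digit postal codes used in the examples on this page, the standardized form places a hyphen after the third digit. A minimal sketch, assuming six-digit input and leaving anything else untouched (`standardize_postal_code` is a hypothetical name):

```python
def standardize_postal_code(code: str) -> str:
    """Format a six-digit postal code as NNN-NNN (illustrative only)."""
    digits = "".join(ch for ch in code if "0" <= ch <= "9")
    if len(digits) != 6:
        return code  # assumption: other lengths are returned unchanged
    return f"{digits[:3]}-{digits[3:]}"


print(standardize_postal_code("139205"))
```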
Social Security Number | ||
---|---|---|
Description | The Social Security Number standardization definition standardizes Social Security Numbers. | |
Example | Input | Output |
7711142123458 | 771114-2123458 | |
Remarks |
In addition to the definitions listed on this page, the Korean, South Korea locale also inherits all definitions for the Korean language and all Global definitions.
Doc ID: QKBCI_KOKOR_defs.html |