SAS Quality Knowledge Base for Contact Information 25
Definitions for the Korean, South Korea locale are described below.
Case Definitions
Gender Analysis Definitions
Identification Analysis Definitions
Match Definitions
Parse Definitions
Pattern Analysis Definitions
Standardization Definitions
Inherited Definitions
Proper (Name) | ||
---|---|---|
Description | The Case definition for Proper (Name) applies proper casing to names written in Latin script. | |
Examples | Input | Output |
ALFRED VON KNORRING | Alfred von Knorring | |
james mcdonald | James McDonald | |
Remarks |
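The two examples above suggest that proper casing is more than capitalizing each word: particles such as "von" stay lowercase, and a "Mc" prefix capitalizes the letter that follows it. The sketch below illustrates that idea only. The function name `proper_name` and the particle list `LOWERCASE_PARTICLES` are hypothetical and are not part of the QKB; the vocabulary the real definition uses is not documented on this page.

```python
# Hypothetical particle list chosen for illustration; not taken from the QKB.
LOWERCASE_PARTICLES = {"von", "van", "de", "der", "la"}


def proper_name(text: str) -> str:
    """Apply simple proper casing to a personal name written in Latin script."""
    words = []
    for index, word in enumerate(text.lower().split()):
        if index > 0 and word in LOWERCASE_PARTICLES:
            # Particles inside a name stay lowercase.
            words.append(word)
        elif word.startswith("mc") and len(word) > 2:
            # "mcdonald" becomes "McDonald": capitalize after the prefix.
            words.append("Mc" + word[2:].capitalize())
        else:
            words.append(word.capitalize())
    return " ".join(words)


if __name__ == "__main__":
    for name in ("ALFRED VON KNORRING", "james mcdonald"):
        print(name, "->", proper_name(name))
```

A rule-based approach like this needs a maintained vocabulary to handle exceptions, which is presumably why the definition is delivered as part of a knowledge base.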
Proper (Organization) | ||
---|---|---|
Description | The Case definition for Proper (Organization) applies proper casing to organization names written in Latin script. | |
Examples | Input | Output |
JOHNSON and JOHNSON | Johnson and Johnson | |
sas institute korea | SAS Institute Korea | |
Remarks |
Social Security Number | ||
---|---|---|
Description | The Gender Analysis definition for Social Security Number determines an individual's gender based on a social security number. | |
Possible Outputs | M F U |
|
Examples | Input | Output |
070614-3234563 | M | |
771114-2123458 | F | |
800101-5123456 | U | |
Remarks |
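As a rough illustration of how a gender code can be read from the number, the sketch below inspects the first digit after the six-digit birthdate. The mapping (1 or 3 gives M, 2 or 4 gives F, anything else gives U) is an assumption chosen only because it reproduces the three examples above; the rule the QKB definition actually applies is not documented on this page, and `gender_from_ssn` is a hypothetical name.

```python
import re


def gender_from_ssn(ssn: str) -> str:
    """Return M, F, or U from a 13-digit number, with or without a hyphen."""
    digits = re.sub(r"\D", "", ssn)
    if len(digits) != 13:
        return "U"
    code = digits[6]  # first digit after the six-digit birthdate
    if code in ("1", "3"):
        return "M"  # assumed mapping, consistent with the examples above
    if code in ("2", "4"):
        return "F"
    return "U"


if __name__ == "__main__":
    for value in ("070614-3234563", "771114-2123458", "800101-5123456"):
        print(value, "->", gender_from_ssn(value))
```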
E-mail | ||
---|---|---|
Description | The Identification Analysis definition for E-mail determines whether an e-mail address is valid. | |
Possible Outputs | VALID INVALID |
|
Examples | Input | Output |
jqqa5112@hanmail.com | INVALID | |
jes7657@korea.com | VALID | |
Remarks |
Social Security Number | ||
---|---|---|
Description | The Identification Analysis definition for Social Security Number determines an individual's nationality based on a social security number. | |
Possible Outputs | FOREIGN KOREAN UNKNOWN |
|
Examples | Input | Output |
070614-3234563 | KOREAN | |
790220-6106122 | FOREIGN | |
680914-5088712 | UNKNOWN | |
Remarks |
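One way to arrive at the three outputs above is to validate the number's check digit first and then read the digit that follows the birthdate. The sketch below does that using the commonly described check-digit rule for Korean resident registration numbers (weights 2 through 9, then 2 through 5, modulo 11). Whether the QKB definition works this way is an assumption; the sketch is offered only because it is consistent with the examples shown, and the function names are hypothetical.

```python
import re

# Weights applied to the first twelve digits (commonly described rule).
WEIGHTS = (2, 3, 4, 5, 6, 7, 8, 9, 2, 3, 4, 5)


def check_digit_ok(digits: str) -> bool:
    """Return True when the thirteenth digit matches the weighted checksum."""
    total = sum(int(d) * w for d, w in zip(digits[:12], WEIGHTS))
    return (11 - total % 11) % 10 == int(digits[12])


def nationality_from_ssn(ssn: str) -> str:
    """Return KOREAN, FOREIGN, or UNKNOWN for a 13-digit number."""
    digits = re.sub(r"\D", "", ssn)
    if len(digits) != 13 or not check_digit_ok(digits):
        return "UNKNOWN"
    code = digits[6]  # first digit after the six-digit birthdate
    if code in "1234":
        return "KOREAN"  # assumed mapping
    if code in "5678":
        return "FOREIGN"  # assumed mapping
    return "UNKNOWN"


if __name__ == "__main__":
    for value in ("070614-3234563", "790220-6106122", "680914-5088712"):
        print(value, "->", nationality_from_ssn(value))
```

Under these assumptions the third example is UNKNOWN because its check digit does not match, not because of the digit after the birthdate.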
Address | ||
---|---|---|
Description | The Address match definition generates match codes which can be used to cluster records containing addresses. | |
Max Length of Match Code | 47 characters | |
Examples | Input | Cluster ID |
1120번지 우미파크빌 505동802호 | 0 | |
1120 우미파크빌 505-802 | 0 | |
1135번지 옥빛마을@ 1712-304호 | 1 | |
1135 옥빛마을아 1712-304 | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Address (Apartment Only) | ||
---|---|---|
Description | The Address (Apartment Only) match definition generates match codes which can be used to cluster records containing apartment information. | |
Max Length of Match Code | 47 characters | |
Examples | Input | Cluster ID |
107-44 우민아파트 608호 | 0 | |
우민아.608호 | 0 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
|
The Match definition for Address (Apartment Only) ignores street and PO Box information. |
Address (Full) | ||
---|---|---|
Description | The Address (Full) match definition generates match codes which can be used to cluster records containing complete two-line addresses. | |
Max Length of Match Code | 60 characters | |
Examples | Input | Cluster ID |
서울특별시 강남구 역삼동 701-2번지 삼정개발빌딩 17층 | 0 | |
(135-080) 서울 강남구 역삼동 701-2 삼정개발빌딩 17층 | 0 | |
인천 남동구 남촌동 610-9 남동 공단 36블록 10롯트 | 1 | |
(405-846) 인천 남동구 남촌동 610-9 36블럭 10로트 | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Address (Full) (Apartment Only) | ||
---|---|---|
Description | The Address (Full) (Apartment Only) match definition generates match codes which can be used to cluster records containing complete two-line addresses that include apartment information. | |
Max Length of Match Code | 60 characters | |
Examples | Input | Cluster ID |
서울 중구 태평로2가 250 삼성본관 18층 | 0 | |
서울중구태평로2가삼성본관빌딩 18층 | 0 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
|
The Match definition for Address (Full) (Apartment Only) ignores street and PO Box information. |
Address (Full) (PO Box Only) | ||
---|---|---|
Description | The Address (Full) (PO Box Only) match definition generates match codes which can be used to cluster records containing complete two-line addresses that include PO Box information. | |
Max Length of Match Code | 35 characters | |
Examples | Input | Cluster ID |
경기도 성남시 수정구 창곡동 사서함 122-11호 | 0 | |
경기도 성남시 수정구 창곡동 우체국 사서함 122-11호 | 0 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
|
This match definition ignores street and building information. |
Address (Full) (Street Only) | ||
---|---|---|
Description | The Address (Full) (Street Only) match definition generates match codes which can be used to cluster records containing complete two-line addresses that include street information. | |
Max Length of Match Code | 60 characters | |
Examples | Input | Cluster ID |
서울특별시 강남구 역삼동 701-2번지 삼정개발빌딩 17층 | 0 | |
(135-080) 서울 강남구 역삼동 701-2 삼정개발빌딩 17층 | 0 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
|
This match definition ignores building and PO Box information. |
Address (PO Box Only) | ||
---|---|---|
Description | The Address (PO Box Only) match definition generates match codes which can be used to cluster records containing the PO Box portion of an address. | |
Max Length of Match Code | 22 characters | |
Examples | Input | Cluster ID |
사서함 122-11호 | 0 | |
유성우체국 사서함 35호 | 1 | |
사서함 35호 | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
|
This match definition ignores street and building information. |
Address (Street Only) | ||
---|---|---|
Description | The Address (Street Only) match definition generates match codes which can be used to cluster records containing the street portion of an address. | |
Max Length of Match Code | 47 characters | |
Examples | Input | Cluster ID |
101-1 우성아파트 12-1002 | 0 | |
101-1번지 우성아 12-1002 | 0 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
|
This match definition ignores building and PO Box information. |
City | ||
---|---|---|
Description | The City match definition generates match codes which can be used to cluster records containing city names. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
고양시 덕양구 | 0 | |
광주 | 1 | |
광주시 | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
City - State/Province - Postal Code | ||
---|---|---|
Description | The City - State/Province - Postal Code match definition generates match codes which can be used to cluster records containing last line address information. | |
Max Length of Match Code | 28 characters | |
Examples | Input | Cluster ID |
서울특별시 송파구 신천동 | 0 | |
(138-728) 서울 송파구 신천동 | 0 | |
138-240 서울시 송파구 신천동 | 0 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Name | ||
---|---|---|
Description | The Name match definition generates match codes which can be used to cluster records containing names of individuals. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
조재용 | 0 | |
조제용 | 0 | |
조재용 귀하 | 0 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Organization | ||
---|---|---|
Description | The Organization match definition generates match codes which can be used to cluster records containing organization names. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
주식회사현대백화점 | 0 | |
현대백화점(HDSI) | 0 | |
LGT | 1 | |
(주)LG텔레콤 | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Phone | ||
---|---|---|
Description | The Phone match definition generates match codes which can be used to cluster records containing phone numbers. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
011-9143-0614 | 0 | |
01191430614 | 0 | |
(02)26136711 | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
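The clustering idea behind the examples above can be sketched as follows: reduce each value to a comparison key, then give records with the same key the same cluster ID. The digits-only key used here is a stand-in, not the QKB match code; real match codes are encoded strings whose content depends on the match sensitivity. The function name `cluster_by_digits` is hypothetical.

```python
import re


def cluster_by_digits(values):
    """Assign cluster IDs in first-seen order, keyed on the digits of each value."""
    cluster_ids = {}
    result = []
    for value in values:
        key = re.sub(r"\D", "", value)  # stand-in for a match code
        # A new key receives the next unused cluster ID.
        result.append(cluster_ids.setdefault(key, len(cluster_ids)))
    return result


if __name__ == "__main__":
    phones = ["011-9143-0614", "01191430614", "(02)26136711"]
    for phone, cluster in zip(phones, cluster_by_digits(phones)):
        print(phone, "->", cluster)
```

The same key-then-group pattern applies to the other match definitions on this page; only the way the key is derived differs.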
Postal Code | ||
---|---|---|
Description | The Postal Code match definition generates match codes which can be used to cluster records containing postal codes. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
(135-940) | 0 | |
135940 | 0 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Social Security Number | ||
---|---|---|
Description | The Social Security Number match definition generates match codes which can be used to cluster records containing Social Security Numbers. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
070614-3234563 | 0 | |
0706143234563 | 0 | |
771114-2123458 | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Address | |||
---|---|---|---|
Description | The Parse definition for Address parses addresses. | ||
Output Tokens | Neighborhood Village Island Street Number Block/Lot Building/Apartment Name Building/Apartment Number Extension Additional Info |
||
Example 1 | Input | Output | |
40 갤러리아 팰리스 B동 2503 | Neighborhood | ||
Village | |||
Island | |||
Street Number | 40 | ||
Block/Lot | |||
Building/Apartment Name | 갤러리아 팰리스 | ||
Building/Apartment Number | B동 2503 | ||
Extension | |||
Additional Info | |||
Example 2 | Input | Output | |
158번지 두진빌딩4층 (주)P&M | Neighborhood | ||
Village | |||
Island | |||
Street Number | 158번지 | ||
Block/Lot | |||
Building/Apartment Name | 두진빌딩 | ||
Building/Apartment Number | |||
Extension | 4층 | ||
Additional Info | (주)P&M | ||
Remarks |
Address (Full) | |||
---|---|---|---|
Description | The Parse definition for Address (Full) parses complete two-line addresses. | ||
Output Tokens | Postal Code Metropol City/County/District Neighborhood Village Island Street Number Block/Lot Building/Apartment Name Building/Apartment Number Extension Additional Info |
||
Example 1 | Input | Output | |
(135-940) 서울 강남구 개포동 대청아파트 301-810 | Postal Code | (135-940) | |
Metropol | 서울 | ||
City/County/District | 강남구 | ||
Neighborhood | 개포동 | ||
Village | |||
Island | |||
Street Number | |||
Block/Lot | |||
Building/Apartment Name | 대청아파트 | ||
Building/Apartment Number | 301-810 | ||
Extension | |||
Additional Info | |||
Example 2 | Input | Output | |
경기 성남시 분당 정자동29번지 경남빌라 104-201 | Postal Code | ||
Metropol | 경기 | ||
City/County/District | 성남시 분당 | ||
Neighborhood | 정자동 | ||
Village | |||
Island | |||
Street Number | 29번지 | ||
Block/Lot | |||
Building/Apartment Name | 경남빌라 | ||
Building/Apartment Number | 104-201 | ||
Extension | |||
Additional Info | |||
Remarks |
Address (Global) | |||
---|---|---|---|
Description |
The Address (Global) parse definition parses addresses into a globally recognized set of tokens. |
||
Output Tokens | Recipient Building/Site Street Extension PO Box Additional Info |
||
Example | Input | Output | |
731번지 한솔아파트 301-201 | Recipient | ||
Building/Site | 한솔아파트 301-201 | ||
Street | 731번지 | ||
Extension | |||
PO Box | |||
Additional Info | |||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. | ||
The Address (Global) (v23) parse definition is deprecated and will be removed in a future release of the QKB. The Address (Global) parse definition has been replaced with a copy of the Address (Global) (v23) definition, which takes advantage of the new tokens and updated processing. If you changed your jobs to use Address (Global) (v23), change them back to Address (Global). |
Address (Global) (v23) | |||
---|---|---|---|
Description |
The Address (Global) (v23) parse definition parses addresses into a globally recognized set of tokens. |
||
Output Tokens | Recipient Building/Site Street Extension PO Box Additional Info |
||
Example | Input | Output | |
731번지 한솔아파트 301-201 | Recipient | ||
Building/Site | 한솔아파트 301-201 | ||
Street | 731번지 | ||
Extension | |||
PO Box | |||
Additional Info | |||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. | ||
The Address (Global) (v23) parse definition is deprecated and will be removed in a future release of the QKB. The Address (Global) parse definition has been replaced with a copy of the Address (Global) (v23) definition, which takes advantage of the new tokens and updated processing. If you changed your jobs to use Address (Global) (v23), change them back to Address (Global). |
City - State/Province - Postal Code | |||
---|---|---|---|
Description | The Parse definition for City - State/Province - Postal Code parses address "last line" data. | ||
Output Tokens | Postal Code Metropol City/County/District Neighborhood Village Island |
||
Example | Input | Output | |
(138-240) 서울 송파구 신천동 | Postal Code | (138-240) | |
Metropol | 서울 | ||
City/County/District | 송파구 | ||
Neighborhood | 신천동 | ||
Village | |||
Island | |||
Remarks |
E-mail (Multiple Address) | |||
---|---|---|---|
Description | The Parse definition for E-mail (Multiple Address) parses strings that contain one or two e-mail addresses. | ||
Output Tokens | E-mail 1 E-mail 2 |
||
Example 1 | Input | Output | |
hur_iee@hanmail.net/hur_iee@nhimc.or.kr | E-mail 1 | hur_iee@hanmail.net | |
E-mail 2 | hur_iee@nhimc.or.kr | ||
Example 2 | Input | Output | |
hush1215@dreamwiz.com | E-mail 1 | hush1215@dreamwiz.com | |
E-mail 2 | |||
Remarks |
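A minimal sketch of the token split shown above, assuming that a slash is the only separator between the two addresses (the only separator that appears in the examples). The function name `parse_two_emails` is hypothetical, and the real definition may recognize other separators.

```python
def parse_two_emails(text: str) -> dict:
    """Split a string into the E-mail 1 and E-mail 2 tokens."""
    parts = [part.strip() for part in text.split("/") if part.strip()]
    return {
        "E-mail 1": parts[0] if len(parts) > 0 else "",
        "E-mail 2": parts[1] if len(parts) > 1 else "",
    }


if __name__ == "__main__":
    print(parse_two_emails("hur_iee@hanmail.net/hur_iee@nhimc.or.kr"))
    print(parse_two_emails("hush1215@dreamwiz.com"))
```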
Name | |||
---|---|---|---|
Description | The Parse definition for Name parses names of individuals. | ||
Output Tokens | Family Name Given Name Suffix Title/Additional Info |
||
Example 1 | Input | Output | |
홍길동 | Family Name | 홍 | |
Given Name | 길동 | ||
Suffix | |||
Title/Additional Info | |||
Example 2 | Input | Output | |
홍길동귀하 | Family Name | 홍 | |
Given Name | 길동 | ||
Suffix | 귀하 | ||
Title/Additional Info | |||
Remarks |
Name (Global) | |||
---|---|---|---|
Description | The Parse definition for Name (Global) parses names of individuals into a globally recognized set of tokens. | ||
Output Tokens | Prefix Given Name Middle Name Family Name Suffix Title/Additional Info |
||
Example 1 | Input | Output | |
홍길동귀하 | Prefix | ||
Given Name | 길동 | ||
Middle Name | |||
Family Name | 홍 | ||
Suffix | 귀하 | ||
Title/Additional Info | |||
Example 2 | Input | Output | |
홍길동대리 | Prefix | ||
Given Name | 길동 | ||
Middle Name | |||
Family Name | 홍 | ||
Suffix | |||
Title/Additional Info | 대리 | ||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Organization | |||
---|---|---|---|
Description | The Parse definition for Organization parses organization names. | ||
Output Tokens | Legal Form Name Site Additional Info |
||
Example | Input | Output | |
(주)한진중공업(건설) | Legal Form | (주) | |
Name | 한진중공업 | ||
Site | |||
Additional Info | 건설 | ||
Remarks |
Organization (Two Organization) | |||
---|---|---|---|
Description | The Parse definition for Organization (Two Organization) parses strings that contain the names of one or two organizations. | ||
Output Tokens | Org1 Org2 |
||
Example 1 | Input | Output | |
롯데쇼핑(주)롯데마트 | Org1 | 롯데쇼핑 | |
Org2 | (주)롯데마트 | ||
Example 2 | Input | Output | |
(주)중소기업은행 | Org1 | (주)중소기업은행 | |
Org2 | |||
Remarks |
Phone | |||
---|---|---|---|
Description | The Parse definition for Phone parses Korean phone numbers. | ||
Output Tokens | Prefix Country Code Area Code Exchange Station Extension ID Extension |
||
Example 1 | Input | Output | |
0234895691 | Prefix | ||
Country Code | |||
Area Code | 02 | ||
Exchange | 3489 | ||
Station | 5691 | ||
Extension ID | |||
Extension | |||
Example 2 | Input | Output | |
사031-781-4523 | Prefix | 사 | |
Country Code | |||
Area Code | 031 | ||
Exchange | 781 | ||
Station | 4523 | ||
Extension ID | |||
Extension | |||
Remarks |
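The two examples above can be reproduced with a small amount of positional logic, sketched below. It assumes that any leading non-digit text is the Prefix, that "02" is the only two-digit area code and all others have three digits, and that the Station is always the last four digits. These rules are inferred from the examples, not taken from the QKB, and the function name `parse_phone` is hypothetical. Country code and extension handling are left out.

```python
import re


def parse_phone(text: str) -> dict:
    """Split a Korean phone number into the tokens listed above."""
    text = text.strip()
    # Leading characters that are neither digits nor "(" are treated as Prefix.
    start = 0
    while start < len(text) and not text[start].isdigit() and text[start] != "(":
        start += 1
    prefix, rest = text[:start], text[start:]
    digits = re.sub(r"\D", "", rest)
    # Assumption: "02" is the only two-digit area code.
    area_length = 2 if digits.startswith("02") else 3
    area, local = digits[:area_length], digits[area_length:]
    return {
        "Prefix": prefix,
        "Country Code": "",
        "Area Code": area,
        "Exchange": local[:-4],
        "Station": local[-4:],
        "Extension ID": "",
        "Extension": "",
    }


if __name__ == "__main__":
    print(parse_phone("0234895691"))
    print(parse_phone("사031-781-4523"))
```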
Phone (Global) | |||
---|---|---|---|
Description | The Parse definition for Phone (Global) parses phone numbers into a globally recognized set of tokens. | ||
Output Tokens | Country Code Area Code Base Number Extension Line Type Additional Info |
||
Example 1 | Input | Output | |
(02) 3459-8077 | Country Code | ||
Area Code | 02 | ||
Base Number | 3459-8077 | ||
Extension | |||
Line Type | |||
Additional Info | |||
Example 2 | Input | Output | |
043838 8343 | Country Code | ||
Area Code | 043 | ||
Base Number | 838 8343 | ||
Extension | |||
Line Type | |||
Additional Info | |||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Social Security Number | |||
---|---|---|---|
Description | The Parse definition for Social Security Number parses social security numbers. | ||
Output Tokens | Birthdate ID Number |
||
Example 1 | Input | Output | |
070614-3234563 | Birthdate | 070614 | |
ID Number | 3234563 | ||
Example 2 | Input | Output | |
7711142123458 | Birthdate | 771114 | |
ID Number | 2123458 | ||
Remarks |
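Both examples above split the number at the same position, so the parse can be sketched as a fixed-width split once separators are removed. The function name `parse_ssn` is hypothetical, and returning empty tokens for input that is not 13 digits long is an assumption; this page does not say how the definition treats malformed input.

```python
import re


def parse_ssn(text: str) -> dict:
    """Split a 13-digit number into the Birthdate and ID Number tokens."""
    digits = re.sub(r"\D", "", text)
    if len(digits) != 13:
        # Assumed behavior for malformed input: return empty tokens.
        return {"Birthdate": "", "ID Number": ""}
    return {"Birthdate": digits[:6], "ID Number": digits[6:]}


if __name__ == "__main__":
    print(parse_ssn("070614-3234563"))
    print(parse_ssn("7711142123458"))
```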
Pattern Analysis Definitions: None.
Address | ||
---|---|---|
Description | The Standardization definition for Address standardizes address data. | |
Examples | Input | Output |
1145 우장산롯데@ 306/1103 | 1145번지 우장산롯데아파트 306-1103 | |
1446-11현대슈퍼빌 아 a동 705호 | 1446-11번지 현대슈퍼빌아파트 A-705 | |
Remarks |
Address (Full) | ||
---|---|---|
Description | The Standardization definition for Address (Full) standardizes complete two-line address data. | |
Example | Input | Output |
서울 강남구 대치동1014-3 삼성 아 301동 402호 | 서울시 강남구 대치동 1014-3번지 삼성아파트 301-402 | |
Remarks |
City | ||
---|---|---|
Description | The Standardization definition for City standardizes city names. | |
Examples | Input | Output |
서울특별시 | 서울시 | |
광주 | 광주시 | |
Remarks |
City - State/Province - Postal Code | ||
---|---|---|
Description | The Standardization definition for City - State/Province - Postal Code standardizes address "last line" data. | |
Examples | Input | Output |
서울 송파구 방이동 | 서울시 송파구 방이동 | |
463-070경기 성남 분당구 야탑동 | (463-070) 경기도 성남시 분당구 야탑동 | |
Remarks |
Name | ||
---|---|---|
Description | The Standardization definition for Name standardizes names of individuals. | |
Examples | Input | Output |
한 국 | 한국 | |
홍길동대리 | 홍길동 | |
Remarks |
Organization | ||
---|---|---|
Description | The Standardization definition for Organization standardizes organization names. | |
Examples | Input | Output |
주)스타벅스코리아 | 스타벅스코리아(주) | |
엘지텔레콤 | LG텔레콤 | |
Remarks |
Phone | ||
---|---|---|
Description | The Standardization definition for Phone standardizes Korean phone numbers. | |
Examples | Input | Output |
(02) 3459-8077 | 02-3459-8077 | |
011214 0075 | 011-214-0075 | |
Remarks |
Postal Code | ||
---|---|---|
Description | The Standardization definition for Postal Code standardizes postal codes. | |
Example | Input | Output |
139205 | 139-205 | |
Remarks |
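The example above amounts to inserting a hyphen after the third digit. A minimal sketch, assuming six-digit postal codes as in the examples on this page and leaving any other input unchanged; the function name `standardize_postal_code` is hypothetical.

```python
import re


def standardize_postal_code(text: str) -> str:
    """Format a six-digit postal code as NNN-NNN."""
    digits = re.sub(r"\D", "", text)
    if len(digits) != 6:
        # Assumed behavior: input that is not six digits is returned as-is.
        return text
    return f"{digits[:3]}-{digits[3:]}"


if __name__ == "__main__":
    print(standardize_postal_code("139205"))
```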
Social Security Number | ||
---|---|---|
Description | The Standardization definition for Social Security Number standardizes social security numbers. | |
Example | Input | Output |
7711142123458 | 771114-2123458 | |
Remarks |
In addition to the definitions listed on this page, the Korean, South Korea locale also inherits all definitions for the Korean language and all Global definitions.
Doc ID: QKBCI_KOKOR_defs.html |