SAS Quality Knowledge Base for Contact Information 25
Definitions for the Polish, Poland locale are described below.
Case Definitions
Gender Analysis Definitions
Identification Analysis Definitions
Match Definitions
Parse Definitions
Pattern Analysis Definitions
Standardization Definitions
Inherited Definitions
Lower (Phone) | ||
---|---|---|
Description | Lowercases text in phone data strings. | |
Example | Input | Output |
08013BBANK | 08013bbank | |
6327958 SOLARIUM | 6327958 solarium | |
7178455 dom | 7178455 dom | |
Remarks |
Proper (Address) | ||
---|---|---|
Description | Propercases street names in address street name token. | |
Examples | Input | Output |
Pl Orlat Lwowskich 1 | Pl Orlat Lwowskich 1 | |
ul Wladyslawa IV 1 | ul Wladyslawa IV 1 | |
Remarks |
Proper (City - State/Province - Postal Code) | ||
---|---|---|
Description | Propercases address "last line" data. | |
Examples | Input | Output |
ZLOTORYJA 59-500 | Zlotoryja 59-500 | |
ZYWIEC 34-300 | Zywiec 34-300 | |
Remarks |
Proper (Name) | ||
---|---|---|
Description | Propercases names. | |
Examples | Input | Output |
JAN SZYMON VON DEKER | Jan Szymon von Deker | |
k kowalski | K Kowalski | |
Remarks |
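The two examples above suggest that most words are capitalized while certain name particles, such as "von", stay lowercase. The following Python sketch illustrates that behavior only. The function name `propercase_name` and the particle list are assumptions made for illustration; they do not reflect the QKB's actual rules or vocabulary.

```python
# Illustrative sketch only; this is not the QKB implementation.
# LOWERCASE_PARTICLES is an assumed, incomplete list of name particles.
LOWERCASE_PARTICLES = {"von", "van", "de"}


def propercase_name(name: str) -> str:
    """Capitalize each word, keeping known particles in lowercase."""
    words = []
    for word in name.split():
        lowered = word.lower()
        if lowered in LOWERCASE_PARTICLES:
            words.append(lowered)
        else:
            # Capitalize each hyphen-separated part (e.g. Nowak-Kowalska).
            words.append("-".join(part.capitalize() for part in lowered.split("-")))
    return " ".join(words)


print(propercase_name("JAN SZYMON VON DEKER"))
print(propercase_name("k kowalski"))
```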
Proper (Organization) | ||
---|---|---|
Description | Propercases organization names. | |
Examples | Input | Output |
gtc rail poland SP. z o.o. | GTC Rail Poland sp. z o.o. | |
XXVIII LICEUM IM. JANA NOWAKA | XXVIII Liceum im. Jana Nowaka | |
Remarks |
Name | ||
---|---|---|
Description | Determines an individual's gender based on a name. | |
Possible Outputs | M, F, U | |
Examples | Input | Output |
Beata Krawczak | F | |
Marcin Giebultowicz | M | |
T. Soszynski | M | |
T. Soszyńska | F | |
Dziecko | U | |
Remarks |
Individual/Organization | ||
---|---|---|
Description | Determines whether a string represents the name of an individual or an organization. | |
Possible Outputs | INDIVIDUAL, ORGANIZATION, UNKNOWN | |
Examples | Input | Output |
Action Sp. z o.o. | ORGANIZATION | |
Janusz Wojdecki | INDIVIDUAL | |
Telekomunikacja Polska Spółka Akcyjna | ORGANIZATION | |
Action Budniak | UNKNOWN | |
Remarks |
Name (Single/Multiple) | ||
---|---|---|
Description | Determines whether a string represents the name of one person or more than one person. | |
Possible Outputs | Single, Multiple | |
Examples | Input | Output |
DEZYDERAT I WYSZESLAW PONIKWIA | Multiple | |
Wyszeslaw Ponikwia | Single | |
KORDIAN GAJEWSKI I HORACY TYRKIEL | Multiple | |
Remarks |
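In the examples above, every input classified as Multiple contains the Polish conjunction "i" ("and") between two names. The Python sketch below uses that single signal to tell the two outputs apart. It is a simplified assumption for illustration, with a hypothetical function name `count_names`, and is not how the QKB definition is implemented.

```python
# Illustrative sketch only; the real definition is not limited to this rule.
def count_names(text: str) -> str:
    """Return "Multiple" if a standalone "i" appears between other words."""
    tokens = text.lower().split()
    # Ignore the first and last token: a conjunction must join two names.
    return "Multiple" if "i" in tokens[1:-1] else "Single"


print(count_names("DEZYDERAT I WYSZESLAW PONIKWIA"))
print(count_names("Wyszeslaw Ponikwia"))
```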
Address | ||
---|---|---|
Description | The Address match definition generates match codes which can be used to cluster records containing addresses. | |
Max Length of Match Code | 16 characters | |
Examples | Input | Cluster ID |
AL. Armii Ludowej 26 | 0 | |
Focus Building Al. Armii Ludowej 26 | 0 | |
ul Dekoracyjna 3 | 1 | |
Dekoracyjna 3 | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
City | ||
---|---|---|
Description | The City match definition generates match codes which can be used to cluster records containing city names. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
Kłodzka | 0 | |
Kłodska | 0 | |
Zielonka | 1 | |
Zialonka | 1 | |
Zielonk | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
City - State/Province - Postal Code | ||
---|---|---|
Description | The City - State/Province - Postal Code match definition generates match codes which can be used to cluster records containing last line address information. | |
Max Length of Match Code | 77 characters | |
Examples | Input | Cluster ID |
00-925 Warszawa, Mazowieckie | 0 | |
44-100 Gliwice, Śląskie | 1 | |
44-100 GLIWICE WOJ. SLASKIE | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Country | ||
---|---|---|
Description | The Country match definition generates match codes which can be used to cluster records containing country names. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
POLSKA | 0 | |
WARSZAWA | 0 | |
PL | 0 | |
BELGIA | 1 | |
BELGIUM | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Name | ||
---|---|---|
Description | The Name match definition generates match codes which can be used to cluster records containing names of individuals. | |
Max Length of Match Code | 22 characters | |
Examples | Input | Cluster ID |
Jerzy Topolski | 0 | |
Pan Jerzy Topolski | 0 | |
Pan Jerzy Cieśliński | 1 | |
Prof. Janusz Siwy | 2 | |
Pan Janusz Sawa | 2 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Organization | ||
---|---|---|
Description | The Organization match definition generates match codes which can be used to cluster records containing organization names. | |
Max Length of Match Code | 35 characters | |
Examples | Input | Cluster ID |
Przedsiębiorstwo Usługowo-Handlowe TOMAI Sp. z o.o. | 0 | |
Przedsiębiorstwo Handlowe TOMI | 0 | |
Polskie Wydawnictwa Profesjonalne | 1 | |
Polskie Wydawnictwa Profesjonalne Sp. z o.o. KiK Konieczny | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Phone | ||
---|---|---|
Description | The Phone match definition generates match codes which can be used to cluster records containing phone numbers. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
815-67-78 | 0 | |
8156778 | 0 | |
kom. 8833377 | 1 | |
tel.komor. 8833377 | 1 | |
służ. 888867875 | 2 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Postal Code | ||
---|---|---|
Description | The Postal Code match definition generates match codes which can be used to cluster records containing postal codes. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
44-100 | 0 | |
44100 | 0 | |
00-925 | 1 | |
00 925 | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
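Records that produce the same match code fall into the same cluster. The Python sketch below illustrates that idea for the postal code examples above, using a digits-only key as a stand-in for the match code. Real match codes are generated by the QKB and depend on the match sensitivity; the key function and the names `postal_key` and `assign_clusters` are assumptions for illustration.

```python
import re


def postal_key(value: str) -> str:
    """Stand-in for a match code (assumption): keep the digits only."""
    return re.sub(r"\D", "", value)


def assign_clusters(values):
    """Give values that share a key the same cluster ID, in first-seen order."""
    cluster_ids = {}
    result = []
    for value in values:
        key = postal_key(value)
        # A new key receives the next unused cluster ID.
        cluster_id = cluster_ids.setdefault(key, len(cluster_ids))
        result.append((value, cluster_id))
    return result


for value, cluster_id in assign_clusters(["44-100", "44100", "00-925", "00 925"]):
    print(value, cluster_id)
```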
Text | ||
---|---|---|
Description | The Text match definition generates match codes which can be used to cluster records containing general text strings. | |
Max Length of Match Code | 20 characters | |
Examples | Input | Cluster ID |
Data Management Studio | 0 | |
Przewodniczący | 1 | |
Przewodnicząca | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Address | |||
---|---|---|---|
Description | Parses addresses into a set of tokens. | ||
Output Tokens | Street Type, Street Name, Building Number, Extension, Additional Info | ||
Example 1 | Input | Output | |
ul. Gdańska 27/31 nr. 4, III piętro | Street Type | ul. | |
Street Name | Gdańska | ||
Building Number | 27/31 | ||
Extension | nr. 4 | ||
Additional Info | III piętro | ||
Example 2 | Input | Output | |
os. Gdańskie 15, blok 1 | Street Type | os. | |
Street Name | Gdańskie | ||
Building Number | 15 | ||
Extension | |||
Additional Info | blok 1 | ||
Remarks |
Address (Global) | |||
---|---|---|---|
Description | The Address (Global) parse definition parses addresses into a globally recognized set of tokens. | ||
Output Tokens | Recipient, Building/Site, Street, Extension, PO Box, Additional Info | ||
Example 1 | Input | Output | |
ul. Gdańska 27/31 nr. 4, III piętro | Recipient | ||
Building/Site | |||
Street | ul. Gdańska 27/31 | ||
Extension | nr. 4 | ||
PO Box | |||
Additional Info | III piętro | ||
Example 2 | Input | Output | |
os. Gdańskie 15, blok 1 | Recipient | ||
Building/Site | |||
Street | os. Gdańskie 15 | ||
Extension | |||
PO Box | |||
Additional Info | blok 1 | ||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. The Address (Global) (v23) parse definition is deprecated and will be removed in a future release of the QKB. The Address (Global) parse definition has been replaced with a copy of the Address (Global) (v23) definition, which takes advantage of the new tokens and updated processing. If you changed your jobs to use Address (Global) (v23), change them back to Address (Global). |
Address (Global) (v23) | |||
---|---|---|---|
Description | The Address (Global) (v23) parse definition parses addresses into a globally recognized set of tokens. | ||
Output Tokens | Recipient, Building/Site, Street, Extension, PO Box, Additional Info | ||
Example 1 | Input | Output | |
ul. Gdańska 27/31 nr. 4, III piętro | Recipient | ||
Building/Site | |||
Street | ul. Gdańska 27/31 | ||
Extension | nr. 4 | ||
PO Box | |||
Additional Info | III piętro | ||
Example 2 | Input | Output | |
os. Gdańskie 15, blok 1 | Recipient | ||
Building/Site | |||
Street | os. Gdańskie 15 | ||
Extension | |||
PO Box | |||
Additional Info | blok 1 | ||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. The Address (Global) (v23) parse definition is deprecated and will be removed in a future release of the QKB. The Address (Global) parse definition has been replaced with a copy of the Address (Global) (v23) definition, which takes advantage of the new tokens and updated processing. If you changed your jobs to use Address (Global) (v23), change them back to Address (Global). |
City - State/Province - Postal Code | |||
---|---|---|---|
Description | Parses address "last line" data into a set of tokens. | ||
Output Tokens | Postal Code, City, Neighboring City, Commune, Region, Province | ||
Example 1 | Input | Output | |
34 550 Olkusz k/Częstochowy, gm. Gąbin | Postal Code | 34550 | |
City | Olkusz | ||
Neighboring City | Częstochowy | ||
Commune | Gąbin | ||
Region | |||
Province | |||
Example 2 | Input | Output | |
44-100 Knurów, Gliwicki, woj. Śląskie | Postal Code | 44-100 | |
City | Knurów | ||
Neighboring City | |||
Commune | |||
Region | Gliwicki | ||
Province | Śląskie | ||
Remarks |
City - State/Province - Postal Code (Global) | |||
---|---|---|---|
Description | Parses address "last line" data into a globally recognized set of tokens. | ||
Output Tokens | City, State/Province, Postal Code, Additional Info | ||
Example 1 | Input | Output | |
34 550 Olkusz k/Częstochowy, gm. Gąbin | City | Olkusz k/Częstochowy | |
State/Province | gm. Gąbin | ||
Postal Code | 34550 | ||
Additional Info | |||
Example 2 | Input | Output | |
44-100 Knurów, Gliwicki, woj. Śląskie | City | Knurów, | |
State/Province | Gliwicki, woj. Śląskie | ||
Postal Code | 44-100 | ||
Additional Info | |||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Name | |||
---|---|---|---|
Description | Parses names of individuals into a set of tokens. | ||
Output Tokens | Prefix/Title, Given Name, Middle Name, Family Name, Suffix | ||
Example 1 | Input | Output | |
Pan Prof. Wojciech M. Kulik | Prefix/Title | Pan Prof. | |
Given Name | Wojciech | ||
Middle Name | M. | ||
Family Name | Kulik | ||
Suffix | |||
Example 2 | Input | Output | |
ks. prow. Jan Szymon von Deker II | Prefix/Title | ks. prow. | |
Given Name | Jan | ||
Middle Name | Szymon | ||
Family Name | von Deker | ||
Suffix | II | ||
Remarks |
Name (Global) | |||
---|---|---|---|
Description | Parses names of individuals into a globally recognized set of tokens. | ||
Output Tokens | Prefix, Given Name, Middle Name, Family Name, Suffix, Title/Additional Info | ||
Example 1 | Input | Output | |
Pan Prof. Wojciech M. Kulik | Prefix | Pan Prof. | |
Given Name | Wojciech | ||
Middle Name | M. | ||
Family Name | Kulik | ||
Suffix | |||
Title/Additional Info | |||
Example 2 | Input | Output | |
dr Janina Nowak-Kowalska | Prefix | dr | |
Given Name | Janina | ||
Middle Name | |||
Family Name | Nowak-Kowalska | ||
Suffix | |||
Title/Additional Info | |||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Name (Multiple Name) | |||
---|---|---|---|
Description | Parses strings that contain the names of two individuals into a set of tokens. | ||
Output Tokens | Name 1, Name 2 | ||
Examples | Input | Output | |
1 | Łukasz Leszewski i Magdalena Pawłowska | Name 1 | Łukasz Leszewski |
Name 2 | Magdalena Pawłowska | ||
2 | Katarzyna Potocka | Name 1 | Katarzyna Potocka |
Name 2 | |||
3* | Łukasz Leszewski i Magdalena Leszewska | Name 1 | Łukasz Leszewski |
Name 2 | Magdalena Leszewska | ||
Remarks | If only one name is present in the input, the first token is used. Because Polish family names use feminine, masculine, and plural variations, strings containing multiple names should be standardized with the Name (Multiple Name) standardization definition before being processed with the Name (Multiple Name) parse definition. Otherwise, the results of the parse may show incorrect family name variations for individual names. * In Example 3, the input is the output of Example 5 of the Name (Multiple Name) standardization definition. The original input was Łukasz i Magdalena Leszewscy. |
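The examples above split the input at the conjunction "i" and place a lone name in the first token. The Python sketch below illustrates that split. It assumes the input has already been standardized as the remarks recommend, handles no other conjunction, and uses a hypothetical function name `parse_multiple_name`; it is not the QKB's parsing logic.

```python
import re


def parse_multiple_name(text: str) -> dict:
    """Split a string of up to two names at the first standalone "i"."""
    parts = re.split(r"\s+i\s+", text.strip(), maxsplit=1, flags=re.IGNORECASE)
    return {
        "Name 1": parts[0],
        # If only one name is present, the second token stays empty.
        "Name 2": parts[1] if len(parts) > 1 else "",
    }


print(parse_multiple_name("Łukasz Leszewski i Magdalena Pawłowska"))
print(parse_multiple_name("Katarzyna Potocka"))
```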
Organization | |||
---|---|---|---|
Description | Parses organization names into a set of tokens. | ||
Output Tokens | Name, Legal Form, Site, Additional Info | ||
Example 1 | Input | Output | |
SAS Sp. z o.o. BUFIN oddział w Warszawie | Name | SAS | |
Legal Form | Sp. z o.o. | ||
Site | oddział w Warszawie | ||
Additional Info | BUFIN | ||
Example 2 | Input | Output | |
DataFlux sc w Górze Kalwarii | Name | DataFlux | |
Legal Form | sc | ||
Site | w Górze Kalwarii | ||
Additional Info | |||
Remarks |
Organization (Global) | |||
---|---|---|---|
Description | Parses organization names into a globally recognized set of tokens. | ||
Output Tokens | Name, Legal Form, Site, Additional Info | ||
Example 1 | Input | Output | |
SAS Sp. z o.o. BUFIN oddział w Warszawie | Name | SAS | |
Legal Form | Sp. z o.o. | ||
Site | oddział w Warszawie | ||
Additional Info | BUFIN | ||
Example 2 | Input | Output | |
DataFlux sc w Górze Kalwarii | Name | DataFlux | |
Legal Form | sc | ||
Site | w Górze Kalwarii | ||
Additional Info | |||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Phone | |||
---|---|---|---|
Description | Parses Polish phone numbers into a set of tokens. | ||
Output Tokens | Prefix, Country Code, Area Code, Base Number, Extension | ||
Example 1 | Input | Output | |
tel. +48 (22) 1234567 w. 89 | Prefix | tel. | |
Country Code | 48 | ||
Area Code | 22 | ||
Base Number | 1234567 | ||
Extension | w. 89 | ||
Example 2 | Input | Output | |
603584911 | Prefix | ||
Country Code | |||
Area Code | |||
Base Number | 603584911 | ||
Extension | |||
Remarks |
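The token layout shown in the two examples above can be approximated with a regular expression. The Python sketch below covers only the shapes that appear in those examples (an optional label, "+" country code, parenthesized area code, and "w." extension). The pattern and the function name `parse_phone` are assumptions for illustration and do not represent the QKB's parsing rules.

```python
import re

# Illustrative pattern only; it covers just the shapes shown in the examples.
PHONE_PATTERN = re.compile(
    r"""
    ^\s*
    (?P<prefix>[^\d+(]*?)\s*         # leading label such as "tel."
    (?:\+(?P<country>\d{1,3})\s*)?   # optional country code after a plus sign
    (?:\((?P<area>\d+)\)\s*)?        # optional area code in parentheses
    (?P<base>\d[\d\s-]*\d)           # base number
    \s*(?P<ext>w\.\s*\d+)?           # optional extension introduced by "w."
    \s*$
    """,
    re.VERBOSE,
)


def parse_phone(text: str) -> dict:
    """Return the five tokens of the Phone parse definition, or {} if no match."""
    match = PHONE_PATTERN.match(text)
    if match is None:
        return {}
    return {
        "Prefix": match.group("prefix") or "",
        "Country Code": match.group("country") or "",
        "Area Code": match.group("area") or "",
        "Base Number": match.group("base"),
        "Extension": match.group("ext") or "",
    }


print(parse_phone("tel. +48 (22) 1234567 w. 89"))
print(parse_phone("603584911"))
```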
Phone (Global) | |||
---|---|---|---|
Description | Parses phone numbers into a globally recognized set of tokens. | ||
Output Tokens | Country Code, Area Code, Base Number, Extension, Line Type, Additional Info | ||
Example 1 | Input | Output | |
tel. +48 (22) 1234567 w. 89 | Country Code | 48 | |
Area Code | 22 | ||
Base Number | 1234567 | ||
Extension | w. 89 | ||
Line Type | tel. | ||
Additional Info | |||
Example 2 | Input | Output | |
603584911 | Country Code | ||
Area Code | |||
Base Number | 603584911 | ||
Extension | |||
Line Type | |||
Additional Info | |||
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Pattern Analysis Definitions: None.
Address | ||
---|---|---|
Description | Standardizes addresses. | |
Example | Input | Output |
ULICA GDAŃSKA 27-31 | ul. Gdańska 27/31 | |
Trabki ul Osadnicza 8 | ul. Osadnicza 8, Trabki | |
ul Chlodna 51 XVI Floor | ul. Chłodna 51, XVI Floor | |
Remarks |
City | ||
---|---|---|
Description | Standardizes city names. | |
Example | Input | Output |
Lubiana k Koscierzny | Łubiana k Koscierzny | |
Grodzisk Mazowiecki | Grodzisk Mazowiecki | |
SRODA SLASKA | Środa Śląska | |
Remarks |
City - State/Province - Postal Code | ||
---|---|---|
Description | Standardizes address "last line" data. | |
Examples | Input | Output |
Pila 64920 | 64-920 Piła | |
Skarzysko-Kamienna 26-110 | 26-110 Skarżysko-Kamienna | |
Wegierska Górka 34-350 | 34-350 Węgierska Górka | |
Remarks |
Country | ||
---|---|---|
Description | Standardizes country data. | |
Examples | Input | Output |
Niemcy | Niemcy | |
PL | Polska | |
NETHERLANDS | Holandia | |
Remarks |
Country (ISO 2 char) | ||
---|---|---|
Description | Standardizes Country data into ISO 2 Char notation. | |
Examples | Input | Output |
Dania | DK | |
CZECHY | CZ | |
LEGIONOWO | PL | |
Remarks |
Name | ||
---|---|---|
Description | Standardizes names of individuals. | |
Examples | Input | Output |
INZYNIER LUKASZ KOWALKOWSKI | inż. Łukasz Kowalkowski | |
bartos, adam | Adam Bartos | |
Remarks |
Name (Multiple Name) | ||
---|---|---|
Description | Standardizes input data that contains two names. | |
Examples | Input | Output |
DEZYDERAT I WYSZESLAW PONIKWIA | Dezyderat Ponikwia i Wyszeslaw Ponikwia | |
CIESZYSLAW I LESLAW NOWINSCY | Cieszyslaw Nowinski i Leslaw Nowinski | |
KORDIAN GAJEWSKI I HORACY TYRKIEL | Kordian Gajewski i Horacy Tyrkiel | |
LONGIN ZALEWSKA I GEMMA WITKOWSKI | Longin Zalewska i Gemma Witkowski | |
Łukasz i Magdalena Leszewscy | Łukasz Leszewski i Magdalena Leszewska | |
Remarks | This definition splits plural variations of Polish family names into individual feminine and/or masculine variations. It should be used to standardize strings containing multiple names before the Name (Multiple Name) parse definition is used to parse those strings. |
Organization | ||
---|---|---|
Description | Standardizes organization names. | |
Example | Input | Output |
Y M C D SA | YMCD S.A. | |
Amazon.COM | Amazon.com | |
A.T.W.Products Sp. z o.o. | ATW Products sp. z o.o. | |
Remarks |
Phone | ||
---|---|---|
Description | Standardizes phone numbers. | |
Example | Input | Output |
868-58-16 | 868 58 16 | |
0501-712-050 | 501 712 050 | |
6337928 | 633 79 28 | |
Remarks |
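The three examples above are consistent with stripping punctuation, dropping a leading 0, and regrouping the digits as 3-2-2 (seven digits) or 3-3-3 (nine digits). The Python sketch below reproduces only that pattern. The grouping rules, the treatment of the leading 0, and the function name `standardize_phone` are assumptions inferred from the examples, not the QKB's documented logic.

```python
import re


def standardize_phone(text: str) -> str:
    """Regroup a 7- or 9-digit number with spaces (inferred from the examples)."""
    digits = re.sub(r"\D", "", text)
    # Assumption: a leading 0 on a 10-digit number is a prefix to drop.
    if len(digits) == 10 and digits.startswith("0"):
        digits = digits[1:]
    if len(digits) == 7:
        groups = (digits[:3], digits[3:5], digits[5:])
    elif len(digits) == 9:
        groups = (digits[:3], digits[3:6], digits[6:])
    else:
        # Leave unrecognized lengths untouched.
        return text.strip()
    return " ".join(groups)


print(standardize_phone("0501-712-050"))
```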
Postal Code | ||
---|---|---|
Description | Standardizes postal codes. | |
Example | Input | Output |
12345 | 12-345 | |
11/234 | 11-234 | |
990-00 | 99-000 | |
9-9000 | 99-000 | |
12-123 | 12-123 | |
Remarks |
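All five examples above can be reproduced by keeping only the digits and writing them as NN-NNN. The Python sketch below shows that approach; it is an inference from the examples, with a hypothetical function name `standardize_postal_code`, and is not the QKB's implementation.

```python
import re


def standardize_postal_code(text: str) -> str:
    """Format five digits as NN-NNN; return other input unchanged."""
    digits = re.sub(r"\D", "", text)
    if len(digits) != 5:
        return text.strip()
    return f"{digits[:2]}-{digits[2:]}"


for raw in ["12345", "11/234", "990-00", "9-9000", "12-123"]:
    print(raw, "->", standardize_postal_code(raw))
```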
In addition to the definitions listed on this page, the Polish, Poland locale also inherits all definitions for the Polish language and all Global definitions.
Doc ID: QKBCI_PLPOL_defs.html |