SAS Quality Knowledge Base for Contact Information 25
Definitions for the Hungarian, Hungary locale are described below.
Case Definitions
Gender Analysis Definitions
Identification Analysis Definitions
Match Definitions
Parse Definitions
Pattern Analysis Definitions
Standardization Definitions
Inherited Definitions
Proper (City) | ||
---|---|---|
Description | The Case definition for Proper (City) propercases city names. | |
Examples | Input | Output |
BUDAPEST | Budapest | |
Szeged-tápé | Szeged-Tápé | |
Remarks |
Proper (Name) | ||
---|---|---|
Description | The Case definition for Proper (Name) propercases names. | |
Examples | Input | Output |
Kis-juhász Béla | Kis-Juhász Béla | |
KOVÁCS ISTVÁN | Kovács István | |
Szabó ferenc | Szabó Ferenc | |
Remarks |
Proper (Organization) | ||
---|---|---|
Description | The Case definition for Proper (Organization) propercases organization names. | |
Examples | Input | Output |
ECODEV Kft. | Ecodev Kft. | |
aranyalma bt. | Aranyalma Bt. | |
Remarks |
Name | ||
---|---|---|
Description | The Gender Analysis definition for Name determines an individual's gender based on a name. | |
Possible Outputs | M, F, U |
Examples | Input | Output |
Kovács János | M | |
Kovács Jánosné | F | |
John Smith | M | |
Gálfi G. | U | |
Remarks |
Individual/Organization | ||
---|---|---|
Description | The Individual/Organization identification analysis definition determines whether a string represents the name of an individual or an organization. | |
Possible Outputs | ORGANIZATION, INDIVIDUAL, UNKNOWN |
Examples | Input | Output |
Kovács János | INDIVIDUAL | |
Kovács János Bt. | ORGANIZATION | |
MOL Rt. | ORGANIZATION | |
MOL | UNKNOWN | |
Remarks |
Name | ||
---|---|---|
Description | The Name identification analysis definition determines whether a string represents a correct Hungarian name. | |
Possible Outputs | HUN, FOREIGN, OTHER |
Examples | Input | Output |
Kovács János | HUN | |
Kovács Jánso | FOREIGN | |
John Smith | FOREIGN | |
Szimán | OTHER | |
Remarks | The Identification Analysis definition for Name is used to identify incorrect Hungarian names. It has a limited capability to distinguish between Hungarian and foreign names. A misspelled given name with a common family name might be identified as "FOREIGN" rather than "OTHER" (see the second example). |
Address | ||
---|---|---|
Description | The Address match definition generates match codes which can be used to cluster records containing addresses. | |
Max Length of Match Code | 22 characters | |
Examples | Input | Cluster ID |
Cserje u. 18 fszt. 1. | 0 | |
Bajcsy Zs. u. 21-23 | 1 | |
Bajcsi Zsilinszky. út 21/a. | 1 | |
Nagyvárad tér 1/b. | 2 | |
Remarks |
Diacritics transliteration is applied at a sensitivity level of 84 and below. |
|
Note: The results listed above reflect the default match sensitivity (85). |
Address (Full) | ||
---|---|---|
Description | The Address (Full) match definition generates match codes which can be used to cluster records containing complete two-line addresses. | |
Max Length of Match Code | 41 characters | |
Examples | Input | Cluster ID |
1025 Budapest, Cserje u. 18. | 0 | |
1051 Bp., Bajcsy-Zsilinszky E. u. 21-23. | 1 | |
1051 Budapest, Bajcsy-Zs. E. u. 21-23. | 1 | |
9092 Győr, Arany J. tér 19. | 2 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
City | ||
---|---|---|
Description | The City match definition generates match codes which can be used to cluster records containing city names. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
TÁPÉ | 0 | |
SZEGED-TÁPÉ | 0 | |
BUDAPEST | 1 | |
BUDA-PEST | 1 | |
BP | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
City - State/Province - Postal Code | ||
---|---|---|
Description | The City - State/Province - Postal Code match definition generates match codes which can be used to cluster records containing last line address information. | |
Max Length of Match Code | 19 characters | |
Examples | Input | Cluster ID |
1025 Budapest | 0 | |
1038 Bp | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
ID Number | ||
---|---|---|
Description | The ID Number match definition generates match codes which can be used to cluster records containing ID card numbers. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
AU-VII. 255435 | 0 | |
AU255435 | 0 | |
255435AU | 0 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Name | ||
---|---|---|
Description | The Name match definition generates match codes which can be used to cluster records containing names of individuals. | |
Max Length of Match Code | 20 characters | |
Examples | Input | Cluster ID |
dr. Kovács István | 0 | |
Kováts István | 0 | |
özv. Kovács Istvánné | 0 | |
Kovács Istvánné Szabó Mária | 1 | |
Szabó Mária | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
|
This definition does not produce the same match code for a man's name and a woman's name that contains the man's name only as its marriage name portion. In the examples above, Kovács Istvánné Szabó Mária clusters with Szabó Mária rather than with dr. Kovács István. To match names according to the marriage name, use the Name (Marriage Name Only) match definition. |
Name (Marriage Name Only) | ||
---|---|---|
Description | The Name (Marriage Name Only) match definition generates match codes which can be used to cluster records containing names of individuals, based on the marriage name portion of the name string. | |
Max Length of Match Code | 20 characters | |
Examples | Input | Cluster ID |
özv. Kovács Istvánné | 0 | |
Kovács Istvánné Mária | 0 | |
Kovács Istvánné Szabó Mária | 0 | |
Kovács Istvánné Judit | 0 | |
Bakó Istvánné Szabó Mária | 1 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
|
This definition might produce false positives: strings representing two women married to men of the same name produce the same match code (for example, Kovács Istvánné Mária and Kovács Istvánné Judit above). Also, it does not match two names whose marriage names differ, even if other portions of the names are identical (for example, Kovács Istvánné Szabó Mária and Bakó Istvánné Szabó Mária above). |
Organization | ||
---|---|---|
Description | The Organization match definition generates match codes which can be used to cluster records containing organization names. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
Macska 2007 Kereskedelmi és Szolgáltató kft | 0 | |
Macska bt. | 1 | |
Macska Kereskedelmi és Szolgáltató kft | 1 | |
R and R Kft. | 2 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Phone | ||
---|---|---|
Description | The Phone match definition generates match codes which can be used to cluster records containing phone numbers. | |
Max Length of Match Code | 16 characters | |
Examples | Input | Cluster ID |
+36-1-326-0573 | 0 | |
326-05-73 | 0 | |
30/9456-455 | 1 | |
06-30-945-6455 | 1 | |
06-70-4532321 | 2 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
Postal Code | ||
---|---|---|
Description | The Postal Code match definition generates match codes which can be used to cluster records containing postal codes. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
1025. | 0 | |
1 0 2 5 | 0 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
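The two Postal Code inputs above differ only in punctuation and spacing. As a toy illustration of why such strings can end up in one cluster, the sketch below reduces each input to its digits. The function `toy_postal_key` is hypothetical and is not the QKB's match-code algorithm; it only happens to agree with the two documented examples.

```python
import re


def toy_postal_key(value: str) -> str:
    """Illustrative key only: drop everything except digits."""
    return re.sub(r"\D", "", value)


# Both documented inputs reduce to the same digits-only key.
keys = [toy_postal_key(v) for v in ("1025.", "1 0 2 5")]
```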
Text | ||
---|---|---|
Description | The Text match definition generates match codes which can be used to cluster records containing general text strings. | |
Max Length of Match Code | 15 characters | |
Examples | Input | Cluster ID |
Dell Számítógép | 0 | |
DELL-SZÁMITOGEP | 0 | |
Remarks |
Note: The results listed above reflect the default match sensitivity (85). |
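Each match definition above is described as generating match codes that can be used to cluster records. The following is a minimal sketch of that clustering step only, not of how the QKB computes match codes. The function name `assign_cluster_ids` and the match-code strings are hypothetical placeholders; real match codes come from the QKB at the chosen sensitivity.

```python
from typing import Dict, List, Tuple


def assign_cluster_ids(records: List[Tuple[str, str]]) -> List[Tuple[str, int]]:
    """Give records with identical match codes the same cluster ID.

    `records` holds (input string, match code) pairs. IDs are numbered
    from 0 in order of first appearance of each match code.
    """
    ids: Dict[str, int] = {}
    clustered = []
    for text, match_code in records:
        # setdefault assigns the next free ID the first time a code is seen.
        cluster_id = ids.setdefault(match_code, len(ids))
        clustered.append((text, cluster_id))
    return clustered


# Placeholder match codes: NOT real QKB output, chosen only so that the
# grouping mirrors the City example table.
sample = [
    ("TÁPÉ", "CODE_A"),
    ("SZEGED-TÁPÉ", "CODE_A"),
    ("BUDAPEST", "CODE_B"),
    ("BUDA-PEST", "CODE_B"),
    ("BP", "CODE_B"),
]
clusters = assign_cluster_ids(sample)
```

With these placeholder codes, the first two inputs receive cluster ID 0 and the remaining three receive cluster ID 1, matching the layout of the example tables.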
Address | |||
---|---|---|---|
Description | The Parse definition for Address parses addresses. | ||
Output Tokens | Street Name, Street Type, Building Number, Extension, Extension Number, Additional Information |
Input | Output | ||
Example 1 | Cserje u. 18 fszt. 1. | Street Name | Cserje |
Street Type | u | ||
Building Number | 18 | ||
Extension | fszt. 1 | ||
Extension Number | |||
Additional Information | |||
Input | Output | ||
Example 2 | József A. ltp. 4/C 1/3 | Street Name | József A |
Street Type | ltp | ||
Building Number | 4/C | ||
Extension | 1/3 | ||
Extension Number | |||
Additional Information | |||
Remarks |
Address (Full) | |||
---|---|---|---|
Description | The Parse definition for Address (Full) parses complete two-line addresses. | ||
Output Tokens | Street Name, Street Type, Building Number, Extension, Extension Number, Postal Code, City, Additional Information |
Input | Output | ||
Example 1 | 2324 Szeged-Tápé, Kossuth u. 13. | Street Name | Kossuth |
Street Type | u | ||
Building Number | 13 | ||
Extension | |||
Extension Number | |||
Postal Code | 2324 | ||
City | Szeged-Tápé | ||
Additional Information | |||
Input | Output | ||
Example 2 | 1038 Budapest, Magyar Televízió PF:138 | Street Name | Magyar Televízió |
Street Type | |||
Building Number | |||
Extension | PF | ||
Extension Number | 138 | ||
Postal Code | 1038 | ||
City | Budapest | ||
Additional Information | |||
Remarks |
Address (Global) | |||
---|---|---|---|
Description | The Address (Global) parse definition parses addresses into a globally recognized set of tokens. | ||
Output Tokens | Recipient, Building/Site, Street, Extension, PO Box, Additional Info |
Input | Output | ||
Example | Cserje u. 18. fszt. 1. | Recipient | |
Building/Site | |||
Street | Cserje u. 18 | ||
Extension | fszt. 1 | ||
PO Box | |||
Additional Info | |||
Remarks |
Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
||
The Address (Global) (v23) parse definition is deprecated and will be removed in a future release of the QKB. The Address (Global) parse definition has been replaced with a copy of Address (Global) (v23), so it now uses the new tokens and updated processing. If you changed your jobs to use Address (Global) (v23), change them back to Address (Global). |
Address (Global) (v23) | |||
---|---|---|---|
Description | The Address (Global) (v23) parse definition parses addresses into a globally recognized set of tokens. | ||
Output Tokens | Recipient, Building/Site, Street, Extension, PO Box, Additional Info |
Input | Output | ||
Example | Cserje u. 18. fszt. 1. | Recipient | |
Building/Site | |||
Street | Cserje u. 18 | ||
Extension | fszt. 1 | ||
PO Box | |||
Additional Info | |||
Remarks |
Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
||
The Address (Global) (v23) parse definition is deprecated and will be removed in a future release of the QKB. The Address (Global) parse definition has been replaced with a copy of Address (Global) (v23), so it now uses the new tokens and updated processing. If you changed your jobs to use Address (Global) (v23), change them back to Address (Global). |
City - State/Province - Postal Code | |||
---|---|---|---|
Description | The Parse definition for City - State/Province - Postal Code parses address "last line" data. | ||
Output Tokens | Postal Code, City |
Input | Output | ||
Example | 1025 Budapest | Postal Code | 1025 |
City | Budapest | ||
Remarks |
City - State/Province - Postal Code (Global) | |||
---|---|---|---|
Description | The Parse definition for City - State/Province - Postal Code (Global) parses address "last line" data into a globally recognized set of tokens. | ||
Output Tokens | City, State/Province, Postal Code, Additional Info |
Input | Output | ||
Examples | 1025 Budapest | City | Budapest |
State/Province | |||
Postal Code | 1025 | ||
Additional Info | |||
Remarks |
Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Name | |||
---|---|---|---|
Description | The Parse definition for Name parses names of individuals. | ||
Output Tokens | Prefix, Marriage Name, Family Name, Given Name, Suffix, Title/Additional Info |
Input | Output | ||
Example 1 | dr. Kiss Józsefné | Prefix | dr. |
Marriage Name | |||
Family Name | Kiss | ||
Given Name | Józsefné | ||
Suffix | |||
Title/Additional Info | |||
Input | Output | ||
Example 2 | ifj. Kiss Lászlóné dr. Sipos Gabriella | Prefix | dr. |
Marriage Name | ifj. Kiss Lászlóné | ||
Family Name | Sipos | ||
Given Name | Gabriella | ||
Suffix | |||
Title/Additional Info | |||
Input | Output | ||
Example 3 | Dézsi István Zoltán (ifj.) | Prefix | ifj. |
Marriage Name | |||
Family Name | Dézsi | ||
Given Name | István Zoltán | ||
Suffix | |||
Title/Additional Info | |||
Remarks | When a woman's name is represented as her husband's name extended with the suffix -né, then the husband's family name and given name are parsed into the Family Name and Given Name tokens (see example 1). If a woman's name contains her maiden family name, then the woman's maiden name and given name are parsed into Family Name and Given Name, and the husband's name is parsed into the Marriage Name token (example 2). If a woman's name appears with the woman's maiden name and the husband's name, and the husband's name has a title or generational indicator, the husband's title or generational indicator is parsed into the Marriage Name token along with the rest of the husband's name (example 2). If the title appears between the husband's name and a woman's maiden name, the title is considered to apply to the woman, and is therefore parsed into the Prefix token (example 2). Generational indicator words appearing with a man's name are always parsed into the Prefix token (example 3). |
Name (Global) | |||
---|---|---|---|
Description | The Parse definition for Name (Global) parses names of individuals into a globally recognized set of tokens. | ||
Output Tokens | Prefix, Given Name, Middle Name, Family Name, Suffix, Title/Additional Info |
Input | Output | ||
Example 1 | dr. Kiss Józsefné | Prefix | dr. |
Given Name | Józsefné | ||
Middle Name | |||
Family Name | Kiss | ||
Suffix | |||
Title/Additional Info | |||
Input | Output | ||
Example 2 | ifj. Kiss Lászlóné dr. Sipos Gabriella | Prefix | dr. |
Given Name | Gabriella | ||
Middle Name | |||
Family Name | Sipos | ||
Suffix | |||
Title/Additional Info | ifj. Kiss Lászlóné | ||
Input | Output | ||
Example 3 | Dézsi István Zoltán (ifj.) | Prefix | |
Given Name | István Zoltán | ||
Middle Name | |||
Family Name | Dézsi | ||
Suffix | ifj. | ||
Title/Additional Info | |||
Remarks | When a woman's name is represented as her husband's name extended with the suffix -né, then the husband's family name and given name are parsed into the Family Name and Given Name tokens (see example 1). If a woman's name contains her maiden family name, then the woman's maiden name and given name are parsed into Family Name and Given Name, and the husband's name is parsed into the Title/Additional Info token (example 2). If a woman's name appears with the woman's maiden name and the husband's name, and the husband's name has a title or generational indicator, the husband's title or generational indicator is parsed into the Title/Additional Info token along with the rest of the husband's name (example 2). If the title appears between the husband's name and a woman's maiden name, the title is considered to apply to the woman, and is therefore parsed into the Prefix token (example 2). Generational indicator words appearing with a man's name are always parsed into the Suffix token (example 3). The Middle Name token is not used. Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
Name (Multiple Name) | |||
---|---|---|---|
Description | The Parse definition for Name (Multiple Name) parses strings that contain the names of one or two individuals. | ||
Output Tokens | Name 1, Name 2 |
Input | Output | ||
Example 1 | László Kővári / József Barsi | Name 1 | László Kővári |
Name 2 | József Barsi | ||
Input | Output | ||
Example 2 | Demjén Zsolt és Szabó Albert | Name 1 | Demjén Zsolt |
Name 2 | Szabó Albert | ||
Remarks |
Name (with Infix) | |||
---|---|---|---|
Description | The Parse definition for Name (with Infix) parses names of individuals, using an Infix token for embedded title words. | ||
Output Tokens | Prefix, Marriage Name, Infix, Family Name, Given Name, Suffix, Title/Additional Info |
Input | Output | ||
Example 1 | dr. Kiss Józsefné | Prefix | dr. |
Marriage Name | |||
Infix | |||
Family Name | Kiss | ||
Given Name | Józsefné | ||
Suffix | |||
Title/Additional Info | |||
Input | Output | ||
Example 2 | ifj. Kiss Lászlóné dr. Sipos Gabriella | Prefix | ifj. |
Marriage Name | Kiss Lászlóné | ||
Infix | dr. | ||
Family Name | Sipos | ||
Given Name | Gabriella | ||
Suffix | |||
Title/Additional Info | |||
Input | Output | ||
Example 3 | Dézsi István Zoltán (ifj.) | Prefix | ifj. |
Marriage Name | |||
Infix | |||
Family Name | Dézsi | ||
Given Name | István Zoltán | ||
Suffix | |||
Title/Additional Info | |||
Remarks | When a woman's name is represented as her husband's name extended with the suffix -né, then the husband's family name and given name are parsed into the Family Name and Given Name tokens (see example 1). If a woman's name contains her maiden family name, then the woman's maiden name and given name are parsed into Family Name and Given Name, and the husband's name is parsed into the Marriage Name token (example 2). If a name contains a title at the beginning of the name, the title is parsed into the Prefix token (examples 1 and 2). If the title appears between the husband's name and a woman's maiden name, the title is parsed into the Infix token (example 2). Generational indicator words such as ifj. are always parsed into the Prefix token (example 3). |
Organization | |||
---|---|---|---|
Description | The Parse definition for Organization parses strings that contain organization names. | ||
Output Tokens | Name, Legal Form, Additional Info, Site, Description |
Input | Output | ||
Example | ECODEV Gazdasági Fejlesztő és Tanácsadó Kft. | Name | ECODEV |
Legal Form | Kft | ||
Additional Info | |||
Site | |||
Description | Gazdasági Fejlesztő és Tanácsadó | ||
Remarks | The Site token is currently unused. It is reserved for future development. |
Phone | |||
---|---|---|---|
Description | The Parse definition for Phone parses Hungarian phone numbers. | ||
Output Tokens | Prefix, Country Code, Area Code, Base Number, Extension |
Input | Output | ||
Example 1 | 06-85-560-020/123 | Prefix | |
Country Code | |||
Area Code | 85 | ||
Base Number | 560020 | ||
Extension | 123 | ||
Input | Output | ||
Example 2 | fax: 82/553-113 | Prefix | fax |
Country Code | |||
Area Code | 82 | ||
Base Number | 553113 | ||
Extension | |||
Remarks |
Phone (Global) | |||
---|---|---|---|
Description | The Parse definition for Phone (Global) parses phone numbers into a globally recognized set of tokens. | ||
Output Tokens | Country Code, Area Code, Base Number, Extension, Line Type, Additional Info |
Input | Output | ||
Example 1 | 06-85-560-020/123 | Country Code | |
Area Code | 85 | ||
Base Number | 560020 | ||
Extension | 123 | ||
Line Type | |||
Additional Info | |||
Input | Output | ||
Example 2 | fax: 82/553-113 | Country Code | |
Area Code | 82 | ||
Base Number | 553113 | ||
Extension | |||
Line Type | fax | ||
Additional Info | |||
Remarks |
Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
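The remark above says that results from Global parse definitions can be stored in the same database fields across locales. A minimal sketch of such a shared record follows, assuming a hypothetical `PhoneGlobal` dataclass whose fields mirror the six Phone (Global) output tokens. The values are the two documented examples entered by hand; nothing here calls the QKB.

```python
from dataclasses import dataclass, asdict


@dataclass
class PhoneGlobal:
    """Hypothetical row layout: one field per Phone (Global) output token."""
    country_code: str = ""
    area_code: str = ""
    base_number: str = ""
    extension: str = ""
    line_type: str = ""
    additional_info: str = ""


# Token values copied by hand from Example 1 and Example 2.
rows = [
    PhoneGlobal(area_code="85", base_number="560020", extension="123"),
    PhoneGlobal(area_code="82", base_number="553113", line_type="fax"),
]

# Every row exposes the same column names, whatever locale produced it.
columns = list(asdict(rows[0]).keys())
```

Tokens that the parse leaves empty are kept as empty strings, so rows from any locale fit the same six columns.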
Hungarian Word Analysis | ||
---|---|---|
Description | The Pattern Analysis definition for Hungarian Word Analysis determines the pattern of words in the input string. | |
Examples | Input | Output |
Árvíztűrő tükörfúrógép | A A | |
1941-ben ismertem meg Dániel | 9*A A A A | |
MB3 Kft. | M A* | |
Grätzer György | X A | |
Remarks | Hungarian Word Analysis is similar to Word Analysis, but it marks any word that contains non-Hungarian letters (for example, German, Slovak, or Polish special characters) with an X. |
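The pattern symbols in the examples above can be approximated with a short sketch. The symbol meanings used here (9 for digits, A for a word made only of Hungarian letters, X for a word containing a non-Hungarian letter, M for mixed letters and digits, * for punctuation) are inferred from the documented examples, and `toy_word_pattern` is a hypothetical helper, not the QKB's actual rule set.

```python
import re
import unicodedata

# Lowercase letters of the Hungarian alphabet (single characters only).
HUNGARIAN_LETTERS = set("aábcdeéfghiíjklmnoóöőpqrstuúüűvwxyz")


def toy_word_pattern(text: str) -> str:
    """Approximate the documented patterns; not the QKB's actual rules.

    9 = digits, A = Hungarian letters only, X = letters including a
    non-Hungarian one, M = letters and digits mixed, * = punctuation.
    """
    text = unicodedata.normalize("NFC", text)
    words = []
    for chunk in text.split():
        symbols = []
        # Split each chunk into alphanumeric runs and single punctuation marks.
        for piece in re.findall(r"[^\W_]+|[^\w\s]|_", chunk):
            if piece.isdigit():
                symbols.append("9")
            elif piece.isalpha():
                hungarian = all(ch.lower() in HUNGARIAN_LETTERS for ch in piece)
                symbols.append("A" if hungarian else "X")
            elif piece.isalnum():
                symbols.append("M")
            else:
                symbols.append("*")
        words.append("".join(symbols))
    return " ".join(words)


patterns = [
    toy_word_pattern("1941-ben ismertem meg Dániel"),
    toy_word_pattern("MB3 Kft."),
    toy_word_pattern("Grätzer György"),
]
```

The sketch treats the letter ä in Grätzer as non-Hungarian, which is why that word is marked X while György is marked A.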
Address | ||
---|---|---|
Description | The Standardization definition for Address standardizes address data. | |
Examples | Input | Output |
KOSSUTH TÉR 11 | Kossuth Lajos tér 11. | |
Széchenyi u. 1/A | Széchenyi István utca 1/a. | |
pf 89 | Pf.: 89 | |
Remarks |
Address (Full) | ||
---|---|---|
Description | The Standardization definition for Address (Full) standardizes complete two-line address data. | |
Example | Input | Output |
1025 Bp Cserje u. 18 | 1025 Budapest, Cserje utca 18. | |
Remarks |
City | ||
---|---|---|
Description | The Standardization definition for City standardizes city names. | |
Examples | Input | Output |
Bp | Budapest | |
Szfvar | Székesfehérvár | |
SZEGED-TAPE | Szeged-Tápé | |
Remarks |
City - State/Province - Postal Code | ||
---|---|---|
Description | The Standardization definition for City - State/Province - Postal Code standardizes city name and postal code combinations. | |
Examples | Input | Output |
1245 bp | 1245 Budapest | |
3234GYOR | 3234 Győr | |
Remarks |
Name | ||
---|---|---|
Description | The Standardization definition for Name standardizes names of individuals. | |
Examples | Input | Output |
DR. KOVÁCS P JÓZSEF | dr Kovács P József | |
Bánáti János (Ifj.) | ifj Bánáti János | |
Remarks |
Organization | ||
---|---|---|
Description | The Standardization definition for Organization standardizes organization names. | |
Examples | Input | Output |
ARANYALMA 2000 BETÉTI TÁRS | ARANYALMA 2000 Bt. | |
Zsivány Ker És Szolg KFT | Zsivány Kereskedelmi És Szolgáltató Kft. | |
Remarks |
Phone | ||
---|---|---|
Description | The Standardization definition for Phone standardizes Hungarian phone numbers. | |
Examples | Input | Output |
06-85-560-020/123 | 85-560020/123 | |
fax: 82/553-113 | Fax:82-553113 | |
Remarks |
Postal Code | ||
---|---|---|
Description | The Standardization definition for Postal Code standardizes postal codes. | |
Example | Input | Output |
1025Bp | 1025 | |
Remarks |
In addition to the definitions listed on this page, the Hungarian, Hungary locale also inherits all definitions for the Hungarian language and all Global definitions.
Doc ID: QKBCI_HUHUN_defs.html