
SAS Quality Knowledge Base for Contact Information 26

Russian, Russia Definitions

Definitions for the Russian, Russia locale are described below.

Case Definitions
Extraction Definitions
Gender Analysis Definitions

Identification Analysis Definitions

Match Definitions

Parse Definitions

Pattern Analysis Definitions

Standardization Definitions

Inherited Definitions

Case Definitions

Proper (Name)
Description The Proper (Name) case definition propercases names of individuals.
Examples Input Output
Александр Бородин Александр Бородин
николай григорьевич рубинштейн Николай Григорьевич Рубинштейн
РИМСКИЙ-КОРСАКОВ НИКОЛАЙ Римский-Корсаков Николай
Remarks  
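As a rough illustration of the transformation shown in the examples (not the QKB's own logic, which is driven by locale-specific rules), Python's built-in `str.title()` reproduces the three outputs above, including the hyphenated family name. The helper name `proper_name` is hypothetical.

```python
def proper_name(text: str) -> str:
    """Approximate proper casing for a personal name.

    str.title() upper-cases the first letter that follows any non-letter
    (space, hyphen) and lower-cases the rest. It is Unicode-aware, so it
    handles Cyrillic input.
    """
    return text.title()


for raw in ("Александр Бородин",
            "николай григорьевич рубинштейн",
            "РИМСКИЙ-КОРСАКОВ НИКОЛАЙ"):
    print(proper_name(raw))
```

This sketch would mis-case names that contain apostrophes or lowercase particles; a real case definition handles such exceptions through its vocabularies.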

Extraction Definitions

None.

Gender Analysis Definitions

Name
Description The Name gender analysis definition determines the gender of a name.
Possible Outputs M, F, U
Examples Input Output
Пафнутий Львович Чебышев M
Софья Ковалевская F
Арнольд В.И. U
Remarks  

Identification Analysis Definitions

City
Description The City identification analysis definition determines whether a string represents a Russian city.
Possible Outputs CITY, UNK
Examples Input Output
Москва CITY
Пушкин CITY
Вентилятор UNK
Remarks  

 

Individual/Organization
Description The Individual/Organization identification analysis definition determines whether a string represents the name of an individual or an organization.
Possible Outputs Organization, Individual, Unknown
Examples Input Output
Федор Достоевский Individual
ОАО У Швейка Organization
ООО Промкнигторг Organization
Remarks  

Match Definitions

Address
Description The Address match definition generates match codes which can be used to cluster records containing addresses.
Max Length of Match Code 27 characters
Examples Input Cluster ID
Verkhniy Tagansky Tupik, 4 0
Verhniy Taganskiy tup., 4 0
Тимура Фрунзе, д. 12, кв. 34 1
Тимура Фрунзе, 12, 34 1
ул. Новорязанская.,31/7, 3-й этаж, building 2 2
Remarks

Note: The results listed above reflect the default match sensitivity (85).
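The Cluster ID column in the match definition examples can be read as follows: records whose match codes are identical fall into the same cluster. A minimal sketch of that grouping step is shown here. It assumes the match codes have already been generated; the code strings are made-up placeholders, not real QKB output, and the function name `assign_cluster_ids` is hypothetical.

```python
def assign_cluster_ids(match_codes, first_id=0):
    """Give each distinct match code an ID in order of first appearance."""
    ids = {}
    result = []
    for code in match_codes:
        if code not in ids:
            ids[code] = first_id + len(ids)
        result.append(ids[code])
    return result


# Placeholder codes standing in for the five Address inputs above:
# the first two records share a code, the next two share another,
# and the last record has a code of its own.
codes = ["CODE_A", "CODE_A", "CODE_B", "CODE_B", "CODE_C"]
print(assign_cluster_ids(codes))  # [0, 0, 1, 1, 2]
```

Lowering the match sensitivity generally makes more inputs share a code, so clusters grow; raising it has the opposite effect.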

 

Address (Full)
Description The Address (Full) match definition generates match codes which can be used to cluster records containing complete two-line addresses.
Max Length of Match Code 27 characters
Examples Input Cluster ID
119146 Москва, Комсомольский пр-т, дом 14/2, кв. 58 0
119200 Москва, Комсомольский пр-т, дом 14/4, кв. 70 1
123100, Москва, улица 1905 года, 1, кв 15 2
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

Address (PO Box Only)
Description The Address (PO Box Only) match definition generates match codes which can be used to cluster records containing the PO Box portion of an address.
Max Length of Match Code 15 characters
Examples Input Cluster ID
ул. Новая Басманная, д. 2, а/я 12 0
ул. Тверская 5, а/я 12 0
  Remarks

Note: The results listed above reflect the default match sensitivity (85).

The Address (PO Box Only) match definition ignores street name information.

 

Address (Street Only)
Description The Address (Street Only) match definition generates match codes which can be used to cluster records containing the street portion of an address.
Max Length of Match Code 23 characters
Examples Input Cluster ID
ул. Новая Басманная, д. 2, а/я 12 1
ул. Новая Басманная, д. 2, а/я 17 1
ул. Новая Басманная, д. 2 1
  Remarks

Note: The results listed above reflect the default match sensitivity (85).

The Address (Street Only) match definition ignores PO Box information.

 

City
Description The City match definition generates match codes which can be used to cluster records containing city names.
Max Length of Match Code 15 characters
Examples Input Cluster ID
город Дубна 1
г. Дубна 1
Новосибирск 2
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

City - State/Province - Postal Code
Description The City - State/Province - Postal Code match definition generates match codes which can be used to cluster records containing last line address information.
Max Length of Match Code 32 characters
Examples Input Cluster ID
ТЮМЕНСКАЯ ОБЛ,ЯМАЛО-НЕНЕЦКИЙ АВТ ОКРУГ Г ПЫТЬ-ЯХ 1
ТЮМЕНСКАЯ ОБЛ,ХАНТЫ-МАНСИЙСКИЙ АВТ ОКРУГ Г ПЫТЬ-ЯХ 1
121069 RUSSIA MOSCOW UL B MOLCHANOVKA, 36 2
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

Name
Description The Name match definition generates match codes which can be used to cluster records containing names of individuals.
Max Length of Match Code 24 characters
Examples Input Cluster ID
Agafonova Anna 1
Агафонова Анна 1
Aleksandr Bukatin 2
Саша Букатин 2
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

Organization
Description The Organization match definition generates match codes which can be used to cluster records containing organization names.
Max Length of Match Code 60 characters
Examples Input Cluster ID
ООО "У пескаря" 0
ПБОЮЛ Сидорова А.Н. 1
Банк Жилищного Финансирования 2
БАНК ЖИЛИЩНОГО ФИНАНСИРОВАНИЯ 2
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

Passport
Description The Passport match definition generates match codes which can be used to cluster records containing passport numbers.
Max Length of Match Code 15 characters
Examples Input Cluster ID
Паспорт 65 00 523259 1
серия 6500 номер 523259 1
98 56 659821 2
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

Phone
Description The Phone match definition generates match codes which can be used to cluster records containing phone numbers.
Max Length of Match Code 15 characters
Examples Input Cluster ID
32-65-99 1
8(8352)32-65-99 1
+79093129896 2
Remarks

Note: The results listed above reflect the default match sensitivity (85).

 

Postal Code
Description The Postal Code match definition generates match codes which can be used to cluster records containing postal codes.
Max Length of Match Code 15 characters
Examples Input Cluster ID
123100 1
123104 1
607750 2
Remarks

Note: The results listed above reflect the default match sensitivity (85).

Parse Definitions

Address
Description The Address parse definition parses addresses into a set of tokens.
Output Tokens Street
House
Building
Flat
PO Box
Additional Info
Example 1 Input Output Token Output
ул. Профсоюзная, дом 45, этаж 8 кв. 34 а/я 34 Street ул. Профсоюзная
House дом 45
Building  
Flat кв 34
PO Box а/я 34
Additional Info этаж 8
Example 2 Input Output Token Output
пр-т Воробьёвых, 18 корпус 4, 9 Street пр-т Воробьёвых
House 18
Building 4
Flat 9
PO Box  
Additional Info  
Remarks  

 

Address (Full)
Description The Address (Full) parse definition parses addresses containing complete two-line addresses into a set of tokens.
Output Tokens Postal Code
Country
Region
Province
City
Street
House
Building
Flat
PO Box
Additional Info
Example 1 Input Output Token Output
185547 Тверская обл. г. Торопец ул. Лесная дом 4, офис 12 Postal Code 185547
Country  
Region Тверская обл.
Province  
City г. Торопец
Street ул. Лесная
House дом 4
Building  
Flat офис 12
PO Box  
Additional Info  
Example 2 Input Output Token Output
г. Волгоград ул. Московская дом 48 корпус 3, кв. 14 этаж 2, а/я 65975 Postal Code  
Country  
Region  
Province  
City г. Волгоград
Street ул. Московская
House дом 48
Building 3
Flat кв. 14
PO Box а/я 65975
Additional Info этаж 2
Remarks  

 

Address (Global)
Description The Address (Global) parse definition parses addresses into a globally recognized set of tokens.

Output Tokens Recipient
Building/Site
Street
Extension
PO Box
Additional Info
Example 1 Input Output Token Output
Пр-т Юных Ленинцев, 78, кв 23, этаж 3 Recipient  
Building/Site  
Street Пр-т Юных Ленинцев, 78
Extension кв 23, этаж 3
PO Box  
Additional Info  
Example 2 Input Output Token Output
ул. Профсоюзная, дом 45, этаж 8 кв. 34 а/я 34 Recipient  
Building/Site  
Street ул. Профсоюзная, дом 45
Extension этаж 8 кв. 34
PO Box а/я 34
Additional Info  
Example 3 Input Output Token Output
пр-т Воробьёвых, 18 корпус 4, 9 Recipient  
Building/Site  
Street пр-т Воробьёвых, 18
Extension корпус 4, 9
PO Box  
Additional Info  
Remarks Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.
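Because the Global token set is fixed, parsed results can be laid out as rows with the same columns regardless of locale. The sketch below illustrates that idea only; it assumes the parse output is already available as a mapping from token name to value, and the names `GLOBAL_ADDRESS_TOKENS` and `to_row` are hypothetical.

```python
# Token names copied from the Address (Global) definition above.
GLOBAL_ADDRESS_TOKENS = (
    "Recipient", "Building/Site", "Street",
    "Extension", "PO Box", "Additional Info",
)


def to_row(parsed: dict) -> tuple:
    """Order parsed tokens into a fixed-width row; absent tokens become ''."""
    unknown = set(parsed) - set(GLOBAL_ADDRESS_TOKENS)
    if unknown:
        raise KeyError(f"unexpected tokens: {sorted(unknown)}")
    return tuple(parsed.get(name, "") for name in GLOBAL_ADDRESS_TOKENS)


# Example 1 from this section, keyed by token name (empty tokens omitted).
example_1 = {"Street": "Пр-т Юных Ленинцев, 78", "Extension": "кв 23, этаж 3"}
print(to_row(example_1))
```

Rows built this way from any locale's Address (Global) results share one column order, which is what allows them to be stored in the same database fields.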

 

City - State/Province - Postal Code
Description The City - State/Province - Postal Code parse definition parses last line address information into a set of tokens.
Output Tokens Postal Code
Country
Region
Province
City
Example Input Output Token Output
258488, Новосибирская область, г. Бердск Postal Code 258488
Country  
Region Новосибирская область
Province  
City г. Бердск
Remarks  

 

City - State/Province - Postal Code (Global)
Description The City - State/Province - Postal Code (Global) parse definition parses last line address information into a globally recognized set of tokens.
Output Tokens City
State/Province
Postal Code
Additional Info
Example Input Output Token Output
258488, Новосибирская область, г. Бердск City г. Бердск
State/Province Новосибирская область
Postal Code 258488
Additional Info  
Remarks Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

 

Name
Description The Name parse definition parses names of individuals into a set of tokens.
Output Tokens Prefix
Given Name
Patronym
Family Name
Additional Info
Example 1 Input Output Token Output
ФОКИН АНДРЕЙ ВЛАДИМИРОВИЧ Prefix  
Given Name АНДРЕЙ
Patronym ВЛАДИМИРОВИЧ
Family Name ФОКИН
Additional Info  
Example 2 Input Output Token Output
Профессор Ф.Ф. Преображенский Prefix  
Given Name Ф
Patronym Ф
Family Name Преображенский
Additional Info Профессор
Example 3 Input Output Token Output
Доцент г-н Дальф Ёжиков Prefix г-н
Given Name Дальф
Patronym  
Family Name Ёжиков
Additional Info Доцент
Remarks  

 

Name (Global)
Description The Name (Global) parse definition parses names of individuals into a globally recognized set of tokens.
Output Tokens Prefix
Given Name
Middle Name
Family Name
Suffix
Title/Additional Info
Example 1 Input Output Token Output
Доктор технических наук госпожа Синичкина Prefix госпожа
Given Name  
Middle Name  
Family Name Синичкина
Suffix  
Title/Additional Info Доктор технических наук
Example 2 Input Output Token Output
Эйнштейн Альберт Германович, член РАН Prefix  
Given Name Альберт
Middle Name Германович
Family Name Эйнштейн
Suffix  
Title/Additional Info член РАН
Remarks Patronymic names are parsed into the Middle Name token.

Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

 

Passport
Description The Passport parse definition parses passport number data into a set of tokens.

Output Tokens Series
Number
Example Input Output Token Output
паспорт №345678 серия 46 00 Series 46 00
Number 345678
Remarks  

 

Phone
Description The Phone parse definition parses phone numbers into a set of tokens.
Output Tokens Prefix
Long Distance Code
Country Code
Area Code
Base Number
Extension
Suffix
Example 1 Input Output Token Output
+7499568-58-96 Prefix  
Long Distance Code  
Country Code +7
Area Code 499
Base Number 568-58-96
Extension  
Suffix  
Example 2 Input Output Token Output
8 495 1235485 доб. 1285 Prefix  
Long Distance Code 8
Country Code  
Area Code 495
Base Number 1235485
Extension  
Suffix доб. 1285
Remarks  

 

Phone (Global)
Description The Phone (Global) parse definition parses phone numbers into a globally recognized set of tokens.
Output Tokens Country Code
Area Code
Base Number
Extension
Line Type
Additional Info
Example 1 Input Output Token Output
+7 (812) 3198565 доб. 11 Country Code +7
Area Code 812
Base Number 3198565
Extension  
Line Type  
Additional Info доб. 11
Example 2 Input Output Token Output
fax +7 (812) 3198565 Country Code +7
Area Code 812
Base Number 3198565
Extension  
Line Type  
Additional Info fax
Remarks Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

Pattern Analysis Definitions

None.

Standardization Definitions

Address
Description The Address standardization definition standardizes addresses.
Examples Input Output
ул. Карла Маркса, дом 13 стр.2, кв. 15 этаж 4 а/я 14 ул. Карла Маркса, дом 13, корп 2, кв. 15, а/я 14, этаж 4
ул. Тверская дом 5 подъезд 4 этаж 1 кв 45 ул. Тверская, дом 5, кв 45, под 4
Remarks  

 

Address (Full)
Description The Address (Full) standardization definition standardizes complete two-line addresses.
Example Input Output
123321, г. норильск, ул. Теплая, 19, корп.2, кв. 14, а/я 125521 123321, г. Норильск, ул. Теплая, д 19, корп 2, Кв 14, а/я 125521
Remarks  

 

City
Description The City standardization definition standardizes city names.
Examples Input Output
ТамБов Тамбов
город Дубна г Дубна
Remarks  

 

City - State/Province - Postal Code
Description The City - State/Province - Postal Code standardization definition standardizes last line address information.
Examples Input Output
958859 Московская область город Серпухов 958859, Московская обл, г Серпухов
198367, КАЛИНИНГРАД 198367, Калининград
Remarks  

 

Name
Description The Name standardization definition standardizes names of individuals.
Examples Input Output
Доктор господин Сидоров А.Р. г-н Сидоров А.Р. доктор
Анатолий Сергеевич ЛОГИНОВ Логинов Анатолий Сергеевич
Remarks  

 

Organization
Description The Organization standardization definition standardizes organization names.
Example Input Output
ООО «ПАРУС» «ПАРУС», ООО
Remarks  

 

Passport
Description The Passport standardization definition standardizes passport numbers.
Example Input Output
95 06 359 569 95-06 359569
Remarks  
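The output format in the example above (two series pairs joined by a hyphen, then a six-digit number) can be approximated as follows. This is a sketch under the assumption that the input holds exactly ten digits, four for the series and six for the number; the function name `standardize_passport` is hypothetical and the code does not reproduce the QKB's actual rules.

```python
import re


def standardize_passport(text: str) -> str:
    """Format a 10-digit passport value as 'SS-SS NNNNNN'."""
    digits = re.sub(r"\D", "", text)  # drop spaces, labels, punctuation
    if len(digits) != 10:
        raise ValueError("expected 4 series digits and 6 number digits")
    return f"{digits[:2]}-{digits[2:4]} {digits[4:]}"


print(standardize_passport("95 06 359 569"))  # 95-06 359569
```

The sketch takes digits in the order they appear, so it would not handle inputs where the number precedes the series.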

 

Phone
Description The Phone standardization definition standardizes phone numbers for domestic use.
Examples Input Output
+7 495 485 95 96 +7 (495) 485-95-96
18568 1-85-68
Remarks  
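In both examples above the subscriber number ends with two hyphen-separated pairs of digits, with the remaining leading digits kept together. A minimal sketch of that grouping alone is shown below; it ignores country and area codes, assumes at least five digits, and the function name `group_local_number` is hypothetical rather than part of the QKB.

```python
import re


def group_local_number(text: str) -> str:
    """Group a local number as '<head>-DD-DD', pairing digits from the right."""
    digits = re.sub(r"\D", "", text)  # keep digits only
    if len(digits) < 5:
        raise ValueError("expected at least five digits")
    return "-".join((digits[:-4], digits[-4:-2], digits[-2:]))


print(group_local_number("18568"))      # 1-85-68
print(group_local_number("485 95 96"))  # 485-95-96
```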

 

Phone (Replace Obsolete Area Codes)
Description The Phone (Replace Obsolete Area Codes) standardization definition standardizes phone numbers, replacing old area codes with new area codes.

Example Input Output
(08242) 40537 (08242) 4-05-37
Remarks  

 

Postal Code
Description The Postal Code standardization definition standardizes postal codes.
Example Input Output
«123100» 123100
Remarks  

Inherited Definitions

In addition to the definitions listed on this page, the Russian, Russia locale also inherits all definitions for the Russian language and all Global definitions.