SAS Quality Knowledge Base for Contact Information 25

Japanese, Japan Definitions

Definitions for the Japanese, Japan locale are described below.

Case Definitions
Gender Analysis Definitions
Identification Analysis Definitions
Match Definitions
Parse Definitions
Pattern Analysis Definitions
Standardization Definitions
Inherited Definitions

Case Definitions

Upper (Address)
Description The Case definition for Upper (Address) uppercases Latin characters found in address data.
  Input Output
Examples 1-13-1 イヌイビル・カチドキ 8f 501 1-13-1 イヌイビル・カチドキ 8F 501
3丁目53-2Biz原宿2F 3丁目53-2BIZ原宿2F
2丁目jrタワーオフィス15階札幌駅総合開発内 2丁目JRタワーオフィス15階札幌駅総合開発内
Remarks  
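The behavior shown in the examples above can be approximated with a short sketch. This is an illustration only, not the QKB implementation; the function name upper_latin is hypothetical, and the sketch assumes that only ASCII and full-width Latin letters need to change while Japanese characters pass through untouched.

```python
import re


def upper_latin(text: str) -> str:
    """Uppercase ASCII and full-width Latin letters; leave all other characters as-is."""
    # Hypothetical helper: Kanji and kana have no case, so only Latin runs are rewritten.
    return re.sub(r"[a-zａ-ｚ]+", lambda m: m.group(0).upper(), text)


print(upper_latin("3丁目53-2Biz原宿2F"))  # 3丁目53-2BIZ原宿2F
```

A regular expression is used here rather than calling upper() on the whole string so that the set of affected characters is explicit.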

 

Upper (Organization)
Description The Case definition for Upper (Organization) uppercases Latin characters found in organization names. Well-known words are propercased where appropriate.
  Input Output
Examples Aigスター生命保険株式会社 AIGスター生命保険株式会社
cfg株式会社 CFG株式会社
Necエレクトロニクス株式会社 NECエレクトロニクス株式会社
dataflux, a sas company DataFlux, A SAS Company
Remarks Certain well-known company names are propercased.

Gender Analysis Definitions

Name
Description The Gender Analysis definition for Name makes a best guess at the genders of names.
Possible Outputs M
F
U
  Input Output
Examples 鈴木一郎 M
山田花子 F
マークスミス U
Remarks Non-Japanese names are not evaluated; they will produce output 'U'.

Identification Analysis Definitions

Individual/Organization
Description The Identification Analysis definition for Individual/Organization determines whether a string represents the name of an individual or an organization.
Possible Outputs INDIVIDUAL
ORGANIZATION
UNKNOWN
  Input Output
Examples 株式会社ソニー ORGANIZATION
田中健二 INDIVIDUAL
田中健二商店 ORGANIZATION
リチャードパシフィック UNKNOWN
Remarks  

Match Definitions

Address
Description The Address match definition generates match codes which can be used to cluster records containing addresses.
Max Length of Match Code 108 characters
  Input Cluster ID
Example 1

Sensitivities
90-100
1丁目14の9タナカ大宮桜木ビル5F502(総務部内) 1
1丁目14の9タナカ大宮桜木町ビル5F502(開発部内) 2
1-2-3 新宿郵便局 私書箱第456号 3
1-2-3 郵便事業株式会社 新宿支店 私書箱第456号 4
Remarks All components of the address are evaluated. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 2

Sensitivities
80-89
1丁目14の9タナカ大宮桜木ビル5F502(総務部内) 1
1丁目14の9タナカ大宮桜木町ビル5F502(開発部内) 1
一丁目四-二二大山生命宇都宮南ビル16階1623号室 2
一丁目四-二二大山生命宇都宮北ビル16階1635号室 3
1-2-3 新宿郵便局 私書箱第456号 4
1-2-3 郵便事業株式会社 新宿支店 私書箱第456号 4
1-2-3 新宿東郵便局 私書箱第456号 5
1-2-3 新宿西郵便局 私書箱第456号 5
Remarks Block, building, floor, room, and PO Box info are evaluated. Different forms of some words will match.
  Input Cluster ID
Example 3

Sensitivities
70-79
一丁目四-二二大山生命宇都宮南ビル16階1623号室 1
一丁目四-二二大山生命宇都宮北ビル16階1635号室 1
2丁目6の3五ツ橋ビル3階302 2
2丁目6の3五ツ橋ビル5階502 3
1-2-3 新宿東郵便局 私書箱第456号 4
1-2-3 新宿西郵便局 私書箱第456号 4
Remarks Block, building, floor, room, and PO Box info are evaluated. Different forms of some words will match. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 4

Sensitivities
60-69
2丁目6の3五ツ橋ビル3階302 1
2丁目6の3五ツ橋ビル5階502 1
6丁目27-18ABC横浜別館 2
6丁目27-19ABD会館 3
1-2-3 新宿東郵便局 私書箱第456号 4
1-2-3 新宿西郵便局 私書箱第456号 4
Remarks Block, building, and PO Box info are evaluated. Different forms of some words will match. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 5

Sensitivities
50-59
6丁目27-18ABC横浜別館 1
6丁目27-19ABD会館 1
1丁目3-4 2
1丁目3-5 3
Remarks Block and PO Box info are evaluated. Different forms of some words will match. Note that fewer characters in the address are considered as the sensitivity is lowered.

 

Address (Full)
Description The Address (Full) match definition generates match codes which can be used to cluster records containing complete two-line addresses.
Max Length of Match Code 195 characters
  Input Cluster ID
Example 1

Sensitivities
90-100
104-0054 東京都中央区勝どき1-13-1 イヌイビルカチドキ8F 801号室 (開発部内) 1
104ー0054 とうきょうと中央区勝どき1-13-1 イヌイビル勝どき8階 801号室 2
123-4567 渋谷新宿西郵便局 私書箱第456号 3
123-4567 郵便事業株式会社 渋谷新宿東支店 私書箱第456号 4
Remarks All components of the address are evaluated. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 2

Sensitivities
80-89
104-0054 東京都中央区勝どき1-13-1 イヌイビルカチドキ8F 801号室 (開発部内) 1
104ー0054 とうきょうと中央区勝どき1-13-1 イヌイビル勝どき8階 801号室 1
123-4567 渋谷新宿西郵便局 私書箱第456号 2
123-4567 郵便事業株式会社 渋谷新宿東支店 私書箱第456号 2
104-0052 東京都中央区月島1-13-1 ファミリータワー8F 801号室 3
104-0052 中央区月島1-13-1 8F 801号室 4
123-4567 新宿西郵便局 私書箱第456号 5
123-4567 新宿東郵便局 私書箱第456号 5
Remarks All components of the address are evaluated. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 3

Sensitivities
70-79
104-0052 東京都中央区月島1-13-1 ファミリータワー8F 801号室 1
104-0052 中央区月島1-13-1 8F 801号室 1
104-0053 中央区月島1-13-1 8F 801号室 2
123-4567 新宿西郵便局 私書箱第456号 3
123-4567 新宿東郵便局 私書箱第456号 3
123-8901 渋谷郵便局 私書箱第456号 4
Remarks Prefecture and building name information are ignored. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 4

Sensitivities
60-69
104-0052 中央区月島1-13-1 8F 801号室 1
104-0053 中央区月島1-13-1 8F 801号室 1
123-4567 新宿東郵便局 私書箱第456号 2
123-8901 渋谷郵便局 私書箱第456号 2
460-1234 名古屋市緑区1-13-1 8F 801号室 3
460-5678 名古屋市中区1-13-2 9F 901号室 4
Remarks Prefecture, building name, and PO Box information are ignored. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 5

Sensitivities
50-59
460-1234 名古屋市緑区1-13-1 8F 801号室 1
460-5678 名古屋市中区1-13-2 9F 901号室 1
Remarks Only postal code, primary city, block, and PO Box numbers are evaluated. Note that fewer characters in the address are considered as the sensitivity is lowered.

 

Address (PO Box Only)
Description The Address (PO Box Only) match definition generates match codes which can be used to cluster records containing the PO Box portion of an address.
Max Length of Match Code 40 characters
  Input Cluster ID
Example 1

Sensitivities
90-100
郵便事業株式会社 新東京支店 私書箱123号 1
郵便事業株式会社 新東京支店 私書箱123号 1
新東京郵便局 私書箱123号 2
Remarks Full-width and half-width expressions match. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 2

Sensitivities
70-89
郵便事業株式会社 新東京支店 私書箱123号 1
新東京郵便局 私書箱123号 1
横浜北郵便局 私書箱1234号 2
横浜南郵便局 私書箱1234号 3
Remarks Different representations of the same post office match. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 3

Sensitivities
50-69
横浜北郵便局 私書箱1234号 1
横浜南郵便局 私書箱1234号 1
Remarks Only PO Box numbers are evaluated. Post office names are ignored. Note that fewer characters in the address are considered as the sensitivity is lowered.
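As a rough illustration of the lowest sensitivity band, where only the PO Box number is evaluated, the sketch below pulls the number out of a PO Box string. It is not the QKB's algorithm: the function name po_box_number and the regular expression are assumptions, and Unicode NFKC normalization is used as a stand-in for the full-width/half-width matching described in Example 1.

```python
import re
import unicodedata

# Assumed pattern: "私書箱", an optional "第", the number, and an optional "号".
PO_BOX = re.compile(r"私書箱\s*第?\s*(\d+)\s*号?")


def po_box_number(text: str):
    """Return the PO Box number as ASCII digits, or None if no PO Box is found."""
    # NFKC folds full-width digits and spaces into their ASCII equivalents.
    normalized = unicodedata.normalize("NFKC", text)
    match = PO_BOX.search(normalized)
    return match.group(1) if match else None


# The post office name is ignored, so both strings yield the same number.
print(po_box_number("横浜北郵便局 私書箱1234号"))  # 1234
print(po_box_number("横浜南郵便局 私書箱1234号"))  # 1234
```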

 

Address (Street Only)
Description The Address (Street Only) match definition generates match codes which can be used to cluster records containing the street portion of an address.
Max Length of Match Code 68 characters
  Input Cluster ID
Example 1

Sensitivities
90-100
1丁目14の9タナカ大宮桜木ビル5F502(総務部内) 1
1丁目14の9タナカ大宮桜木町ビル5F502(開発部内) 2
Remarks All components of the address are evaluated except for PO Box info. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 2

Sensitivities
80-89
1丁目14の9タナカ大宮桜木ビル5F502(総務部内) 1
1丁目14の9タナカ大宮桜木町ビル5F502(開発部内) 1
一丁目四-二二大山生命宇都宮南ビル16階1623号室 2
一丁目四-二二大山生命宇都宮北ビル16階1635号室 3
Remarks Block, building, floor, and room information are evaluated. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 3

Sensitivities
70-79
一丁目四-二二大山生命宇都宮南ビル16階1623号室 1
一丁目四-二二大山生命宇都宮北ビル16階1635号室 1
2丁目6の3五ツ橋ビル3階302 2
2丁目6の3五ツ橋ビル5階502 3
Remarks Block, building, floor, and room information are evaluated. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 4

Sensitivities
60-69
2丁目6の3五ツ橋ビル3階302 1
2丁目6の3五ツ橋ビル5階502 1
6丁目27-18ABC横浜別館 2
6丁目27-19ABD会館 3
Remarks Only block and building information are evaluated. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 5

Sensitivities
50-59
6丁目27-18ABC横浜別館 1
6丁目27-19ABD会館 1
1丁目3-4 2
1丁目3-5 3
Remarks Only block information is evaluated. Note that fewer characters in the address are considered as the sensitivity is lowered.

 

City
Description The City match definition generates match codes which can be used to cluster records containing city names.
Max Length of Match Code 80 characters
  Input Cluster ID
Example 1

Sensitivities
90-100
なごやしちくさくいまいけみなみ 1
ナゴヤシチクサクイマイケミナミ 1
名古屋市千種区今池南 1
札幌市白石区菊水元町八条 2
札幌市白石区菊水元町九条 3
Remarks City and town names are evaluated. Kanji and kana are matched.
  Input Cluster ID
Example 2

Sensitivities
85-89
札幌市白石区菊水元町八条 1
札幌市白石区菊水元町九条 1
札幌市豊平区美園三条 2
札幌市豊平区美園四条 3
Remarks City and town names are evaluated. Kanji and kana are matched. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 3

Sensitivities
75-84
札幌市白石区菊水元町八条 1
札幌市豊平区美園三条 2
札幌市豊平区美園四条 2
Remarks City and town names are evaluated. Kanji and kana are matched. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 4

Sensitivities
65-74
札幌市白石区菊水元町八条 1
札幌市白石区北郷三条 1
札幌市中央区中島公園 2
利尻郡利尻町 3
利尻郡利尻富士町 4
帯広市空港南町 5
Remarks City names are evaluated. Town names are ignored. Kanji and kana are matched.
  Input Cluster ID
Example 5

Sensitivities
55-64
札幌市白石区菊水元町八条 1
札幌市白石区北郷三条 1
札幌市中央区中島公園 2
利尻郡利尻町 3
利尻郡利尻富士町 3
帯広市空港南町 4
Remarks City names are evaluated. Town names are ignored. Kanji and kana are matched. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 6

Sensitivities
50-54
札幌市白石区菊水元町八条 1
札幌市中央区中島公園 1
帯広市空港南町 2
Remarks Only primary city names are evaluated. Secondary city and town information are ignored.

 

City - State/Province - Postal Code
Description The City - State/Province - Postal Code match definition generates match codes which can be used to cluster records containing last line address information, which typically includes postal code, prefecture, and city information.
Max Length of Match Code 104 characters
  Input Cluster ID
Example 1

Sensitivities
90-100
104-0054 東京都中央区勝どき 1
104-0054 トウキョウトチュウオウクカチドキ 1
アバシリグンオオゾラチョウメマンベツニシ2ジョウ 2
アバシリグンオオゾラチョウメマンベツニシ3ジョウ 3
Remarks All components of the input string are evaluated. Kanji and kana are matched.
  Input Cluster ID
Example 2

Sensitivities
80-89
104-0054 東京都中央区勝どき 1
104-0054 トウキョウトチュウオウクカチドキ 1
ホッカイドウアバシリグンオオゾラチョウメマンベツニシ2ジョウ 2
ホッカイドウアバシリグンオオゾラチョウメマンベツニシ3ジョウ 2
神奈川県横浜市鶴見区市場下町 3
神奈川県横浜市鶴見区市場東中町 4
Remarks All components of the input string are evaluated. Kanji and kana are matched. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 3

Sensitivities
70-79
〒230 神奈川県横浜市鶴見区市場下町 1
〒230 神奈川県横浜市鶴見区市場東中町 1
〒230 神奈川県横浜市鶴見区汐入町 1
〒230 神奈川県横浜市磯子区杉田 2
Remarks All components of the input string are evaluated except for the town name. Kanji and kana are matched. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 4

Sensitivities
60-69
〒230-0043 神奈川県横浜市鶴見区汐入町 1
〒230-0073 神奈川県横浜市磯子区杉田 1
〒212 神奈川県川崎市幸区南加瀬 2
〒212 神奈川県鎌倉市幸区南加瀬 3
Remarks Only the postal code, prefecture, and primary city name are evaluated. Kanji and kana are matched. Note that fewer characters in the address are considered as the sensitivity is lowered.
  Input Cluster ID
Example 5

Sensitivities
50-59
〒212 神奈川県川崎市幸区南加瀬 1
〒212 神奈川県鎌倉市幸区南加瀬 1
Remarks Only the postal code, prefecture, and primary city name are evaluated. Kanji and kana are matched. Note that fewer characters in the address are considered as the sensitivity is lowered.

 

Date
Description The Date match definition generates match codes which can be used to cluster records containing date information.
Max Length of Match Code 15 characters
  Input Cluster ID
Example 1

Sensitivities
85-100
2001/3/14 1
2001年3月14日 1
3/14/2001 1
14-mar-01 1
H13.3.14 1
平成十三年三月十四日 1
2001/3/14 1
平成元年六月三十日 2
昭和64年六月三十日 2
2001/12/14 3
2001/12/15 4
Remarks All digits of the year, month, and day are evaluated. Full-width and half-width characters match. Kanji numerals and Arabic numerals match. Any separators (including Japanese characters) match. Names of months match the corresponding digits that represent those months. Japanese Nengo years and Western years match. When the day and year are ambiguous, the last number is assumed to be the year. Two-digit sequences in the range 00-29 are assumed to represent the years 2000-2029, and two-digit sequences in the range 30-99 are assumed to represent the years 1930-1999. When a year belongs to two Japanese Nengo, the two Nengo years match.
  Input Cluster ID
Example 2

Sensitivities
80-84
2001/12/14 1
2001/12/15 1
2001/12/24 2
Remarks All digits of the year and month are evaluated. Only one digit of the day is evaluated. Full-width and half-width characters match. Kanji numerals and Arabic numerals match. Any separators (including Japanese characters) match. Names of months match the corresponding digits that represent those months. Japanese Nengo years and Western years match. When the day and year are ambiguous, the last number is assumed to be the year. Two-digit sequences in the range 00-29 are assumed to represent the years 2000-2029, and two-digit sequences in the range 30-99 are assumed to represent the years 1930-1999. When a year belongs to two Japanese Nengo, the two Nengo years match.
  Input Cluster ID
Example 3

Sensitivities
75-79
2001/12/15 1
2001/12/24 1
2001/11/14 2
Remarks All digits of the year and month are evaluated. The day is ignored. Full-width and half-width characters match. Kanji numerals and Arabic numerals match. Any separators (including Japanese characters) match. Names of months match the corresponding digits that represent those months. Japanese Nengo years and Western years match. When the day and year are ambiguous, the last number is assumed to be the year. Two-digit sequences in the range 00-29 are assumed to represent the years 2000-2029, and two-digit sequences in the range 30-99 are assumed to represent the years 1930-1999. When a year belongs to two Japanese Nengo, the two Nengo years match.
  Input Cluster ID
Example 4

Sensitivities
70-74
2001/12/24 1
2001/11/14 1
2001/9/14 2
Remarks All digits of the year are evaluated. Only one digit of the month is evaluated. The day is ignored. Full-width and half-width characters match. Kanji numerals and Arabic numerals match. Any separators (including Japanese characters) match. Names of months match the corresponding digits that represent those months. Japanese Nengo years and Western years match. When the day and year are ambiguous, the last number is assumed to be the year. Two-digit sequences in the range 00-29 are assumed to represent the years 2000-2029, and two-digit sequences in the range 30-99 are assumed to represent the years 1930-1999. When a year belongs to two Japanese Nengo, the two Nengo years match.
  Input Cluster ID
Example 5

Sensitivities
65-69
2001/11/14 1
2001/9/14 1
2002/12/14 2
Remarks All digits of the year are evaluated. The month and day are ignored. Full-width and half-width characters match. Kanji numerals and Arabic numerals match. Any separators (including Japanese characters) match. Names of months match the corresponding digits that represent those months. Japanese Nengo years and Western years match. When the day and year are ambiguous, the last number is assumed to be the year. Two-digit sequences in the range 00-29 are assumed to represent the years 2000-2029, and two-digit sequences in the range 30-99 are assumed to represent the years 1930-1999. When a year belongs to two Japanese Nengo, the two Nengo years match.
  Input Cluster ID
Example 6

Sensitivities
60-64
2001/9/14 1
2002/12/14 1
2012/12/14 2
Remarks Only the first three digits of the year are evaluated. The month and day are ignored. Full-width and half-width characters match. Kanji numerals and Arabic numerals match. Any separators (including Japanese characters) match. Names of months match the corresponding digits that represent those months. Japanese Nengo years and Western years match. When the day and year are ambiguous, the last number is assumed to be the year. Two-digit sequences in the range 00-29 are assumed to represent the years 2000-2029, and two-digit sequences in the range 30-99 are assumed to represent the years 1930-1999. When a year belongs to two Japanese Nengo, the two Nengo years match.
  Input Cluster ID
Example 7

Sensitivities
50-59
2002/12/14 1
2012/12/14 1
1990/12/14 2
Remarks Only the first two digits of the year are evaluated. The month and day are ignored. Full-width and half-width characters match. Kanji numerals and Arabic numerals match. Any separators (including Japanese characters) match. Names of months match the corresponding digits that represent those months. Japanese Nengo years and Western years match. When the day and year are ambiguous, the last number is assumed to be the year. Two-digit sequences in the range 00-29 are assumed to represent the years 2000-2029, and two-digit sequences in the range 30-99 are assumed to represent the years 1930-1999. When a year belongs to two Japanese Nengo, the two Nengo years match.
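The year rules and the sensitivity bands in the Date examples can be summarized in a small sketch. It is an illustration under stated assumptions, not the QKB's match-code format: the function names and the YYYYMMDD key are hypothetical, and only the Heisei and Showa eras that appear in the examples are covered. The idea is that each lower sensitivity band compares a shorter prefix of the zero-padded date.

```python
# Illustrative only; the real match codes are produced by the QKB, not by this code.

# Western year = offset + Nengo year (Heisei 1 = 1989, Showa 1 = 1926).
ERA_OFFSETS = {"H": 1988, "S": 1925}

# (lowest sensitivity in the band, number of leading YYYYMMDD digits compared)
BANDS = ((85, 8), (80, 7), (75, 6), (70, 5), (65, 4), (60, 3), (50, 2))


def expand_two_digit_year(yy: int) -> int:
    """Apply the documented pivot: 00-29 -> 2000-2029, 30-99 -> 1930-1999."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    return 2000 + yy if yy <= 29 else 1900 + yy


def nengo_to_western(era: str, era_year: int) -> int:
    """Convert a Heisei ('H') or Showa ('S') year to a Western year."""
    return ERA_OFFSETS[era] + era_year


def date_key(year: int, month: int, day: int, sensitivity: int) -> str:
    """Return the prefix of YYYYMMDD that is compared at the given sensitivity."""
    full = f"{year:04d}{month:02d}{day:02d}"
    for lower, length in BANDS:
        if sensitivity >= lower:
            return full[:length]
    raise ValueError("sensitivity must be 50 or higher")


print(expand_two_digit_year(1))    # 2001, as in 14-mar-01
print(nengo_to_western("H", 13))   # 2001, as in H13.3.14
print(nengo_to_western("S", 64))   # 1989, the same year as Heisei 1
print(date_key(2001, 12, 14, 80) == date_key(2001, 12, 15, 80))  # True
print(date_key(2001, 12, 14, 80) == date_key(2001, 12, 24, 80))  # False
```

Zero-padding the month and day is what makes "one digit of the month" behave as in Example 4: 2001/12 and 2001/11 share the leading digit 1, while 2001/9 is padded to 09 and falls into a different cluster.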

 

Name
Description The Name match definition generates match codes which can be used to cluster records containing names of individuals.
Max Length of Match Code 20 characters
  Input Cluster ID
Example 1

Sensitivities
90-100
田中サチコ 1
田中サチコ 1
田中さちこ 2
渡辺二郎 3
渡邊二郎 4
Remarks The family name and given name are evaluated. Half-width and Full-width Katakana are matched.
  Input Cluster ID
Example 2

Sensitivities
85-89
田中サチコ 1
田中さちこ 1
渡辺二郎 2
渡邊二郎 2
伊藤三郎 3
伊東三郎 4
いとう三郎 5
Remarks The family name and given name are evaluated. "Old style" Kanji representations are matched to modern Kanji representations. Half-width and Full-width Katakana are matched. Katakana, Hiragana, and Romaji are matched.
  Input Cluster ID
Example 3

Sensitivities
80-84
伊藤三郎 1
伊東三郎 1
いとう三郎 1
佐藤美幸 2
佐藤武 3
Remarks The family name and given name are evaluated. "Old style" Kanji representations are matched to modern Kanji representations. Different variations of Kanji are matched. Half-width and Full-width Katakana are matched. Katakana, Hiragana, and Romaji are matched. Some Kanji family names are matched with their Katakana, Hiragana, and Romaji representations. Note that fewer characters in the name are considered as the sensitivity is lowered.
  Input Cluster ID
Example 4

Sensitivities
75-79
伊藤三郎 1
伊東三郎 1
いとう三郎 1
佐藤美幸 2
佐藤武 3
Remarks The family name and given name are evaluated. "Old style" Kanji representations are matched to modern Kanji representations. Different variations of Kanji are matched. Half-width and Full-width Katakana are matched. Katakana, Hiragana, and Romaji are matched. Some Kanji family names are matched with their Katakana, Hiragana, and Romaji representations.
  Input Cluster ID
Example 5

Sensitivities
70-74
伊藤三郎 1
伊東三郎 1
いとう三郎 1
佐藤美幸 2
佐藤武 3
Remarks The family name and given name are evaluated. "Old style" Kanji representations are matched to modern Kanji representations. Different variations of Kanji are matched. Half-width and Full-width Katakana are matched. Katakana, Hiragana, and Romaji are matched. Some Kanji family names are matched with their Katakana, Hiragana, and Romaji representations.
  Input Cluster ID
Example 6

Sensitivities
65-69
伊藤三郎 1
伊東三郎 1
いとう三郎 1
佐藤美幸 2
佐藤武 3
Remarks The family name and given name are evaluated. "Old style" Kanji representations are matched to modern Kanji representations. Different variations of Kanji are matched. Half-width and Full-width Katakana are matched. Katakana, Hiragana, and Romaji are matched. Some Kanji family names are matched with their Katakana, Hiragana, and Romaji representations.
  Input Cluster ID
Example 7

Sensitivities
60-64
伊藤三郎 1
伊東三郎 1
いとう三郎 1
佐藤美幸 2
佐藤武 3
Remarks The family name and given name are evaluated. "Old style" Kanji representations are matched to modern Kanji representations. Different variations of Kanji are matched. Half-width and Full-width Katakana are matched. Katakana, Hiragana, and Romaji are matched. Some Kanji family names are matched with their Katakana, Hiragana, and Romaji representations.
  Input Cluster ID
Example 8

Sensitivities
55-59
佐藤美幸 1
佐藤武 2
Remarks The family name and given name are evaluated. "Old style" Kanji representations are matched to modern Kanji representations. Different variations of Kanji are matched. Half-width and Full-width Katakana are matched. Katakana, Hiragana, and Romaji are matched. Some Kanji family names are matched with their Katakana, Hiragana, and Romaji representations.
  Input Cluster ID
Example 9

Sensitivities
50-54
佐藤美幸 1
佐藤武 1
Remarks Only the family name is evaluated. "Old style" Kanji representations are matched to modern Kanji representations. Different variations of Kanji are matched. Half-width and Full-width Katakana are matched. Katakana, Hiragana, and Romaji are matched. Some Kanji family names are matched with their Katakana, Hiragana, and Romaji representations.
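The script-level matching described in these remarks (half-width versus full-width Katakana, and Hiragana versus Katakana) can be sketched with standard Unicode operations. This is a simplified illustration, not the QKB's algorithm: the function name fold_kana is hypothetical, and the sketch does not attempt Kanji variants, old-style Kanji, or Romaji.

```python
import unicodedata


def fold_kana(text: str) -> str:
    """Fold half-width Katakana and Hiragana into full-width Katakana."""
    # NFKC maps half-width Katakana (U+FF66-U+FF9F) to their full-width forms.
    text = unicodedata.normalize("NFKC", text)
    # Hiragana U+3041-U+3096 sit exactly 0x60 below the matching Katakana.
    return "".join(
        chr(ord(ch) + 0x60) if "\u3041" <= ch <= "\u3096" else ch
        for ch in text
    )


print(fold_kana("田中ｻﾁｺ"))    # 田中サチコ
print(fold_kana("田中さちこ"))  # 田中サチコ
```

Folding every variant to a single canonical script before comparison is one common way to obtain this kind of matching; whether the QKB works this way internally is not stated in the source.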

 

Organization
Description The Organization match definition generates match codes which can be used to cluster records containing organization names.
Max Length of Match Code 35 characters
  Input Cluster ID
Example 1

Sensitivities
95-100
タナカ鉄工株式会社 1
タナカ鉄工株式会社 1
ソニー株式会社大阪第1工場 2
ソニー株式会社大阪第2工場 3
㈱ソニー北九州工場 4
  Input Cluster ID
Example 2

Sensitivities
90-94
ソニー株式会社大阪第1工場 1
ソニー株式会社大阪第2工場 1
㈱ソニー北九州工場 2
國學院大學 3
国学院大学 4
株式会社くぼた 5
株式会社クボタ 6
Remarks For sensitivities 90-100, Organization name and site information are evaluated. Half-width and full-width Katakana are matched. Company legal forms are ignored.
  Input Cluster ID
Example 3


Sensitivities
85-89
ソニー株式会社大阪第1工場 1
ソニー株式会社大阪第2工場 1
㈱ソニー北九州工場 1
國學院大學 2
国学院大学 2
株式会社くぼた 3
株式会社クボタ 3
インシュアランスとうきょう 4
インシュアランスとうきゅう 5
  Input Cluster ID
Example 4

Sensitivities
80-84
インシュアランスとうきょう 1
インシュアランスとうきゅう 1
エービーシープライム生命保険 2
エービーシープライム損害保険 3
  Input Cluster ID
Example 5

Sensitivities
75-79
エービーシープライム生命保険 1
エービーシープライム損害保険 1
KOGAファイナンス 2
KOGAファイナンシング 3
  Input Cluster ID
Example 6

Sensitivities
70-74
KOGAファイナンス 1
KOGAファイナンシング 1
わたなべ金属加工株式会社 2
わたなべ金属鍛造株式会社 3
  Input Cluster ID
Example 7

Sensitivities
65-69
わたなべ金属加工株式会社 1
わたなべ金属鍛造株式会社 1
(株)テレビキャスト 2
(株)テレビキョーワ 3
  Input Cluster ID
Example 8

Sensitivities
60-64
(株)テレビキャスト 1
(株)テレビキョーワ 1
トヨタ自動車 2
トヨタホーム 3
  Input Cluster ID
Example 9

Sensitivities
55-59
トヨタ自動車 1
トヨタホーム 1
パナソニック 2
パナホーム 3
  Input Cluster ID
Example 10

Sensitivities
50-54
パナソニック 1
パナホーム 1
Remarks For sensitivities 50-89, Organization name is evaluated. Half-width and full-width Katakana are matched. Company legal forms are ignored. Old style Kanji and modern Kanji are matched. Katakana and Hiragana are matched.

 

Phone
Description The Phone match definition generates match codes which can be used to cluster records containing phone numbers.
Max Length of Match Code 22 characters
  Input Cluster ID
Example 1

Sensitivities
95-100
(03)1234-5678 1
(03)1234-5678 1
直通(+81)3-12345678 1
0521234567 ext.123 2
0521234567 ext.124 3
0521234567 4
1234567 5
0591234567 6
0541231234 7
0541231235 8
Remarks  
  Input Cluster ID
Example 2

Sensitivities
90-94
(03)1234-5678 1
(03)1234-5678 1
直通(+81)3-12345678 1
0521234567 ext.123 2
0521234567 ext.124 2
0521234567 3
1234567 4
0591234567 5
0541231234 6
0541231235 7
Remarks  
  Input Cluster ID
Example 3

Sensitivities
85-89
(03)1234-5678 1
(03)1234-5678 1
直通(+81)3-12345678 1
0521234567 ext.123 2
0521234567 ext.124 2
0521234567 2
1234567 3
0591234567 4
0541231234 5
0541231235 5
Remarks  
  Input Cluster ID
Example 4

Sensitivities
70-84
(03)1234-5678 1
(03)1234-5678 1
直通(+81)3-12345678 1
0521234567 ext.123 2
0521234567 ext.124 2
0521234567 2
1234567 3
0591234567 4
0541231234 5
Remarks  
  Input Cluster ID
Example 5

Sensitivities
65-69
(03)1234-5678 1
(03)1234-5678 1
直通(+81)3-12345678 1
0521234567 ext.123 2
0521234567 ext.124 2
0521234567 2
0591234567 2
0541231234 3
1234567 4
Remarks  
  Input Cluster ID
Example 6

Sensitivities
50-64
(03)1234-5678 1
(03)1234-5678 1
直通(+81)3-12345678 1
(04)12345678 ext.123 1
(04)12345678 ext.124 1
(04)12345678 1
12345678 1
(06)12345678 1
(06)12349999 1
Remarks Note that the number of digits retained in the match code for the Country Code, Area Code, and Extension tokens depends on the sensitivity level. The number of digits retained in the match code for the Base Number depends on the sensitivity level and the number of digits in the Base Number input.

 

Postal Code
Description The Postal Code match definition generates match codes which can be used to cluster records containing postal codes.
Max Length of Match Code 15 characters
  Input Cluster ID
Example 1

Sensitivities
85-100
104-0052 1
〒104-0052 1
郵便番号一〇四の〇〇五二 1
1040053 2
1040054 3
1040065 4
1040123 5
1050123 6
Remarks Primary and secondary postal codes are evaluated.
  Input Cluster ID
Example 2

Sensitivities
80-84
104-0052 1
〒104-0052 1
郵便番号一〇四の〇〇五二 1
1040053 1
1040054 1
1040065 2
1040123 3
1050123 4
Remarks The last digit in secondary postal code is ignored.
  Input Cluster ID
Example 3

Sensitivities
70-79
104-0052 1
〒104-0052 1
郵便番号一〇四の〇〇五二 1
1040053 1
1040054 1
1040065 1
1040123 2
1050123 3
Remarks The last two digits in secondary postal code are ignored.
  Input Cluster ID
Example 4

Sensitivities
50-69
104-0052 1
〒104-0052 1
郵便番号一〇四の〇〇五二 1
1040053 1
1040054 1
1040065 1
1040123 1
1050123 2
Remarks Only primary postal code is evaluated.
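The four sensitivity bands above amount to comparing progressively shorter prefixes of the seven-digit postal code. The sketch below illustrates that idea. It is not the QKB's match-code format; the function names are hypothetical, and the normalization step (NFKC plus a Kanji-numeral table) is an assumption about how the different input spellings could be reduced to plain digits.

```python
import unicodedata

# Kanji numerals that appear in postal codes, mapped to ASCII digits.
KANJI_DIGITS = str.maketrans("〇一二三四五六七八九", "0123456789")

# (lowest sensitivity in the band, number of leading digits compared)
BANDS = ((85, 7), (80, 6), (70, 5), (50, 3))


def postal_digits(raw: str) -> str:
    """Reduce a postal code in any supported spelling to its ASCII digits."""
    text = unicodedata.normalize("NFKC", raw).translate(KANJI_DIGITS)
    # Drop the postal mark, labels, hyphens, and any other non-digit characters.
    return "".join(ch for ch in text if ch in "0123456789")


def postal_key(raw: str, sensitivity: int) -> str:
    """Return the digit prefix that is compared at the given sensitivity."""
    digits = postal_digits(raw)
    for lower, length in BANDS:
        if sensitivity >= lower:
            return digits[:length]
    raise ValueError("sensitivity must be 50 or higher")


print(postal_digits("〒104-0052"))                # 1040052
print(postal_digits("郵便番号一〇四の〇〇五二"))    # 1040052
print(postal_key("1040053", 80) == postal_key("1040054", 80))  # True
print(postal_key("1040054", 80) == postal_key("1040065", 80))  # False
```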

 

Prefecture
Description The Prefecture match definition generates match codes which can be used to cluster records containing prefecture names.
Max Length of Match Code 15 characters
  Input Cluster ID
Example 1

Sensitivities
90-100
愛知県 1
あいちけん 1
アイチケン 1
アイチ 1
カナガワケン 2
カナガクケン 3
カナカワケン 4
Remarks Kanji and kana are matched.
  Input Cluster ID
Example 2

Sensitivities
80-89
愛知県 1
あいちけん 1
アイチケン 1
アイチ 1
カナガワケン 2
カナガクケン 2
カナカワケン 3
Remarks Kanji and kana are matched. Note that fewer characters are considered as the sensitivity decreases.
  Input Cluster ID
Example 3

Sensitivities
50-79
愛知県 1
あいちけん 1
アイチケン 1
アイチ 1
カナガワケン 2
カナガクケン 2
カナカワケン 2
Remarks Kanji and kana are matched. Note that fewer characters are considered as the sensitivity decreases.

Parse Definitions

Address
Description The Parse definition for Address parses address information.
Output Tokens Block
Building
Floor
Room
PO Box
Additional Info
  Input Output
Example 1 1-13-1 イヌイビル・カチドキ8F Block 1-13-1
Building イヌイビル・カチドキ
Floor 8F
Room  
PO Box  
Additional Info  
  Input Output
Example 2 二丁目五番八十二号山王パークビルB館5階502 (総務部内) Block 二丁目五番八十二号
Building 山王パークビルB館
Floor 5階
Room 502
PO Box  
Additional Info (総務部内)
  Input Output
Example 3 新東京郵便局 私書箱123 Block  
Building  
Floor  
Room  
PO Box 新東京郵便局 私書箱123
Additional Info  
Remarks The Parse definition for Address recognizes both ASCII and Kanji numerals.

 

Address (Full)
Description The Parse definition for Address (Full) parses full two-line addresses.
Output Tokens Postal Code
Prefecture
City
Town
Block
Building
Floor
Room
PO Box
Additional Info
  Input Output
Example 1 104-0054 東京都中央区勝どき1-13-1 イヌイビルカチドキ8F 801号室 (開発部内) Postal Code 104-0054
Prefecture 東京都
City 中央区
Town 勝どき
Block 1-13-1
Building イヌイビルカチドキ
Floor 8F
Room 801号室
PO Box  
Additional Info 開発部内
  Input Output
Example 2 三四五-六七八九 新潟南魚沼郡湯沢町神立二丁目五番八十二号山王パークビルB館5階502 (総務部内) Postal Code 三四五-六七八九
Prefecture 新潟
City 南魚沼郡湯沢町
Town 神立
Block 二丁目五番八十二号
Building 山王パークビルB館
Floor 5階
Room 502
PO Box  
Additional Info 総務部内
  Input Output
Example 3 〒163-8799 東京都新宿区 新宿郵便局 私書箱第123号 Postal Code 〒163-8799
Prefecture 東京都
City 新宿区
Town  
Block  
Building  
Floor  
Room  
PO Box 新宿郵便局 私書箱第123号
Additional Info  
Remarks Prefecture and city names are recognized with or without an indicator keyword. Both ASCII and Kanji numerals are recognized.

 

Address (Global)
Description The Address (Global) parse definition parses addresses into a globally recognized set of tokens.

Output Tokens Recipient
Building/Site
Street
Extension
PO Box
Additional Info
  Input Output
Example 1 1-13-1 イヌイビル・カチドキ8F Recipient  
Building/Site イヌイビル・カチドキ
Street 1-13-1
Extension 8F
PO Box  
Additional Info  
  Input Output
Example 2 二丁目五番八十二号山王パークビルB館5階502 (総務部内) Recipient  
Building/Site 山王パークビルB館
Street 二丁目五番八十二号
Extension 5階 502
PO Box  
Additional Info (総務部内)
  Input Output
Example 3 新東京郵便局 私書箱123 Recipient  
Building/Site  
Street  
Extension  
PO Box 新東京郵便局 私書箱123
Additional Info  
Remarks Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

The Address (Global) (v23) parse definition is now deprecated and will be removed in a future release of the QKB.

The Address (Global) parse definition has been replaced with a copy of the Address (Global) (v23) definition, which takes advantage of the new tokens and updated processing. If you changed your jobs to use Address (Global) (v23), change them back to Address (Global).

 

Address (Global) (v23)
Description The Address (Global) (v23) parse definition parses addresses into a globally recognized set of tokens.

Output Tokens Recipient
Building/Site
Street
Extension
PO Box
Additional Info
  Input Output
Example 1 1-13-1 イヌイビル・カチドキ8F Recipient  
Building/Site イヌイビル・カチドキ
Street 1-13-1
Extension 8F
PO Box  
Additional Info  
  Input Output
Example 2 二丁目五番八十二号山王パークビルB館5階502 (総務部内) Recipient  
Building/Site 山王パークビルB館
Street 二丁目五番八十二号
Extension 5階 502
PO Box  
Additional Info (総務部内)
  Input Output
Example 3 新東京郵便局 私書箱123 Recipient  
Building/Site  
Street  
Extension  
PO Box 新東京郵便局 私書箱123
Additional Info  
Remarks Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

The Address (Global) (v23) parse definition is now deprecated and will be removed in a future release of the QKB.

The Address (Global) parse definition has been replaced with a copy of the Address (Global) (v23) definition, which takes advantage of the new tokens and updated processing. If you changed your jobs to use Address (Global) (v23), change them back to Address (Global).

 

City
Description The Parse definition for City parses city and town names.
Output Tokens City
Town
  Input Output
Example 1 横浜市神奈川区白幡東町 City 横浜市神奈川区
Town 白幡東町
  Input Output
Example 2 チュウオウクカチドキ City チュウオウク
Town カチドキ
  Input Output
Example 3 名古屋千種今池南 City 名古屋千種
Town 今池南
Remarks This definition recognizes Kanji, Hiragana, full-width Katakana, and half-width Katakana. It recognizes city names with or without identifier keywords ("市", "区", and so on).

 

City - State/Province - Postal Code
Description The Parse definition for City - State/Province - Postal Code parses address "last line" data, which typically includes postal code, prefecture, and city information.
Output Tokens Postal Code
Prefecture
City
Town
  Input Output
Example 1 104-0054 東京都中央区勝どき Postal Code 104-0054
Prefecture 東京都
City 中央区
Town 勝どき
  Input Output
Example 2 〒1040054 トウキョウトチュウオウクカチドキ Postal Code 〒1040054
Prefecture トウキョウト
City チュウオウク
Town カチドキ
Remarks  

 

City - State/Province - Postal Code (Global)
Description The Parse definition for City - State/Province - Postal Code (Global) parses address "last line" data into a globally recognized set of tokens.
Output Tokens City
State/Province
Postal Code
Additional Info
  Input Output
Example 1 3280011 栃木県栃木市大宮町 City 栃木市大宮町
State/Province 栃木県
Postal Code 3280011
Additional Info  
  Input Output
Example 2 〒三二八ー〇〇一一栃木県栃木市大宮町 City 栃木市大宮町
State/Province 栃木県
Postal Code 〒三二八ー〇〇一一
Additional Info  
Remarks Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

 

Name
Description The Parse definition for Name parses names of individuals.
Output Tokens Family Name
Given Name
Name Suffix
Title/Additional Info
  Input Output
Example 1 鈴木一郎様 Family Name 鈴木
Given Name 一郎
Name Suffix 様
Title/Additional Info  
  Input Output
Example 2 医学博士すずきいちろう殿 Family Name すずき
Given Name いちろう
Name Suffix 殿
Title/Additional Info 医学博士
  Input Output
Example 3 営業課長スズキイチロウ (一級建築士) Family Name スズキ
Given Name イチロウ
Name Suffix  
Title/Additional Info 営業課長 (一級建築士)
Remarks  

 

Name (Global)
Description The Parse definition for Name (Global) parses names of individuals into a globally recognized set of tokens.
Output Tokens Prefix
Given Name
Middle Name
Family Name
Suffix
Title/Additional Info
  Input Output
Example 1 鈴木一郎様 Prefix  
Given Name 一郎
Middle Name  
Family Name 鈴木
Suffix 様
Title/Additional Info  
  Input Output
Example 2 医学博士すずきいちろう殿 Prefix  
Given Name いちろう
Middle Name  
Family Name すずき
Suffix 殿
Title/Additional Info 医学博士
  Input Output
Example 3 営業課長スズキイチロウ (一級建築士) Prefix  
Given Name イチロウ
Middle Name  
Family Name スズキ
Suffix  
Title/Additional Info 営業課長 (一級建築士)
Remarks Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

 

Organization
Description The Parse definition for Organization parses organization names.
Output Tokens Name
Legal Form
Site
Additional Info
  Input Output
Example 1 株式会社SASジャパン東京本社 開発部 Name SASジャパン
Legal Form 株式会社
Site 東京本社
Additional Info 開発部
  Input Output
Example 2 山田製造(株)西大阪支社 (カスタマーサポート係) Name 山田製造
Legal Form (株)
Site 西大阪支社
Additional Info (カスタマーサポート係)
Remarks  

 

Organization (Global)
Description The Parse definition for Organization (Global) parses organization names into a globally recognized set of tokens.
Output Tokens Name
Legal Form
Site
Additional Info
  Input Output
Example 1 株式会社SASジャパン東京本社 開発部 Name SASジャパン
Legal Form 株式会社
Site 東京本社
Additional Info 開発部
  Input Output
Example 2 山田製造(株)西大阪支社 (カスタマーサポート係) Name 山田製造
Legal Form (株)
Site 西大阪支社
Additional Info (カスタマーサポート係)
Remarks Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

 

Phone
Description The Parse definition for Phone parses phone numbers into a set of tokens.
Output Tokens Country Code
Area Code
Base Number
Extension
Line Type
Additional Info
  Input Output
Example 1 Tel(+81) 03-1234-5678 ext.123 (evenings and weekends) Country Code (+81)
Area Code 03
Base Number 1234-5678
Extension 123
Line Type Tel
Additional Info (evenings and weekends)
  Input Output
Example 2 携帯 03ー1234ー5678 Country Code  
Area Code 03
Base Number 1234ー5678
Extension  
Line Type 携帯
Additional Info  
  Input Output
Example 3 0521234567123 Country Code  
Area Code 052
Base Number 1234567
Extension 123
Line Type  
Additional Info  
Remarks  

 

Phone (Global)
Description The Parse definition for Phone (Global) parses phone numbers into a globally recognized set of tokens.
Output Tokens Country Code
Area Code
Base Number
Extension
Line Type
Additional Info
  Input Output
Example 1 Tel(+81) 03-1234-5678 ext.123 (evenings and weekends) Country Code (+81)
Area Code 03
Base Number 1234-5678
Extension 123
Line Type Tel
Additional Info (evenings and weekends)
  Input Output
Example 2 携帯 03ー1234ー5678 Country Code  
Area Code 03
Base Number 1234ー5678
Extension  
Line Type 携帯
Additional Info  
  Input Output
Example 3 0521234567123 Country Code  
Area Code 052
Base Number 1234567
Extension 123
Line Type  
Additional Info  
Remarks Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales.

Pattern Analysis Definitions

None.

Standardization Definitions

Address
Description The Standardization definition for Address standardizes address information.
  Input Output
Example 1 1ー13ー1 イヌイビル・カチドキ8F 501号室 1-13-1 イヌイビル・カチドキ 8F 501
  Input Output
Example 2 "二丁目五番八十二号山王パークビルB館5階502(総務部内)" 2-5-82 山王パークビルB館 5F 502 総務部内
Remarks Numeric expressions are standardized. Full-width alphanumeric characters are converted to half-width characters. Building names written in kana are not transliterated (except that half-width Katakana is converted to full-width Katakana).

 

Address (Full)
Description The Standardization definition for Address (Full) standardizes full two-line addresses.
  Input Output
Examples 1040054 トウキョウ都中央区勝どき1-13-1 イヌイビルカチドキ8F 801号室 (開発部内) 104-0054 東京都中央区勝どき 1-13-1 イヌイビルカチドキ 8F 801 開発部内
"三四五-六七八九 新潟南魚沼郡湯沢町神立二丁目五番八十二号山王パークビルB館5階502 (総務部内) " 345-6789 新潟南魚沼郡湯沢町神立 2-5-82 山王パークビルB館 5F 502 総務部内
〒163-8799 東京都新宿区 新宿郵便局 私書箱第123号 163-8799 東京都新宿区 郵便事業株式会社 新宿支店 私書箱123号
Remarks Numeric expressions are standardized. Prefecture and city names are converted from kana to kanji. Full-width alphanumeric characters are converted to half-width characters. Prefecture and city identifier keywords are added when possible. Non-logical characters are removed: quotes, blanks, and so on. Arabic numerals in city and town names are converted to kanji (for example, 三番町). Kanji numerals in block numbers are converted to Arabic numerals (for example, 4-3-5). Building names written in kana are not transliterated (except that half-width katakana is converted to full-width katakana).
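
The conversion of kanji numerals in block numbers can be illustrated with a minimal Python sketch. This is not the QKB implementation. The names `kanji_to_int` and `standardize_block_number` are hypothetical, and the sketch assumes block numbers written in the 丁目/番/号 form with values below 10,000.

```python
import re

# Kanji digits and the multiplier characters used in numbers below 10,000.
KANJI_DIGITS = {"〇": 0, "一": 1, "二": 2, "三": 3, "四": 4,
                "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
KANJI_UNITS = {"十": 10, "百": 100, "千": 1000}


def kanji_to_int(text: str) -> int:
    """Convert a kanji numeral below 10,000 (for example 八十二) to an int."""
    total = 0
    current = 0  # digits waiting for a unit such as 十
    for ch in text:
        if ch in KANJI_DIGITS:
            # Positional style (一〇四) accumulates like decimal digits.
            current = current * 10 + KANJI_DIGITS[ch]
        elif ch in KANJI_UNITS:
            # A bare unit (十) counts as 1 x unit.
            total += (current or 1) * KANJI_UNITS[ch]
            current = 0
        else:
            raise ValueError(f"not a kanji numeral: {ch!r}")
    return total + current


NUM = r"([〇一二三四五六七八九十百千]+)"
# Hypothetical pattern: chome, ban, and go components, all in kanji.
BLOCK_PATTERN = re.compile(NUM + "丁目" + NUM + "番" + NUM + "号")


def standardize_block_number(street: str) -> str:
    """Rewrite 'N丁目N番N号' written in kanji as 'N-N-N' in Arabic digits."""
    def repl(match: re.Match) -> str:
        return "-".join(str(kanji_to_int(g)) for g in match.groups())
    return BLOCK_PATTERN.sub(repl, street)


print(standardize_block_number("二丁目五番八十二号"))  # 2-5-82
```

The sketch covers only the numeral conversion. Spacing between tokens, kana-to-kanji conversion, and identifier keywords are outside its scope.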

 

City
Description The Standardization definition for City standardizes city and town names.
  Input Output
Examples 名古屋千種今池南 名古屋市千種区今池南
チュウオウクカチドキ 中央区勝どき
"札幌市 中央区 北1条東" 札幌市中央区北一条東
Remarks Adds city identifier keywords when possible. Converts kana names to kanji. Removes non-logical characters: quotes, blanks, and so on. Chooses between kanji and Arabic numerals in city and town names based on context (for example, 一条, 三番町, 1区).

 

City - State/Province - Postal Code
Description The Standardization definition for City - State/Province - Postal Code standardizes Postal Code, Prefecture, and City.
  Input Output
Example 1 〒1051234 "東京港浜松町" 105-1234 東京都港区浜松町
Remarks Adds identifier keywords when possible. Removes non-logical characters: quotes, blanks, and so on.
  Input Output
Example 2 一〇四ノ〇〇五四 トウキョウチュウオウクカチドキ 104-0054 東京都中央区勝どき
Remarks Converts kanji numerals to Arabic numerals. Converts prefecture and city names from kana to kanji.

 

Date (Japanese Calendar)
Description The Standardization definition for Date (Japanese Calendar) standardizes date expressions to Japanese calendar format (Nengo).
  Input Output Explanation
Examples ２００１／３／１４ 平成13年03月14日 Standardize full-width characters to half-width.
2001年3月14日 平成13年03月14日 Standardize calendar identifier to EEYY年MM月DD日 format.
3/14/2001 平成13年03月14日 Consider 4-digit number as year.
14-mar-01 平成13年03月14日 Standardize month name. When the day and year are ambiguous, consider the last number to be the year.
H13.3.14 平成13年03月14日 Standardize Nengo expression.
平成十三年三月十四日 平成13年03月14日 Standardize Kanji numbers.
大正十五年一二月二五日 昭和元年12月25日 When a day is assigned to two nengo, the later of the two nengo is used. For example, 1926/12/25 is in 昭和元年, 1912/07/30 is in 大正元年.
S01年12月25日 昭和元年12月25日 01年 is expressed as 元年.
Remarks This definition supports dates from 1901 (明治34年) to 2050 (平成62年). The definition assumes two-digit years 00-29 are 2000-2029. The definition assumes two-digit years 30-99 are 1930-1999.
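
The Nengo rules in the examples above can be sketched in Python. This is an illustration only, not the QKB logic. The function name `to_nengo` is hypothetical. The sketch uses the era start dates shown in the examples (1912/07/30 for 大正, 1926/12/25 for 昭和) plus 1989/01/08 for 平成, and, like the documented range, it treats 平成 as running through 2050.

```python
from datetime import date

# (era name, first day of the era), newest first. A day shared by two eras
# belongs to the later era, so comparing against the start date is enough.
ERAS = [
    ("平成", date(1989, 1, 8)),
    ("昭和", date(1926, 12, 25)),
    ("大正", date(1912, 7, 30)),
    # Only the year matters for 明治 here: the supported range starts in 1901.
    ("明治", date(1868, 1, 1)),
]

SUPPORTED_FROM = date(1901, 1, 1)
SUPPORTED_TO = date(2050, 12, 31)


def to_nengo(d: date) -> str:
    """Format a Western date as EEYY年MM月DD日, writing year 1 as 元年."""
    if not (SUPPORTED_FROM <= d <= SUPPORTED_TO):
        raise ValueError("date is outside the supported range 1901-2050")
    for name, start in ERAS:
        if d >= start:
            year = d.year - start.year + 1
            year_text = "元" if year == 1 else str(year)
            return f"{name}{year_text}年{d.month:02d}月{d.day:02d}日"
    raise ValueError("no era found")  # not reachable for supported dates


print(to_nengo(date(2001, 3, 14)))    # 平成13年03月14日
print(to_nengo(date(1926, 12, 25)))   # 昭和元年12月25日
```

Parsing the many input formats shown in the table is a separate step and is not covered here.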

 

Date (Western Calendar)
Description The Standardization definition for Date (Western Calendar) standardizes date expressions to Western calendar format.
  Input Output Explanation
Examples ２００１／３／１４ 2001/03/14 Standardize full-width characters to half-width.
2001年3月14日 2001/03/14 Standardize calendar identifier to YYYY/MM/DD format.
3/14/2001 2001/03/14 Consider 4-digit number as year.
14-mar-01 2001/03/14 Standardize month name to digit. When the day and year are ambiguous, consider the last number to be the year.
H13.3.14 2001/03/14 Convert Nengo year to western year.
平成13年3月14日 2001/03/14 Convert Nengo year to western year.
平成十三年三月十四日 2001/03/14 Standardize Kanji numbers.
平成元年三月十四日 1989/03/14  
Remarks This definition supports dates from 1901 (明治34年) to 2050 (平成62年). The definition assumes two-digit years 00-29 are 2000-2029. The definition assumes two-digit years 30-99 are 1930-1999.
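
The two-digit year rule in the remarks amounts to a fixed pivot at 30. A minimal Python sketch follows, with the hypothetical helper name `expand_two_digit_year`. It illustrates the stated rule only and is not taken from the QKB.

```python
def expand_two_digit_year(yy: int) -> int:
    """Expand a two-digit year using the pivot described in the remarks:
    00-29 map to 2000-2029 and 30-99 map to 1930-1999."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year between 0 and 99")
    return 2000 + yy if yy <= 29 else 1900 + yy


print(expand_two_digit_year(1))   # 2001, as in the 14-mar-01 example
print(expand_two_digit_year(30))  # 1930
```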

 

Name
Description The Standardization definition for Name standardizes names of individuals.
  Input Output
Examples 鈴木一郎様 鈴木 一郎
"営業課長鈴木 一郎様 (一級建築士)" 鈴木 一郎, 営業課長 一級建築士
すずきいちろう すずき いちろう
Remarks Name suffixes are discarded. A single white space is placed between the family name and the given name. Titles and additional information are placed after the name. Titles are converted to Kanji. Names written in Katakana are left in Katakana (they are not transliterated to Kanji). Similarly, names written in Hiragana are left in Hiragana.

 

Organization
Description The Standardization definition for Organization standardizes organization names.
  Input Output
Examples (株)SASジャパン東京本社 開発部 SASジャパン 株式会社, 東京本社 開発部
"ツキシマクミンセンター" 月島区民センター
Remarks Half-width Katakana is transformed to full-width Katakana. Full-width ASCII characters are transformed to half-width. Legal form information appearing at the beginning of a string is moved to the end of the organization name. Legal forms are converted to long-form. Kana forms of well-known company names are converted to Kanji.

 

Phone
Description The Standardization definition for Phone standardizes phone numbers for domestic use.
  Input Output
Examples 電話 (03)1234-5678 内線123 (03) 1234 5678 x123, Tel
"81312345678" (03) 1234 5678
+1 (919) 447-3000 +1 9194473000
Remarks  

 

Phone (Electronic)
Description The Standardization definition for Phone (Electronic) standardizes phone numbers for automated calling systems.
  Input Output
Examples 電話 (03)1234-5678 内線123 +81312345678
+1-800-HOLIDAY +18004654329
+1-800-DATAFLUX +180032823589
0120 VISITJAPAN +811208474852726
Remarks  
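
The letter examples above are consistent with the standard telephone keypad mapping (ABC=2, DEF=3, and so on). The Python sketch below illustrates that mapping under this assumption. The function name `vanity_to_digits` is hypothetical, and the sketch does not add a country code or drop a domestic trunk prefix as the definition does for Japanese numbers.

```python
# Standard telephone keypad letter groups (assumed to be the mapping in use).
KEYPAD = {"ABC": "2", "DEF": "3", "GHI": "4", "JKL": "5",
          "MNO": "6", "PQRS": "7", "TUV": "8", "WXYZ": "9"}
LETTER_TO_DIGIT = {letter: digit
                   for letters, digit in KEYPAD.items()
                   for letter in letters}


def vanity_to_digits(text: str) -> str:
    """Keep a leading '+', keep ASCII digits, map letters to keypad digits,
    and drop every other character (hyphens, blanks, brackets)."""
    out = []
    for ch in text.upper():
        if ch in "0123456789":
            out.append(ch)
        elif ch in LETTER_TO_DIGIT:
            out.append(LETTER_TO_DIGIT[ch])
        elif ch == "+" and not out:
            out.append(ch)
    return "".join(out)


print(vanity_to_digits("+1-800-HOLIDAY"))   # +18004654329
print(vanity_to_digits("+1-800-DATAFLUX"))  # +180032823589
```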

 

Phone (with Country Code)
Description The Standardization definition for Phone (with Country Code) standardizes phone numbers for international use.
  Input Output
Examples 電話 (03)1234-5678 内線123 +81 3 1234 5678 x123, Tel
81312345678 (after 4pm) +81 3 1234 5678, After 4PM
0044 (0)20 12345000 +44 2012345000
Remarks  

 

Postal Code
Description The Standardization definition for Postal Code standardizes postal codes.
  Input Output
Examples 〒1040052 104-0052
〶一〇四の〇〇五二 104-0052
"907-11" 907-11
Remarks This definition inserts a hyphen after the first three digits, converts kanji numerals to Arabic numerals, converts full-width ASCII to half-width, and removes postal code marker characters.
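
The postal code steps in the remarks can be approximated in a few lines of Python. This is a rough sketch rather than the QKB implementation. The function name `standardize_postal_code` is hypothetical, and the sketch assumes that every character other than a digit (postal marks, quotes, separators such as の) can simply be dropped.

```python
import unicodedata

# Positional kanji digits mapped to ASCII digits.
KANJI_DIGITS = str.maketrans("〇一二三四五六七八九", "0123456789")


def standardize_postal_code(raw: str) -> str:
    """Fold full-width characters, convert kanji digits, drop everything
    that is not a digit, then put a hyphen after the first three digits."""
    # NFKC normalization folds full-width ASCII (１０４) to half-width (104).
    text = unicodedata.normalize("NFKC", raw).translate(KANJI_DIGITS)
    digits = "".join(ch for ch in text if ch in "0123456789")
    if len(digits) <= 3:
        return digits
    return f"{digits[:3]}-{digits[3:]}"


print(standardize_postal_code("〒1040052"))          # 104-0052
print(standardize_postal_code("〶一〇四の〇〇五二"))  # 104-0052
print(standardize_postal_code('"907-11"'))           # 907-11
```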

 

Prefecture
Description The Standardization definition for Prefecture standardizes prefecture names.
  Input Output
Examples 愛知 愛知県
あいちけん 愛知県
"アイチ" 愛知県
Remarks This definition adds prefecture identifier keywords when possible. Converts names from kana to kanji. Removes non-logical characters: quotes, blanks, and so on.

Inherited Definitions

In addition to the definitions listed on this page, the Japanese, Japan locale also inherits all definitions for the Japanese language and all Global definitions.