SAS Quality Knowledge Base for Contact Information 26
In the SAS Quality Knowledge Base, the French definitions are shared by all French-language locales. Shared French definitions are described below.
Case Definitions
Extraction Definitions
Gender Analysis Definitions
Identification Analysis Definitions
Match Definitions
Parse Definitions
Pattern Analysis Definitions
Standardization Definitions
Inherited Definitions
Proper (Name) | ||
---|---|---|
Description | The Proper (Name) case definition propercases names of individuals. | |
| | Input | Output |
Examples | MONSIEUR JEAN-JACQUES LANVIN | Monsieur Jean-Jacques Lanvin |
| | mme de la tour | Mme de la Tour |
| | JAN VAN DEN BRINK | Jan van den Brink |
Remarks | Prepositions are lowercased. |
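The casing rule in the table above can be approximated with a short Python sketch. It is not the QKB implementation: the `PARTICLES` list and the `proper_name` helper are assumptions that cover only the particles appearing in these examples.

```python
# Toy approximation of the documented Proper (Name) behaviour, not the QKB logic.
# PARTICLES is an assumed list covering only the particles in the examples above.
PARTICLES = {"de", "la", "van", "den"}


def proper_name(text: str) -> str:
    """Propercase a personal name, keeping known particles in lowercase."""
    words = []
    for word in text.split():
        lowered = word.lower()
        if lowered in PARTICLES:
            words.append(lowered)
        else:
            # Capitalize each part of a hyphenated name (Jean-Jacques).
            words.append("-".join(part.capitalize() for part in lowered.split("-")))
    return " ".join(words)


for raw in ("MONSIEUR JEAN-JACQUES LANVIN", "mme de la tour", "JAN VAN DEN BRINK"):
    print(raw, "->", proper_name(raw))
```

A real definition relies on much larger vocabularies and context rules, so this sketch should only be read as an illustration of the input and output pairs listed above.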
Proper (Organization) | ||
---|---|---|
Description | The Proper (Organization) case definition propercases organization names. | |
| | Input | Output |
Examples | Bnp Paribas | BNP Paribas |
| | La maison de lucy | La Maison de Lucy |
| | SNCF | SNCF |
Remarks | Acronyms are uppercased, while prepositions are lowercased. |
Upper (Legal Form) | ||
---|---|---|
Description | The Upper (Legal Form) case definition uppercases organization legal forms. | |
| | Input | Output |
Examples | cie | CIE |
| | ei | EI |
| | gmbh | GmbH |
| | SPA | SpA |
Remarks | The Upper (Legal Form) case definition is designed for use on the Legal Form token only. Legal forms are generally uppercased, but with exceptions like GmbH and SpA. |
Upper (Organization) | ||
---|---|---|
Description | The Upper (Organization) case definition uppercases Latin characters found in organization names. | |
| | Input | Output |
Examples | Bnp Paribas | BNP PARIBAS |
| | La Maison de Lucy | LA MAISON DE LUCY |
| | sncf | SNCF |
| | radin.com | radin.com |
Remarks | Organization names are uppercased, unless they are web site names. |
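The website exception can be illustrated with a short Python sketch. It is not the QKB implementation: the `WEBSITE` pattern and the `upper_organization` helper are assumptions made here to reproduce the four examples above.

```python
import re

# Assumed heuristic: a single token that ends in a dot plus an alphabetic
# suffix of two or more letters is treated as a website name.
# The real definition may use different rules.
WEBSITE = re.compile(r"^[\w-]+(\.[\w-]+)*\.[A-Za-z]{2,}$")


def upper_organization(text: str) -> str:
    """Uppercase an organization name unless it looks like a website name."""
    stripped = text.strip()
    if WEBSITE.match(stripped):
        return stripped
    return stripped.upper()


for raw in ("Bnp Paribas", "La Maison de Lucy", "sncf", "radin.com"):
    print(raw, "->", upper_organization(raw))
```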
There are no shared French extraction definitions.
Name | ||
---|---|---|
Description | The Name gender analysis definition determines the gender of a name. | |
Possible Outputs | M, F, U | |
| | Input | Output |
Examples | Mme Anne Chaumont | F |
| | Marc Daverat | M |
| | Dr Weissmann | U |
| | Jean-Marie Weber | M |
| | Marie-Jean Weber | F |
Remarks |
Field Name | ||
---|---|---|
Description | The Field Name identification analysis definition identifies database column names. | |
Possible Outputs | NAME, ORGANIZATION, ADDRESS, CITY, STATE/PROVINCE, POSTALCODE, COUNTRY, PHONE, DATE, UNKNOWN, URL, GENDER, MATCHCODE, PERSONAL_ID, ORGANIZATION_ID, GENERIC_ID, COUNTY, MARITAL_STATUS | |
| | Input | Output |
Examples | Company Name | ORGANIZATION |
| | NOM D'ENTREPRISE | ORGANIZATION |
| | Address | ADDRESS |
| | boite aux lettres | ADDRESS |
| | Telephone | PHONE |
| | Numero Telephone Mobile | PHONE |
Remarks | This definition is recommended to determine the type of data stored in a database column based on the name of the column. | |
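As a rough illustration of classifying a column by its name, the Python sketch below maps keywords to three of the possible outputs. The `KEYWORDS` lists and the `identify_field` helper are hypothetical and were chosen only to reproduce the examples in the table; the actual definition is driven by the QKB vocabularies.

```python
import unicodedata

# Hypothetical keyword lists, chosen only to cover the examples above.
KEYWORDS = {
    "ORGANIZATION": {"company", "entreprise"},
    "ADDRESS": {"address", "boite"},
    "PHONE": {"telephone"},
}


def strip_accents(text: str) -> str:
    """Remove combining accents so that 'boîte' and 'boite' compare equal."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def identify_field(column_name: str) -> str:
    """Return an assumed data type label for a database column name."""
    cleaned = strip_accents(column_name).lower().replace("'", " ").replace("_", " ")
    tokens = set(cleaned.split())
    for label, words in KEYWORDS.items():
        if tokens & words:
            return label
    return "UNKNOWN"


for column in ("Company Name", "NOM D'ENTREPRISE", "boite aux lettres", "Numero Telephone Mobile"):
    print(column, "->", identify_field(column))
```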
Name/Organization | ||
---|---|---|
Description | The Name/Organization identification analysis definition determines whether a string represents the name of an individual or an organization. | |
Possible Outputs | NAME, ORGANIZATION, NAME/ORGANIZATION, UNKNOWN | |
| | Input | Output |
Examples | Monsieur Henri Dumont | NAME |
| | Henri Dumont SA | ORGANIZATION |
| | SNCF | ORGANIZATION |
| | Monsieur Rejany | NAME |
| | Monsieur Rejany, SOLVEO | NAME/ORGANIZATION |
| | Rejany | UNKNOWN |
Remarks |
Field Name | ||
---|---|---|
Description | The Field Name match definition generates match codes which can be used to cluster records containing database field names. | |
Max Length of Match Code | 15 characters | |
| | Input | Cluster ID |
Examples | Company Name | 0 |
| | NOM D'ENTREPRISE | 0 |
| | boite aux lettres | 1 |
| | Address | 1 |
| | Telephone | 2 |
| | Numero Telephone Mobile | 2 |
Remarks | This definition should be used to find potential matches between database column names. | |
Name | ||
---|---|---|
Description | The Name match definition generates match codes which can be used to cluster records containing names of individuals. | |
Max Length of Match Code | 26 characters | |
| | Input | Cluster ID |
Examples | Anne Christin Chaumont | 0 |
| | Anne Charlotte Chaumont | 0 |
| | Docteur Anne Christin Chaumont | 0 |
Remarks |
Organization | ||
---|---|---|
Description | The Organization match definition generates match codes which can be used to cluster records containing organization names. | |
Max Length of Match Code | 35 characters | |
| | Input | Cluster ID |
Examples | La banque postale | 0 |
| | Banque postale, la | 0 |
| | La lyonnaise de banque | 1 |
Remarks |
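The relationship between match codes and cluster IDs in the match tables above can be sketched in Python. Real match codes are encoded strings produced by the QKB; `toy_match_code` below is only a stand-in that normalizes and sorts words, and the `ARTICLES` list is an assumption. The clustering step itself is the point of the example: records that yield the same match code receive the same cluster ID.

```python
# Stand-in for a real match code: the QKB produces an encoded string,
# whereas this toy version just normalizes and sorts the words.
ARTICLES = {"la", "le", "les"}  # assumed noise words


def toy_match_code(name: str) -> str:
    """Build a simplified, order-insensitive key for an organization name."""
    words = name.lower().replace(",", " ").split()
    return " ".join(sorted(w for w in words if w not in ARTICLES))


def cluster_ids(records):
    """Give records that share a match code the same cluster ID."""
    seen = {}
    ids = []
    for record in records:
        code = toy_match_code(record)
        if code not in seen:
            seen[code] = len(seen)  # next unused cluster ID
        ids.append(seen[code])
    return ids


names = ["La banque postale", "Banque postale, la", "La lyonnaise de banque"]
print(cluster_ids(names))
```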
Name | |||
---|---|---|---|
Description | The Name parse definition parses names of individuals into a set of tokens. | ||
Output Tokens | Prefix, Given Name, Family Name, Suffix, Title/Additional Info | | |
| | Input | Output Token | Output |
Example 1 | Monsieur Jean Pierre Lacombe | Prefix | Monsieur |
| | | Given Name | Jean Pierre |
| | | Family Name | Lacombe |
| | | Suffix | |
| | | Title/Additional Info | |
| | Input | Output Token | Output |
Example 2 | M J Pierre Lacombe Jr, PDG | Prefix | M |
| | | Given Name | J Pierre |
| | | Family Name | Lacombe |
| | | Suffix | Jr |
| | | Title/Additional Info | PDG |
| | Input | Output Token | Output |
Example 3 | Mlle Floriane de la Tour, Directrice Ressources Humaines | Prefix | Mlle |
| | | Given Name | Floriane |
| | | Family Name | de la Tour |
| | | Suffix | |
| | | Title/Additional Info | Directrice Ressources Humaines |
Remarks |
Name (Global) | |||
---|---|---|---|
Description | The Name (Global) parse definition parses names of individuals into a globally recognized set of tokens. | ||
Output Tokens | Prefix, Given Name, Middle Name, Family Name, Suffix, Title/Additional Info | | |
| | Input | Output Token | Output |
Example 1 | Monsieur Jean Pierre Lacombe | Prefix | Monsieur |
| | | Given Name | Jean Pierre |
| | | Middle Name | |
| | | Family Name | Lacombe |
| | | Suffix | |
| | | Title/Additional Info | |
| | Input | Output Token | Output |
Example 2 | M J Pierre Lacombe Jr, PDG | Prefix | M |
| | | Given Name | J Pierre |
| | | Middle Name | |
| | | Family Name | Lacombe |
| | | Suffix | Jr |
| | | Title/Additional Info | PDG |
| | Input | Output Token | Output |
Example 3 | Mlle Floriane de la Tour, Directrice Ressources Humaines | Prefix | Mlle |
| | | Given Name | Floriane |
| | | Middle Name | |
| | | Family Name | de la Tour |
| | | Suffix | |
| | | Title/Additional Info | Directrice Ressources Humaines |
Remarks | The Middle Name token is not populated in this definition. Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
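The practical benefit of a locale-independent token set can be sketched in Python: results can be written into one fixed set of fields, with tokens that a locale does not populate left empty. The token names come from the Name (Global) definition above; the `to_global_row` helper is hypothetical and is not part of any SAS product.

```python
# Token names come from the Name (Global) definition above.
# The helper itself is hypothetical and only illustrates the fixed schema.
GLOBAL_NAME_TOKENS = (
    "Prefix",
    "Given Name",
    "Middle Name",
    "Family Name",
    "Suffix",
    "Title/Additional Info",
)


def to_global_row(parsed):
    """Return a row holding every Global token, using "" for unpopulated ones."""
    unknown = set(parsed) - set(GLOBAL_NAME_TOKENS)
    if unknown:
        raise KeyError(f"unexpected tokens: {sorted(unknown)}")
    return {token: parsed.get(token, "") for token in GLOBAL_NAME_TOKENS}


# Example 1 from the table: Middle Name is not populated for French.
french_row = to_global_row(
    {"Prefix": "Monsieur", "Given Name": "Jean Pierre", "Family Name": "Lacombe"}
)
print(french_row)
```

Because every row has the same keys in the same order, rows produced for different locales can share one table layout.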
Name (Multiple Name) | |||
---|---|---|---|
Description | The Name (Multiple Name) parse definition parses strings that contain the names of two individuals into a set of tokens. | ||
Output Tokens | Name 1, Name 2 | | |
| | Input | Output Token | Output |
Example 1 | Valérie Brua et Marc Daudin | Name 1 | Valérie Brua |
| | | Name 2 | Marc Daudin |
| | Input | Output Token | Output |
Example 2 | M et Mme Daudin Marc | Name 1 | M Marc Daudin |
| | | Name 2 | Mme Daudin |
| | Input | Output Token | Output |
Example 3 | Valérie & Marc Daudin | Name 1 | Valérie Daudin |
| | | Name 2 | Marc Daudin |
Remarks |
Name/Organization | |||
---|---|---|---|
Description | The Name/Organization parse definition parses names and organizations into a set of tokens. | ||
Output Tokens | Name, Organization | | |
| | Input | Output Token | Output |
Example 1 | Monsieur Jean Pierre Lacombe | Name | Monsieur Jean Pierre Lacombe |
| | | Organization | |
| | Input | Output Token | Output |
Example 2 | SNCF | Name | |
| | | Organization | SNCF |
| | Input | Output Token | Output |
Example 3 | Monsieur Jean Pierre Lacombe, SNCF | Name | Monsieur Jean Pierre Lacombe |
| | | Organization | SNCF |
Remarks |
Organization | |||
---|---|---|---|
Description | The Organization parse definition parses organization names into a set of tokens. | ||
Output Tokens | Name, Legal Form, Site, Additional Info | | |
| | Input | Output Token | Output |
Example 1 | SOLVEO SA, Lyon | Name | SOLVEO |
| | | Legal Form | SA |
| | | Site | Lyon |
| | | Additional Info | |
| | Input | Output Token | Output |
Example 2 | SOLVEO, Lyon, Service Informatique | Name | SOLVEO |
| | | Legal Form | |
| | | Site | Lyon |
| | | Additional Info | Service Informatique |
| | Input | Output Token | Output |
Example 3 | SOLVEO Paris | Name | SOLVEO Paris |
| | | Legal Form | |
| | | Site | |
| | | Additional Info | |
Remarks |
Organization (Global) | |||
---|---|---|---|
Description | The Organization (Global) parse definition parses organization names into a set of tokens. | ||
Output Tokens | Name, Legal Form, Site, Additional Info | | |
| | Input | Output Token | Output |
Example 1 | SOLVEO SA, Lyon | Name | SOLVEO |
| | | Legal Form | SA |
| | | Site | Lyon |
| | | Additional Info | |
| | Input | Output Token | Output |
Example 2 | SOLVEO, Lyon, Service Informatique | Name | SOLVEO |
| | | Legal Form | |
| | | Site | Lyon |
| | | Additional Info | Service Informatique |
| | Input | Output Token | Output |
Example 3 | SOLVEO Paris | Name | SOLVEO Paris |
| | | Legal Form | |
| | | Site | |
| | | Additional Info | |
Remarks | Parse definitions named with the Global keyword use a set of output tokens that is consistent across every locale. Results obtained from these definitions can be stored in the same database fields as the results obtained from definitions of the same name in other locales. |
There are no shared French pattern analysis definitions.
Country | ||
---|---|---|
Description | The Country standardization definition standardizes country names. | |
| | Input | Output |
Examples | etats-unis | États-Unis |
| | GÉORGIE | Géorgie |
Remarks |
Country (ISO 2 Char) | ||
---|---|---|
Description | The Country (ISO 2 Char) standardization definition converts country names to their two-character ISO 3166-1 (alpha-2) codes. | |
| | Input | Output |
Examples | etats-unis | US |
| | GÉORGIE | GE |
Remarks |
Country (ISO 3 Char) | ||
---|---|---|
Description | The Country (ISO 3 Char) standardization definition converts country names to their three-character ISO 3166-1 (alpha-3) codes. | |
| | Input | Output |
Examples | etats-unis | USA |
| | GÉORGIE | GEO |
Remarks |
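The three Country definitions can be pictured as one lookup that is insensitive to case and accents and returns either the standard name or an ISO code. The Python sketch below is an illustration only: the `COUNTRIES` table holds just the two documented examples, and returning unknown input unchanged is an assumption, not documented QKB behavior.

```python
import unicodedata

# Tiny assumed lookup covering only the two documented examples.
# Keys are lowercase and accent-free.
COUNTRIES = {
    "etats-unis": {"name": "États-Unis", "iso2": "US", "iso3": "USA"},
    "georgie": {"name": "Géorgie", "iso2": "GE", "iso3": "GEO"},
}


def normalize_key(text: str) -> str:
    """Lowercase the input and strip combining accents."""
    decomposed = unicodedata.normalize("NFD", text.strip())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


def standardize_country(text: str, form: str = "name") -> str:
    """Return the standard name, ISO 2 code, or ISO 3 code for a country."""
    entry = COUNTRIES.get(normalize_key(text))
    # Assumption: input that is not recognized is returned unchanged.
    return entry[form] if entry else text


for raw in ("etats-unis", "GÉORGIE"):
    print(raw, "->", [standardize_country(raw, form) for form in ("name", "iso2", "iso3")])
```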
Name | ||
---|---|---|
Description | The Name standardization definition standardizes names of individuals. | |
| | Input | Output |
Examples | Docteur Marc Weissmann Junior | Dr Marc Weissmann, Jr |
| | Mademoiselle Jacqueline DE LA SALLE, Directrice de centre | Mlle Jacqueline de la Salle, Directrice de Centre |
Remarks |
Organization | ||
---|---|---|
Description | The Organization standardization definition standardizes organization names. | |
| | Input | Output |
Examples | Secam Entreprise Individuelle, alby | Secam EI, Alby |
| | Kaufman and Broad SA Marseille | Kaufman & Broad SA, Marseille |
Remarks |
Organization (Upper) | ||
---|---|---|
Description | The Organization (Upper) standardization definition standardizes and uppercases organization names. | |
| | Input | Output |
Examples | Secam Entreprise Individuelle, alby | SECAM EI, ALBY |
| | Kaufman and Broad SA Marseille | KAUFMAN & BROAD SA, MARSEILLE |
| | radin.com | radin.com |
Remarks | The Organization (Upper) standardization definition uppercases all organization names except those that are website names. |
In addition to the definitions listed on this page, all French-language locales also inherit all Global definitions.
Doc ID: QKBCI_FR_defs.html