DataFlux Data Management Studio 2.5: User Guide

Customize - Vocabulary Editor - Building Vocabularies

You can use the Customize Vocabulary Editor to build Vocabularies. We recommend you build one Vocabulary for each parse definition.

  1. Open the Vocabulary Editor. On the Customize dialog, choose Tools > Vocabulary Editor.
  2. Create a New Vocabulary. On the Vocabulary Editor dialog, choose File > New, and then specify a locale in the Select Locale(s) dialog.
  3. Import Categories From a Grammar. You import categories and likelihoods so that you can associate those values with the words that you add to your vocabulary. Each word must be associated with one or more categories from the Grammar.

    To import categories, choose Options > Categories. The Categories dialog appears. Click Import to display the Select Grammar dialog. Select the Grammar that you want to associate with your new Vocabulary, and then click OK. In the Categories dialog, the Grammar's standard category abbreviations and descriptions appear. (Derived categories are not imported.) Use the Delete button to delete any unwanted categories, and then click Close.
  4. Import Words Into the Vocabulary. After you import categories, you import the words that fit those categories. Choose File > Import to display the Import Words dialog. Use the Type field to import words from a text file, a vocabulary, or a scheme. Click Select QKB to Import From to import a vocabulary or scheme from a Quality Knowledge Base other than your current QKB.

    To add categories to imported words in the Import Words dialog, click Add under the heading Add these categories. For each category that you select, you can select Overwrite likelihood. Choosing this option adds your specified likelihood in place of a different value that may already exist for that word and category in your new vocabulary.

    Note: The remainder of this step applies only to the import of words from Vocabularies, when you select Vocabulary for the Type field.

    When you import words from a Vocabulary, you can also filter the words that are imported. Click Filter Word List to display the Import Filter dialog. Use the Import Filter dialog to import only those words that have been assigned all of your selected categories.

    In the Import Words dialog, to import categories as well as words from a vocabulary file, select the check box Use categories from imported vocabulary words. If an imported word already exists in your new Vocabulary, then any new categories are added to the existing word.

    If an imported word already exists, and if both words share categories, and if likelihood values differ, you need to decide how to resolve the likelihood conflict. In the Import Words dialog, select a button under the heading In case of likelihood conflict during merge. You can choose to import all likelihood values (overwrite local likelihood), keep your existing likelihood values (refuse imported likelihood), or receive a prompt to decide each conflict individually.
  5. Build the Vocabulary. After you specify how you want to import words and categories from a particular file, click Import in the Import Words dialog. The new words appear in the Vocabulary Editor dialog. To add more words from another file, select another file in the Import Words dialog. You can also add words individually by choosing Edit > Add Word.

    When the import is complete, click Close to return to the Vocabulary Editor dialog.
  6. Modify Categories in the Vocabulary. Now that you are looking at the imported words and categories in your new Vocabulary, you can add or delete individual categories, or change likelihood values.

    Here is an example of when you might want to update a likelihood value. If your vocabulary contains the name Scott, then that word might have the categories Family Name Word (FNW) and Given Name Word (GNW). You might determine that Scott is more likely to be a Given Name Word than a Family Name Word. You could then increase the likelihood value of the Given Name category for the word Scott.

    To change a likelihood value, select the word and click Edit.

    Note: When you add a category to one or more selected words, the Overwrite Category dialog appears if that category is already assigned to one or more of those words in your Vocabulary. To resolve likelihood conflicts, click Yes to accept the change in likelihood for the current category. Click Yes to All to accept all remaining likelihood changes. Click No to refuse all overwrites and keep your existing likelihood values. Click Cancel to not add the category.

    Although there may be some adjustments that you want to make to the likelihoods at this point, later testing with the Parse Test Tool will probably reveal other necessary adjustments to give the desired result.
  7. Save the Vocabulary. Now that your Vocabulary is built, you need to save it. Select File > Save. If this is a newly built Vocabulary, the Vocabulary Editor will prompt you for a name.
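
The import rules described in step 4 (filtering by category, adding new categories to existing words, and resolving likelihood conflicts) can be modeled outside the product to make the three conflict options easier to compare. The following Python sketch is illustrative only. It is not a DataFlux API: the names filter_words, merge_words, and ConflictPolicy, the dictionary layout, and the integer likelihood values are all assumptions made for this example. It also assumes that the check box Use categories from imported vocabulary words is selected.

```python
from enum import Enum
from typing import Callable, Dict, Optional, Set

# Hypothetical model, not a DataFlux structure:
# word -> {category abbreviation -> likelihood}.
# Integer likelihoods are placeholders chosen for this sketch.
Vocabulary = Dict[str, Dict[str, int]]

# Hypothetical stand-in for the per-conflict prompt. It receives the word,
# the category, the local likelihood, and the imported likelihood, and
# returns True to accept the imported value.
Prompt = Callable[[str, str, int, int], bool]


class ConflictPolicy(Enum):
    OVERWRITE_LOCAL = "overwrite local likelihood"
    KEEP_LOCAL = "refuse imported likelihood"
    PROMPT = "prompt for each conflict"


def filter_words(source: Vocabulary, required: Set[str]) -> Vocabulary:
    """Keep only the words that have been assigned all required categories."""
    return {
        word: dict(cats)
        for word, cats in source.items()
        if required.issubset(cats)
    }


def merge_words(
    target: Vocabulary,
    imported: Vocabulary,
    policy: ConflictPolicy,
    prompt: Optional[Prompt] = None,
) -> Vocabulary:
    """Merge imported words into target in place and return target."""
    for word, cats in imported.items():
        local = target.setdefault(word, {})
        for category, likelihood in cats.items():
            if category not in local:
                # New category for this word: always added.
                local[category] = likelihood
                continue
            if local[category] == likelihood:
                continue  # Same value on both sides: no conflict.
            if policy is ConflictPolicy.OVERWRITE_LOCAL:
                accept = True
            elif policy is ConflictPolicy.KEEP_LOCAL:
                accept = False
            else:
                if prompt is None:
                    raise ValueError("PROMPT policy requires a prompt callback")
                accept = prompt(word, category, local[category], likelihood)
            if accept:
                local[category] = likelihood
    return target


if __name__ == "__main__":
    existing = {"SCOTT": {"FNW": 3}}
    incoming = {"SCOTT": {"FNW": 2, "GNW": 4}, "SMITH": {"FNW": 5}, "ANN": {"GNW": 5}}
    # Import only the words that carry the FNW category, keeping local values.
    selected = filter_words(incoming, {"FNW"})
    print(merge_words(existing, selected, ConflictPolicy.KEEP_LOCAL))
```

In this model, a word that already exists always gains any categories it did not have before, and the selected policy matters only when the same word and category carry different likelihood values on both sides.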

Documentation Feedback: yourturn@sas.com
Note: Always include the Doc ID when providing documentation feedback.

Doc ID: dfU_Cstm_Vocab_14001.html