Parse Data

The Text Parsing node enables you to parse a document collection to quantify information about the terms that it contains. You can use the Text Parsing node with volumes of textual data such as e-mail messages, news articles, Web pages, research papers, and surveys. For more information about the Text Parsing node, see the SAS Text Miner Help.
Perform the following steps to add a Text Parsing node to the analysis:
  1. Select the Text Mining tab on the node toolbar, and drag a Text Parsing node into the diagram workspace.
  2. Connect the Data Partition node to the Text Parsing node.
    Process flow diagram
  3. Select the Text Parsing node.
    The properties for the Text Parsing node appear in the Properties Panel.
  4. Set the Different Parts of Speech property value to No.
    For the VAERS data, this setting yields a more compact set of terms, because a term that occurs with more than one part of speech is treated as a single term.
  5. Click the ellipsis icon for the Synonyms property.
    The Synonyms dialog box appears.
  6. Click Import.
    The Select a SAS Table dialog box appears.
  7. Select No data set to be specified.
  8. Click OK to exit the Select a SAS Table dialog box.
  9. Click OK to exit the Synonyms dialog box.
  10. Click the ellipsis icon for the Ignore Parts of Speech property.
    The Ignore Parts of Speech dialog box appears.
  11. Select the following items, which represent parts of speech:
    • Aux
    • Conj
    • Det
    • Interj
    • Part
    • Prep
    • Pron
    • Num
    Note: Hold down the CTRL key to select more than one item.
    Any terms with the parts of speech that you select in the Ignore Parts of Speech dialog box are ignored during parsing. The selections indicated here ensure that the analysis ignores low-content words such as prepositions and determiners.
  12. Click OK.
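
To make the effect of these two settings concrete, the following Python sketch counts terms from a small, hand-tagged token list. It is an illustration only and is not how SAS Text Miner parses text: the token list, the tags, and the names TAGGED_TOKENS, IGNORED_POS, and count_terms are all hypothetical. The sketch assumes that each token already carries a part-of-speech label similar to those shown in the Ignore Parts of Speech dialog box.

```python
from collections import Counter

# Hypothetical, hand-tagged tokens (term, part of speech). In a real run the
# parser assigns these tags; they are hard-coded here for illustration.
TAGGED_TOKENS = [
    ("the", "Det"), ("swelling", "Noun"), ("and", "Conj"), ("fever", "Noun"),
    ("began", "Verb"), ("after", "Prep"), ("2", "Num"), ("days", "Noun"),
    ("her", "Pron"), ("arm", "Noun"), ("is", "Aux"), ("still", "Adv"),
    ("swelling", "Verb"),
]

# The parts of speech selected in step 11.
IGNORED_POS = {"Aux", "Conj", "Det", "Interj", "Part", "Prep", "Pron", "Num"}


def count_terms(tagged_tokens, ignored_pos, different_parts_of_speech):
    """Count terms, skipping ignored parts of speech.

    When different_parts_of_speech is True, the same word with two tags is
    counted as two terms; when False, it is counted as one term.
    """
    counts = Counter()
    for term, pos in tagged_tokens:
        if pos in ignored_pos:
            continue  # low-content word: dropped from the term list
        key = (term, pos) if different_parts_of_speech else term
        counts[key] += 1
    return counts


# Analogous to Different Parts of Speech = Yes: "swelling" appears twice,
# once as a noun and once as a verb.
split = count_terms(TAGGED_TOKENS, IGNORED_POS, different_parts_of_speech=True)

# Analogous to Different Parts of Speech = No: both uses of "swelling"
# collapse into one term.
merged = count_terms(TAGGED_TOKENS, IGNORED_POS, different_parts_of_speech=False)

print("distinct terms, parts of speech kept apart:", len(split))
print("distinct terms, parts of speech merged:", len(merged))
```

In this toy data, merging parts of speech produces one fewer distinct term than keeping them apart, and words such as "the" and "after" never reach the term list. The same two effects, at a much larger scale, are what steps 4 and 11 aim for with the VAERS data.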