Filter Data

The Text Filter node reduces the number of parsed terms or documents that are analyzed. This lets you eliminate extraneous information so that only the most valuable and relevant information is considered. For example, you can use the Text Filter node to remove unwanted terms and to keep only the documents that discuss a particular issue. The reduced data set can be orders of magnitude smaller than the original collection, which might contain hundreds of thousands of documents and hundreds of thousands of distinct terms. For more information about the Text Filter node, see the SAS Text Miner help.
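The idea of term and document filtering can be illustrated outside SAS with a minimal Python sketch. It is not the Text Filter node's implementation. The function names, the `min_docs` threshold, and the sample documents are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def filter_terms(docs, min_docs=2, stop_terms=frozenset()):
    """Keep only terms that occur in at least `min_docs` documents
    and are not listed in `stop_terms` (both are illustrative choices)."""
    # Count the number of documents in which each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    kept = {t for t, n in doc_freq.items()
            if n >= min_docs and t not in stop_terms}
    filtered = [[t for t in doc if t in kept] for doc in docs]
    return filtered, kept

def filter_documents(docs, required_term):
    """Keep only the documents that mention `required_term`."""
    return [doc for doc in docs if required_term in doc]

# Hypothetical parsed documents, each a list of terms.
docs = [
    ["patient", "fever", "vaccine"],
    ["patient", "rash"],
    ["vaccine", "fever", "the"],
    ["the", "patient", "headache"],
]

filtered_docs, kept_terms = filter_terms(docs, min_docs=2, stop_terms={"the"})
fever_docs = filter_documents(filtered_docs, "fever")
```

Here the rare terms and the stop term are dropped first, and then only the documents that still mention "fever" are retained, which mirrors the two kinds of reduction described above.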
To filter the data:
  1. Select the Text Mining tab on the node toolbar, and drag a Text Filter node into the diagram workspace.
  2. Connect the Text Parsing node to the Text Filter node.
    [Figure: process flow diagram]
  3. Select the Text Filter node.
  4. Set the value of the Term Weight property to Mutual Information.
    With this setting, terms are weighted differently depending on how strongly they correspond to serious reactions.
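To give a sense of what a mutual-information style term weight measures, the following Python sketch scores each term by its strongest pointwise association with any outcome category. It is an assumption-laden illustration, not the formula that SAS Text Miner uses: the function name, the choice to take the maximum over categories, the floor at zero, and the sample data are all hypothetical.

```python
import math

def mutual_information_weights(docs, labels):
    """Weight each term by max over classes of
    log( P(term, class) / (P(term) * P(class)) ), floored at 0.
    Probabilities are estimated from document counts.
    The max-over-classes form and the zero floor are assumptions."""
    n = len(docs)
    class_count = {c: labels.count(c) for c in set(labels)}
    term_count = {}   # number of documents containing the term
    joint_count = {}  # number of documents of a class containing the term
    for doc, label in zip(docs, labels):
        for term in set(doc):
            term_count[term] = term_count.get(term, 0) + 1
            key = (term, label)
            joint_count[key] = joint_count.get(key, 0) + 1

    weights = {}
    for term, tc in term_count.items():
        best = 0.0
        for c, cc in class_count.items():
            j = joint_count.get((term, c), 0)
            if j:  # skip classes where the term never occurs
                pmi = math.log((j / n) / ((tc / n) * (cc / n)))
                best = max(best, pmi)
        weights[term] = best
    return weights

# Hypothetical parsed reports and their outcome labels.
docs = [["rash", "fever"], ["fever"], ["seizure", "fever"], ["seizure"]]
labels = ["mild", "mild", "serious", "serious"]

weights = mutual_information_weights(docs, labels)
```

In this toy data, "seizure" appears only in reports labeled serious, so it receives a higher weight than "fever", which appears under both labels.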