About the Tasks That You Will Perform

Researchers needed to analyze text long before text mining existed. In drug trials, the need was acute enough that coding systems were developed to automatically extract keywords, or synonyms of keywords, that could then be analyzed to understand adverse events. The COSTART coding system was one such attempt. A COSTART term consists of one to three tokens: a symptom, an optional body part, and an optional subpart. One initial task is to find which factors influence whether a reaction becomes serious and how well the COSTART terms capture these factors. One way to do this is to use SAS Text Miner to see how well the COSTART terms predict the seriousness of the adverse event. This chapter explores an example of predictive modeling in SAS Text Miner.
To analyze texts with predictive models, you will perform the following tasks:
  1. Use the COSTRING variable and the Decision Tree node to create a model.
  2. Use the SYMPTOM_TEXT variable and the Decision Tree node to create a model.
  3. Compare the models using the Model Comparison node.
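
To make the COSTART term structure described above concrete, the following Python sketch splits a term into its symptom, optional body part, and optional subpart. It is an illustration only and is not part of SAS Text Miner. It assumes that the tokens are separated by whitespace and appear in that order. The names CostartTerm and parse_costart_term, and the sample terms, are hypothetical.

```python
from typing import NamedTuple, Optional


class CostartTerm(NamedTuple):
    """Hypothetical container for the one to three tokens of a COSTART term."""
    symptom: str
    body_part: Optional[str] = None
    subpart: Optional[str] = None


def parse_costart_term(term: str) -> CostartTerm:
    """Split a term into symptom, optional body part, and optional subpart.

    Assumption: tokens are whitespace-separated and ordered as
    symptom, body part, subpart.
    """
    tokens = term.split()
    if not 1 <= len(tokens) <= 3:
        raise ValueError(f"expected one to three tokens, got {len(tokens)}: {term!r}")
    # Missing trailing tokens fall back to the None defaults.
    return CostartTerm(*tokens)


if __name__ == "__main__":
    # Sample terms are made up for illustration.
    for sample in ("FEVER", "PAIN ABDO", "EDEMA INJECT SITE"):
        print(parse_costart_term(sample))
```

A one-token term yields only a symptom, and the optional fields stay empty. A term with no tokens or with more than three tokens is rejected, because it does not match the structure described above.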