Using the Text Topic Node

Note: This example assumes that you completed the example in Using The Text Filter Node, and it builds on the process flow diagram created there.
  1. Select the Text Mining tab in the Enterprise Miner Toolbar, and drag a Text Topic node into the diagram workspace.
  2. Connect the Text Filter node to the Text Topic node.
    Process Flow Diagram
  3. Before running the Text Topic node, create a topic that identifies abstracts that discuss dynamic Web sites. To do this, click the ellipsis button next to the User Topics property of the Text Topic node. This opens the User Topics window, where you can create your own topics.
    In the User Topics window, there are four columns, labeled _Role_, _Term_, _Weight_, and _Topic_. For this example, you can leave the _Role_ column empty. To add a row, click the Add button at the top of the window. To delete a term, select that row and click the Delete button at the top of the window.
    To create the topic about dynamic Web sites, enter the three terms (dynamic, web page, and web site), with their corresponding weights and the topic name Internet, as shown in the image below. The weight of a term is relative and can be any value between 0 and 1. Give more important terms a higher weight than less important terms.
    User Define Topic Window
    After you have entered the three terms shown above, click OK to save your changes. You are now ready to run the Text Topic node.
  4. Right-click the Text Topic node and select Run to run the Text Topic node with all other settings at their default values. Click Yes in the Confirmation dialog box. When the node finishes running, select Results in the Run Status dialog box.
  5. In the Results window, expand the Number of Terms by Topic chart. The default settings created 25 multi-term topics in addition to the one you defined above. Close the Results window.
  6. Select the Text Topic node, and then click the ellipsis button for the Topic Viewer property to open the Interactive Topic Viewer window. The topic Internet, which you created, should appear at the top of the Topics list. Notice that the Category of this topic is User, while the Category of the remaining topics is Multiple.
    You can view all of the documents in your topic by right-clicking anywhere in the first row and selecting Select Current Topic, if it is not already selected. The middle pane shows you that only the terms dynamic, web page, and web site are in this topic. It also shows how many documents each term is in and how frequently each term appeared. The bottom pane contains the observations from the Abstract data set that belong to your topic. You can read each of these by right-clicking anywhere in the bottom pane and selecting Toggle Show Full Text from the menu. Close the Interactive Topic Viewer.
  7. One problem that you might have noticed is that many of the topics appear to be closely related because they have many terms in common. This suggests that some topics can be merged. Reopen the Interactive Topic Viewer through the Topic Viewer property, as in step 6. In the Topics list of the Interactive Topic Viewer, there are four topics (topics 2, 3, 5, and 10) that contain both user and application as descriptive terms. These topics can be merged to form one user-created topic.
    You merge topics by giving them the same name. Double-click the first topic that you want to rename to make it the active topic. Rename the topic by deleting all of the text in the Topic field and replacing it with User Applications. Repeat this process for the other three topics that contain both user and application. When you rename a topic, the Category of the topic changes to User. Close the Interactive Topic Viewer and save the changes you made.
  8. Right-click the Text Topic node and select Run to rerun the Text Topic node. Click Yes in the Confirmation dialog box and click OK in the Run Status dialog box when the node is finished running.
  9. Click the ellipsis button next to the Topic Viewer property to see the new topics that have been created. There are now two user-defined topics, Internet and User Applications. However, the Text Topic node still created 25 topics in addition to these. If you are not satisfied with these topics, you can continue to merge topics until they are distinct enough for your needs.
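
The two ideas used in steps 3 and 7 (a user topic as a small table of weighted terms, and merging topics by giving them the same name) can be pictured with a short Python sketch. This is only an illustration of the concepts, not how the Text Topic node works internally. The class and function names are hypothetical, the weights are invented for the example (the tutorial shows the real ones only in an image), and the automatic topics are made up apart from the terms user and application.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UserTopicRow:
    """One row of the User Topics window: Role, Term, Weight, Topic."""
    term: str
    weight: float
    topic: str
    role: str = ""  # left empty in this example

    def __post_init__(self):
        # The tutorial states that a weight can be any value between 0 and 1.
        if not 0 <= self.weight <= 1:
            raise ValueError(f"weight for {self.term!r} must be between 0 and 1")


def group_by_topic(rows):
    """Collect rows into {topic name: {term: weight}}."""
    topics = defaultdict(dict)
    for row in rows:
        topics[row.topic][row.term] = row.weight
    return dict(topics)


def merge_by_rename(topic_terms, labels_to_rename, new_name):
    """Give several topics the same name and pool their descriptive terms."""
    merged = {}
    for label, terms in topic_terms.items():
        target = new_name if label in labels_to_rename else label
        merged.setdefault(target, set()).update(terms)
    return merged


# Terms and topic name come from the tutorial; the weights are hypothetical.
internet_rows = [
    UserTopicRow("dynamic", 0.9, "Internet"),
    UserTopicRow("web page", 0.7, "Internet"),
    UserTopicRow("web site", 0.7, "Internet"),
]
user_topics = group_by_topic(internet_rows)

# Hypothetical automatic topics; only "user" and "application" are from the tutorial.
automatic = {
    "topic 2": {"user", "application", "interface"},
    "topic 3": {"user", "application", "tool"},
    "topic 4": {"network", "protocol"},
}
merged = merge_by_rename(automatic, {"topic 2", "topic 3"}, "User Applications")
```

In this sketch, the renamed topics end up under the single name User Applications with their terms pooled, while topics that were not renamed are left unchanged. A weight outside the range 0 to 1 is rejected when the row is created.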