Whether you intend to use textual data for descriptive purposes, predictive purposes, or both, the same processing steps take place, as shown in the following table:
| Step | Description | Tool |
|---|---|---|
| | Creates a single SAS data set from your document collection. The SAS data set is used as input for the Text Parsing node, and might contain the actual text or paths to the actual text. | %TMFILTER macro: a SAS macro for extracting text from documents and creating a predefined SAS data set with a text variable |
| | Decomposes textual data and generates a quantitative representation suitable for data mining purposes. | |
| Transformation (dimension reduction) | Transforms the quantitative representation into a compact and informative format. | |
| | Performs classification, prediction, or concept linking of the document collection. Creates clusters, topics, or rules from the data. | SAS Enterprise Miner predictive modeling nodes |
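The parsing and transformation steps in the table can be illustrated outside of SAS with a minimal Python sketch. It is not how SAS Text Miner implements these steps. The sample documents, the `parse` helper, and the document-frequency filter used for dimension reduction are all assumptions chosen to keep the example small.

```python
import re
from collections import Counter

# Hypothetical toy collection; in practice this would be the text variable
# of the data set produced during file preprocessing.
documents = [
    "The pump failed after the seal leaked.",
    "Seal replacement fixed the leaking pump.",
    "Invoice was sent to the customer twice.",
]

def parse(text):
    """Text parsing: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())

# Quantitative representation: one term-frequency counter per document.
term_counts = [Counter(parse(doc)) for doc in documents]

# Document frequency: the number of documents in which each term appears.
doc_freq = Counter(term for counts in term_counts for term in counts)

# Transformation (assumed technique): keep terms that occur in at least two
# documents but not in every document, which shrinks the representation.
vocabulary = sorted(
    term for term, df in doc_freq.items() if 2 <= df < len(documents)
)

# Compact term-by-document matrix: one row per document, one column per term.
matrix = [[counts[term] for term in vocabulary] for counts in term_counts]

print(vocabulary)
print(matrix)
```

The filter drops both rare terms and terms that appear everywhere, so only terms that help distinguish documents remain as columns.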
Note: The Text Miner node is not available from the Text Mining tab in SAS Text Miner 12.1. Its functionality has been replaced by other SAS Text Miner nodes. You can import diagrams from a previous release of SAS Text Miner that had a Text Miner node in the process flow diagram. However, new Text Miner nodes can no longer be created, and property values cannot be changed in imported Text Miner nodes. For more information, see the Converting SAS Text Miner Diagrams from a Previous Version topic in the SAS Text Miner Help.
Finally, the rules derived for clustering or prediction can be used to score a new collection of documents at any time.
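As a rough illustration of scoring, the following Python sketch assigns new documents to clusters found earlier. It is a simplified stand-in, not the SAS Text Miner scoring method. The `cluster_profiles` weights, the cosine similarity measure, and the `score` helper are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical cluster profiles (term weights), assumed to come from an
# earlier analysis of a training collection.
cluster_profiles = {
    "equipment": {"pump": 1.0, "seal": 1.0},
    "billing": {"invoice": 1.0, "customer": 1.0},
}

def vectorize(text):
    """Turn raw text into a term-frequency counter."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(a[term] * b.get(term, 0.0) for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def score(text):
    """Return the name of the cluster profile most similar to the text."""
    vec = vectorize(text)
    return max(cluster_profiles, key=lambda name: cosine(vec, cluster_profiles[name]))

# New documents that were not part of the original collection.
new_documents = [
    "The customer disputed the invoice.",
    "A new seal was fitted to the pump.",
]
assignments = [score(doc) for doc in new_documents]
print(assignments)
```

Because the profiles are stored separately from the documents that produced them, the same `score` function can be applied to any later batch without repeating the analysis.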
You might not need to include all of these steps in your analysis, and you might need to try different combinations of options before you are satisfied with the results.