SAS Text Miner is a plug-in for the SAS Enterprise Miner environment.
SAS Enterprise Miner provides a rich set of data mining tools that
facilitate the prediction aspect of text mining. The integration of
SAS Text Miner within SAS Enterprise Miner combines textual data with
traditional data mining variables. Text mining nodes can be embedded
into a SAS Enterprise Miner process flow diagram. SAS Text Miner supports
various sources of textual data: local text files, text as observations
in SAS data sets or external databases, and files on the Web.
SAS Text Miner 12.1
includes the following nodes that you can use in your text mining
analysis:
For more information
about the SAS Text Miner nodes, see the corresponding chapter in this
book, or the SAS Text Miner Help.
Together, the Text Miner
nodes encompass the parsing and exploration aspects of text mining
and the preparation of data for predictive mining and further exploration
when you use other SAS Enterprise Miner nodes. You can analyze structured
text information, and combine the structured output of the Text Miner
nodes with other structured data as desired. The Text Miner nodes
are highly customizable and enable you to choose among a variety of
options. For example, the Text Parsing node enables you to parse
documents for detailed information about the terms, phrases, and other
entities in the collection. The Text Cluster node enables you to
cluster documents into meaningful
groups and to report concepts that you discover in the clusters. Sorting,
searching, filtering (subsetting), and finding similar terms or documents
all enhance the exploration process.
SAS Text Miner also provides a SAS macro called %TMFILTER. This macro
performs a text preprocessing step: it creates SAS data sets from
documents that reside in your file system or on Web pages. These
documents can exist in a number of proprietary formats.
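As a rough illustration of this preprocessing step, a %TMFILTER call
might look like the sketch below. It assumes the DIR=, DESTDIR=, and
DATASET= parameters of the macro. The libref tmlib and every folder path
are hypothetical placeholders, and the available parameters can differ
by release, so check the SAS Text Miner Help before relying on it.

```sas
/* Hypothetical library that will hold the output data set. */
libname tmlib 'C:\tmfilter\data';

/* DIR=     folder that contains the source documents (hypothetical path). */
/* DESTDIR= folder that receives the filtered plain-text copies            */
/*          (hypothetical path).                                           */
/* DATASET= SAS data set to create from the documents.                     */
%tmfilter(
   dir=C:\tmfilter\source,
   destdir=C:\tmfilter\filtered,
   dataset=tmlib.documents
);

/* Inspect the first few observations of the resulting data set. */
proc print data=tmlib.documents(obs=5);
run;
```

The resulting data set can then serve as the input data source for the
text mining nodes in a SAS Enterprise Miner process flow diagram.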
SAS Text Miner is a
flexible tool that can solve a variety of problems. Here are some
examples of tasks that can be accomplished using SAS Text Miner:
- grouping documents by topic into predefined categories
- clustering analysis of research papers in a database
- clustering analysis of survey data
- clustering analysis of customer complaints and comments
- predicting stock market prices from business news announcements
- predicting customer satisfaction from customer comments
- predicting costs, based on call center logs