SAS Text Miner 4.1 is a plug-in for the SAS Enterprise Miner 6.1
environment. SAS Enterprise Miner provides a rich set of data mining
tools that facilitate the prediction aspect of text mining. The integration
of SAS Text Miner within SAS Enterprise Miner combines textual data
with traditional data mining variables. A Text Miner node can be embedded
into a SAS Enterprise Miner process flow diagram. SAS Text Miner supports
various sources of textual data: local text files, text as observations
in SAS data sets or external databases, and files on the Web. The
Text Miner node encompasses the parsing and exploration aspects of
text mining and prepares data for predictive mining and further exploration
using other SAS Enterprise Miner nodes. The Text Miner node enables
you to analyze structured text information and to combine the structured
output of a Text Miner node with other structured data as desired.
The Text Miner node is highly customizable and enables you to choose among
a variety of parsing options. It is possible to parse documents for
detailed information about the terms, phrases, and other entities
in the collection. You can also cluster documents into meaningful
groups and report concepts that you discover in the clusters. You
can use the Text Miner node in an environment that enables you to
interact with the collection. Sorting, searching, filtering (subsetting),
and finding similar terms or documents all enhance the exploration
process.
The Text Miner node's extensive parsing capabilities include the following:
- automatic recognition of multi-word terms
- normalization of various entities such as dates, currencies, percentages, and years
- extraction of entities such as organizations, products, Social Security numbers, time, titles, and more
- language-specific analysis for English, German, Chinese, French, Spanish, Italian, and Portuguese
SAS Text Miner also provides a SAS macro called %TMFILTER.
This macro performs a text preprocessing step: it creates SAS
data sets from documents that reside in your file system
or on Web pages. These documents can exist in a number of proprietary
formats.
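As a rough illustration of that preprocessing step, the following sketch calls %TMFILTER on a folder of documents. The libref, data set name, and paths are hypothetical placeholders, and the parameter names (DATASET=, DIR=, DESTDIR=) are given as commonly documented for the macro; check them against the %TMFILTER reference for your release before you rely on them.

```sas
/* Hypothetical library location; replace with a folder on your system. */
libname tmlib 'C:\tmdata';

/* DATASET= names the SAS data set to create.                    */
/* DIR=     points to the folder that holds the source documents */
/*          (hypothetical path).                                 */
/* DESTDIR= points to the folder that receives the filtered      */
/*          plain-text copies (hypothetical path).               */
%tmfilter(dataset=tmlib.docs,
          dir=C:\tmdata\source,
          destdir=C:\tmdata\filtered);

/* Inspect the variables in the resulting data set. */
proc contents data=tmlib.docs;
run;
```

The resulting data set can then serve as the input data source for a Text Miner node in a process flow diagram.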
SAS Text Miner is a very flexible tool that can solve a variety of problems.
Here are some examples of tasks that can be accomplished using SAS
Text Miner:
- grouping documents by topic into predefined categories
- clustering analysis of research papers in a database
- clustering analysis of survey data
- clustering analysis of customer complaints and comments
- predicting stock market prices from business news announcements
- predicting customer satisfaction from customer comments
- predicting costs, based on call center logs