The %TMFILTER macro enables you to convert files into SAS data sets.
The macro is provided with SAS Text Miner. It is supported on all
operating systems for filtering, but only on Windows for crawling.
The %TMFILTER macro requires a SAS Document Conversion Server that
is installed and running on a Windows machine. See SAS Document Conversion
Server for more information. You can use the macro to perform the
following tasks:
- filter a collection of documents that is saved in any supported file
  format and output a SAS data set that can be used to create a SAS
  Text Miner data source.
- Web crawl and output a SAS data set that can be used to create a SAS
  Text Miner data source. Web crawling retrieves the text of a starting
  Web page, extracts the URL links within that page, and then repeats
  the process within the linked pages recursively. You can restrict a
  crawl to the domain of the starting URL, or you can let a crawl
  process any linked pages that are not in the domain of the starting
  URL. The crawl continues until a specified number of levels of
  drill-down is reached or until all the Web pages that satisfy the
  domain constraint are found. Web crawling is supported only on
  Windows operating systems.
- identify the languages of all documents in a collection.
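The tasks above can be sketched as two macro calls. This is a minimal
illustration, not a definitive reference: the library name, directory
paths, and output data set names are hypothetical, and the parameter
names (DATASET=, DIR=, DESTDIR=, EXT=, LANGUAGE=, URL=, DEPTH=) are
given as they are commonly documented for %TMFILTER. Verify them
against the SAS Text Miner Help for your release before use. Both
calls assume that a SAS Document Conversion Server is installed and
running.

```sas
/* Hypothetical library and paths: replace with your own locations. */
libname tmlib 'C:\tmdata';

/* Task 1 and task 3: filter the documents found in DIR=, write     */
/* plain-text copies to DESTDIR=, and create the data set named in  */
/* DATASET=. LANGUAGE= lists the candidate languages to identify    */
/* for each document. EXT= limits processing to the listed types.   */
%tmfilter(dataset=tmlib.filtered,
          dir=C:\tmdata\docs,
          destdir=C:\tmdata\docs_filtered,
          ext=doc pdf html txt,
          language=english german);

/* Task 2 (Windows only): crawl from a starting URL and follow      */
/* links one level deep. Retrieved pages are stored in DIR= and     */
/* their filtered text in DESTDIR=. The crawl is assumed to stay    */
/* within the domain of the starting URL unless you request         */
/* otherwise (see the NORESTRICT= option in the Help).              */
%tmfilter(dataset=tmlib.crawled,
          url=http://www.sas.com,
          depth=1,
          dir=C:\tmdata\crawl,
          destdir=C:\tmdata\crawl_filtered);
```

Either output data set can then be used to create a SAS Text Miner
data source.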
See the SAS Text Miner Help for more information about the %TMFILTER
macro.