What's New in SAS Information Retrieval Studio 1.3
General Enhancements
New and enhanced features in SAS Information Retrieval Studio include the following:
-
SAS licensing replaces the Teragram
license.
-
The content_categorization document
processor wizard replaces the categorizer, concept_extractor, and
contextual_extractor processors.
-
The add_field document processor
enables you to add a field with a constant value to each input document.
-
The export_to_files document processor
now enables you to mark pre-escaped fields for XML documents. Use
this processor to create nested XML tags.
-
The parse_xml document processor
can now be instantiated multiple times. This feature enables you to
support multiple document schemas. This processor can also copy the
original URL of the compound document into each resulting split document.
-
The export_csv document processor
now supports a non-escaped output mode.
-
Entry point quota control is now
available for the web crawler. This feature enables seed-only crawling.
-
The match_and_copy document processor
is similar to the substitute document processor. Use the match_and_copy
document processor to write the output to a different field from the
input.
-
The default fields ctime, mtime, and atime are included in the Input
fields to exclude field for the content categorization document processor.
Excluding these fields prevents the timestamps from being processed
by SAS Content Categorization Server.
-
The passwords in the web crawler
Credentials pane are now obscured.
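
To make the difference between some of the processors above more concrete, here is a minimal conceptual sketch in Python. It is not the SAS Information Retrieval Studio API: documents are modeled as plain dictionaries, and the function names, parameters, and the regular-expression semantics are assumptions chosen only to illustrate the described behavior (add_field adds a constant value, substitute rewrites the input field, match_and_copy writes to a different field).

```python
import re

# Hypothetical stand-ins for the processors described above.
# These are illustrations only, not the product's real interface.

def add_field(doc, name, value):
    """Return a copy of doc with a constant-valued field added."""
    out = dict(doc)
    out[name] = value
    return out

def substitute(doc, field, pattern, replacement):
    """Assumed behavior: the result overwrites the input field."""
    out = dict(doc)
    out[field] = re.sub(pattern, replacement, out[field])
    return out

def match_and_copy(doc, src, dest, pattern, replacement):
    """Assumed behavior: like substitute, but the result goes to dest."""
    out = dict(doc)
    out[dest] = re.sub(pattern, replacement, out[src])
    return out

doc = {"url": "http://example.com/a", "body": "Order 123 shipped"}

# Every input document receives the same constant value.
doc = add_field(doc, "source", "crawl-1")

# The body field is left untouched; the extracted digits land in order_id.
doc = match_and_copy(doc, "body", "order_id", r".*?(\d+).*", r"\1")

print(doc)
```

In this sketch the original body text survives because match_and_copy writes to a separate field, which is the distinction the release note draws against substitute.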
Copyright © SAS Institute Inc. All rights reserved.