What's New in SAS Information Retrieval Studio 1.3

General Enhancements

New and enhanced features in SAS Information Retrieval Studio include the following:
  • SAS licensing replaces the Teragram license.
  • The content_categorization document processor wizard replaces the categorizer, concept_extractor, and contextual_extractor processors.
  • The add_field document processor enables you to add a field with a constant value to each input document.
  • The export_to_files document processor now enables you to mark fields as pre-escaped for XML documents. Use this option to create nested XML tags.
  • The parse_xml document processor can now be instantiated multiple times, which enables you to support multiple document schemas. This processor can also copy the original URL of the compound document into each document that results from the split.
  • The export_csv document processor now supports a non-escaped output mode.
  • Entry point quota control is now available for the web crawler. This feature enables seed-only crawling.
  • The match_and_copy document processor is similar to the substitute document processor. Use the match_and_copy document processor to write the output to a different field from the input.
  • The fields ctime, mtime, and atime are now included by default in the Input fields to exclude field for the content categorization document processor. This default prevents these timestamps from being processed by SAS Content Categorization Server.
  • The passwords in the web crawler Credentials pane are now obscured.
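
The difference between the substitute and match_and_copy document processors can be pictured with a short conceptual sketch. This is plain Python, not SAS Information Retrieval Studio configuration: the dictionary model of a document, the function signatures, and the field name host_path are all assumptions made for illustration. The notes above do not say whether match_and_copy writes only the matched text or a substituted string; the sketch assumes a substitution.

```python
import re


def substitute(doc, field, pattern, replacement):
    """Hypothetical in-place variant: the result overwrites the input field."""
    out = dict(doc)  # work on a copy so the caller's document is not mutated
    out[field] = re.sub(pattern, replacement, doc[field])
    return out


def match_and_copy(doc, src, dst, pattern, replacement):
    """Hypothetical copy variant: the result goes to a different field,
    and the source field keeps its original value."""
    out = dict(doc)
    out[dst] = re.sub(pattern, replacement, doc[src])
    return out


# Illustrative document; field names are assumptions.
doc = {"url": "http://example.com/a/b.html", "body": "text"}

in_place = substitute(doc, "url", r"^https?://", "")
copied = match_and_copy(doc, "url", "host_path", r"^https?://", "")
```

In this sketch, in_place loses the original url value, while copied keeps url intact and carries the rewritten value in the separate host_path field.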