SAS® Visual Text Analytics

Gain insights from data with a combination of natural language processing, machine learning and linguistic rules. SAS Visual Text Analytics in SAS® Viya® uses context to provide a comprehensive solution to the challenge of identifying and categorizing key textual data. You can build models (based on training documents) that analyze and categorize a set of documents, then customize them to realize the value of your text-based data.

The most recent release is SAS Visual Text Analytics 8.5.

What’s New

New GUI Features (available in Model Studio)

SAS Visual Text Analytics 8.5 offers new features and performance enhancements that provide greater flexibility and control when building models. It optimizes pipeline processing and run time for faster results.

New Features:

  • Use stratified sampling when automatically generating categories for large data sets to reduce pipeline run time.
  • Filter matches in the rule editor of the Edit Concept and Sandbox tabs by using the _SELF_ keyword in conjunction with REMOVE_ITEM and NO_BREAK rules.
  • Generate output data from a Concepts node and export it to SAS Visual Analytics for further analysis and visualization.
  • Use a report template in SAS Visual Analytics to quickly create reports from Concepts and Categories output data.
  • The autocomplete feature of the code editor in the interactive windows for the Concepts and Categories nodes is now case-sensitive, which prevents accidental automatic completion of a keyword as an operator.
  • Add a pipeline template to a new or existing project in The Exchange.
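To illustrate why stratified sampling (the first item above) can cut run time without losing small groups, here is a minimal Python sketch. It is not the product's implementation: the `stratified_sample` function, the `label_of` callback, and the toy documents are all assumptions made for illustration. It draws the same fraction from every stratum, so rare groups stay represented in the smaller sample.

```python
import random
from collections import defaultdict

def stratified_sample(docs, label_of, fraction, seed=0):
    """Draw about `fraction` of the documents from each stratum.

    Hypothetical helper for illustration only; at least one document
    is kept per stratum so that rare strata are not lost.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for doc in docs:
        strata[label_of(doc)].append(doc)

    sample = []
    for label in sorted(strata):
        members = strata[label]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Toy data: 80 "billing" documents and 20 "shipping" documents.
docs = [("billing", i) for i in range(80)] + [("shipping", i) for i in range(20)]
sample = stratified_sample(docs, label_of=lambda d: d[0], fraction=0.1)
```

With a 10% fraction, the sample keeps the 80/20 proportion of the full set while processing a tenth of the documents.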

Performance Enhancements:

  • Improved processing of Concepts and Categories nodes in a pipeline when no concepts or categories exist in the taxonomy.
  • Improved automatic rule generation for concepts. Generated rules that duplicate an existing user-defined rule for the concept, or that do not add significant value to the concept, are removed.
  • Improved mapping of part-of-speech tags, together with better definitions of predefined concepts, produces better matches for predefined concepts.
  • Improved text processing.

Improved Capabilities for Morphological Expansion in the Korean Language

Terms used in CONCEPT, C_CONCEPT, CONCEPT_RULE, SEQUENCE, PREDICATE_RULE, REMOVE_ITEM, and NO_BREAK rules are now automatically expanded to include inflectional forms. When these rules are scored with the Concepts node, the inflectional forms are matched in addition to the term in the rule. In other words, if the stem of a word (for example, "하다") is in a CONCEPT rule, inflectional forms in the dictionary (such as "했다", "하면서", or "하시면") in a document are matched.

If you do not want the word to expand, you can preserve the previous behavior in the Korean language by using a CLASSIFIER rule and then feeding the matches into any of the other rule types. Note that you cannot use morphological expansion symbols in rules for the Korean language. Category rules can refer to concept rules to take advantage of the expansion.

Improvements to Extraction of Predefined Concepts in the Danish Language

The ability to extract predefined concepts (including terms that denote a place, organization, person, or date) has been improved in the Danish language in SAS Visual Text Analytics 8.5. These improvements also include the ability to identify time expressions and percent values in documents, as well as improved accuracy in identifying part-of-speech tags.

Improved Coverage of Predefined Concepts in the Spanish Language

In SAS Visual Text Analytics 8.5, the coverage of the predefined concepts nlpTime, nlpDate, nlpMoney, and nlpCurrency has been extended for the Spanish language. In addition, fully elaborated combined date-and-time spans such as "21 agosto a las 16:00" are now recognized in their entirety as nlpTime.

New Programming Features (available from a SAS Session)

New Features:

  • the ability to use the Sampling and Partitioning action set
  • a redesigned rule generation action (ruleGen)
  • new action examples
  • new action parameters
  • the ability to search for special missing values
  • improved concept scoring in the Korean language
  • improvements to extraction of predefined concepts in the Danish language
  • improved coverage of predefined concepts in the Spanish language
  • performance enhancements

The Sampling and Partitioning Action Set

The SAS Visual Text Analytics 8.5 license includes the ability to use the Sampling and Partitioning action set.

Redesigned Rule Generation Action

The ruleGen action has been redesigned in SAS Visual Text Analytics 8.5 to generate LITI concept and fact rules that might be useful in your analysis.

New Action Examples

The SAS Visual Text Analytics 8.5: Programming Guide includes new examples to illustrate how you can work with LITI concept and fact rules using the ruleGen action:

  • Generate Concept Rules Using Output from the applyConcept Action as Input to the ruleGen Action
  • Generate Fact Rules Using Output from the applyConcept Action as Input to the ruleGen Action
  • Remove Concept Rules with Low Frequency
  • Select the Top N Number of Concept Rules Based on Their Frequency
  • Use an Exclude Table to Filter Concept Rules
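The last three examples above describe post-processing steps on generated rules: dropping low-frequency rules, keeping the top N by frequency, and filtering with an exclude table. The following Python sketch shows that logic on plain lists. It is only an illustration and does not call the ruleGen action; the `filter_rules` function, its parameters, and the sample rule strings and frequencies are all hypothetical.

```python
def filter_rules(rules, min_freq=1, top_n=None, exclude=()):
    """Filter (rule, frequency) pairs.

    Hypothetical helper for illustration only:
      - drop rules whose frequency is below `min_freq`
      - drop rules listed in `exclude` (stands in for an exclude table)
      - optionally keep only the `top_n` most frequent rules
    """
    excluded = set(exclude)
    kept = [(rule, freq) for rule, freq in rules
            if freq >= min_freq and rule not in excluded]
    # Most frequent rules first; ties keep their input order.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept if top_n is None else kept[:top_n]

# Made-up rules and frequencies, not output from any SAS action.
rules = [
    ("CLASSIFIER:refund", 42),
    ("CLASSIFIER:late delivery", 17),
    ("CLASSIFIER:the", 95),
    ("CLASSIFIER:gizmo", 1),
]
selected = filter_rules(rules, min_freq=5, top_n=2,
                        exclude=["CLASSIFIER:the"])
```

Here the excluded rule and the rule seen only once are dropped before the top two remaining rules are selected.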

The SAS Visual Text Analytics 8.5: Programming Guide includes new examples to illustrate how the tmMine and tpParse actions can reference a concepts model that is compiled in the compileConcept action:

  • Referencing a Concepts Model in the tmMine Action
  • Referencing a Concepts Model in the tpParse Action

Finally, the SAS Visual Text Analytics 8.5: Programming Guide includes a new example, Use Complex Part of Speech Tags with the tpAccumulate Action to Perform Distributed Accumulation. It illustrates how you can use the outComplexTag parameter in the tpParse action and the complexTag parameter in the tpAccumulate action to accumulate parsing results and generate parent, term, and child tables for distributed accumulation.

New Action Parameters

SAS Visual Text Analytics 8.5 includes the following new action parameters in the Search action set:

  • You can now use the casOut parameter in the searchAggregate action to send results to a CAS table.
  • You can now use the casOut parameter in the valueCount action to send results to a CAS table.
  • You can now use the trim parameter in the valueCount action to specify whether to trim the padding spaces of a JavaScript Object Notation (JSON) report.

SAS Visual Text Analytics 8.5 includes the new action parameter exclude in the ruleGen action in the Text Analytics Rule Development action set. The exclude parameter specifies an input table that contains the concept rules that you want to exclude from your analysis.

SAS Visual Text Analytics 8.5 includes the new option outputTableAnalysisLevel for the build parameter of the exportTextModel action in the Text Analytics Rule Development action set. The outputTableAnalysisLevel option enables you to specify which output tables are generated when you export a sentiment analytic store (astore) model. A value of ALL generates all output tables, and a value of DOCUMENT generates only a document-level sentiment output table. If you want to score a sentiment astore model by using SAS Micro Analytic Service (MAS) or the SAS Embedded Process (EP), use the DOCUMENT value because MAS and EP support only one output table.

Search for Special Missing Values

You can now use the appendIndex and searchIndex actions in the Search action set to search for special missing values.

Improved Capabilities for Morphological Expansion in the Korean Language

Terms used in CONCEPT, C_CONCEPT, CONCEPT_RULE, SEQUENCE, PREDICATE_RULE, REMOVE_ITEM, and NO_BREAK rules are now automatically expanded to include inflectional forms. When these rules are scored with the applyConcept action, the inflectional forms are matched in addition to the term in the rule. In other words, if the stem of a word (for example, "하다") is in a CONCEPT rule, inflectional forms in the dictionary (such as "했다", "하면서", or "하시면") in a document are matched.

If you do not want the word to expand, you can preserve the previous behavior in the Korean language by using a CLASSIFIER rule and then feeding the matches into any of the other rule types. Note that you cannot use morphological expansion symbols in rules for the Korean language. Category rules can refer to concept rules to take advantage of the expansion.
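The difference between expanded and literal matching can be pictured with a small Python sketch. It is a conceptual illustration only, not how the applyConcept action works internally: the `INFLECTIONS` table, both functions, and the whitespace-split tokens are assumptions, and the `expand=False` path merely stands in for the literal matching that a CLASSIFIER rule preserves.

```python
# Hypothetical mini-dictionary: stem -> inflectional forms.
# The forms are the ones quoted in the text above.
INFLECTIONS = {"하다": ["했다", "하면서", "하시면"]}

def expand_term(term, expand=True):
    """Return the term plus its dictionary inflections when expanding."""
    forms = [term]
    if expand:
        forms += INFLECTIONS.get(term, [])
    return forms

def find_matches(term, tokens, expand=True):
    """Return the tokens that match the term or one of its expanded forms."""
    forms = set(expand_term(term, expand))
    return [tok for tok in tokens if tok in forms]

# Simplified, whitespace-split tokens from an imagined document.
tokens = ["공부를", "했다", "운동을", "하면서", "음악을"]

expanded = find_matches("하다", tokens)               # stem plus inflections
literal = find_matches("하다", tokens, expand=False)  # exact term only
```

With expansion, the stem matches both inflected tokens. Without it, nothing matches because the stem itself does not occur in the tokens.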

Improvements to Extraction of Predefined Concepts in the Danish Language

The ability to extract predefined concepts (including terms that denote a place, organization, person, or date) has been improved in the Danish language in SAS Visual Text Analytics 8.5. These improvements also include the ability to identify time expressions and percent values in documents, as well as improved accuracy in identifying part-of-speech tags.

Improved Coverage of Predefined Concepts in the Spanish Language

In SAS Visual Text Analytics 8.5, the coverage of the predefined concepts nlpTime, nlpDate, nlpMoney, and nlpCurrency has been extended for the Spanish language. In addition, fully elaborated combined date-and-time spans such as "21 agosto a las 16:00" are now recognized in their entirety as nlpTime.
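As a rough picture of what recognizing a span "in its entirety" means, the Python sketch below matches a combined date-and-time expression as one span rather than as separate date and time pieces. This is not how the predefined concepts are implemented. The regular expression, the `MONTHS` list, and the sample sentence are assumptions made for illustration.

```python
import re

# Spanish month names (illustrative list for this sketch).
MONTHS = ("enero|febrero|marzo|abril|mayo|junio|julio|agosto|"
          "septiembre|octubre|noviembre|diciembre")

# One pattern that covers day, month, and clock time as a single span.
SPAN = re.compile(
    rf"\b(?P<day>\d{{1,2}})\s+(?:de\s+)?(?P<month>{MONTHS})"
    rf"\s+a\s+las\s+(?P<time>\d{{1,2}}:\d{{2}})\b",
    re.IGNORECASE,
)

# Made-up sentence containing the example span from the text above.
text = "La reunión es el 21 agosto a las 16:00 en Madrid."
match = SPAN.search(text)
span_text = match.group(0) if match else None
```

The whole expression "21 agosto a las 16:00" comes back as one match, with the day, month, and time available as named groups.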

Performance Enhancements

SAS Visual Text Analytics 8.5 includes improved text processing.

Get Started

Ready to organize and extract useful information from large volumes of textual data? These resources are a good place to start.

Watch video

Learn about the capabilities and functionality of SAS Visual Text Analytics.

Learn the basics

Learn how to get started with the tutorials.

Stay connected

Be a part of the community for SAS Visual Text Analytics. We offer a variety of ways for you to interact with users and experts.

Tutorials

Browse our library of free SAS Visual Text Analytics tutorials to learn something new or sharpen your skills.

More How-To Videos

Documentation

Find user's guides and other technical documentation for SAS Visual Text Analytics.



Previous Versions

SAS Visual Text Analytics Blogs & Communities

Connect with other SAS users by joining a users group or attending an upcoming event.
