What’s New In SAS LASR Analytic Server
Overview
SAS LASR Analytic Server 2.3 includes the following changes:
- Analytic features for the IMSTAT procedure
- The RECOMMEND procedure and support for recommender applications
- Support for SAS In-Memory Statistics for Hadoop
- Support for in-memory text analytics
- Enhancements to data and server management for the IMSTAT procedure
- Documentation enhancements
Analytic Features for the IMSTAT Procedure
The IMSTAT procedure is enhanced with numerous statements that enable in-memory analytics.
These statements are included in IMSTAT Procedure (Analytics). They are licensed separately from
the data and server management statements.
The GLM, LOGISTIC, and GENMODEL statements are available for performing a variety of modeling
techniques. Each of these statements has the following features:
- You can specify the CODE option so that the server generates and saves SAS scoring code.
- You can assign a role variable to divide the data into training and validation sets. Alternatively, you can request that the server divide the data at random into these sets by specifying the proportion for validation data. See the ROLEVAR=, SEED=, and VALIDATE= options in these statements.
- You can apply group-by filtering by passing the name of a temporary table generated by the GROUPBY statement, and you can define rules for extracting groups from the group-by set. This enables you to restrict model fitting to groups of interest. For example, you can fit models only for groups in which the average of some measure exceeds a particular value, or you can fit models across groups in stages to prevent overwhelming the SAS session.
- You can specify the INFORMATIVE option to account for missing values in the data with an informative missing algorithm. The option adds dummy effects to the model and replaces missing values with the effect mean. The coefficients for the dummy effects estimate the difference between the response predicted by the missing value grouping and the predicted response if the effect is evaluated at its mean. This enables you to use all the data in estimating and scoring a model, without additional imputation steps.
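The random training/validation split and the informative missing idea described in the list above can be illustrated outside of SAS. The following Python sketch is not the server's implementation. The function names `assign_roles` and `informative_missing` are hypothetical, the per-row random draw is only one assumed way to honor a validation proportion, and `None` stands in for a missing value.

```python
import random
from statistics import mean


def assign_roles(n_rows, validate, seed):
    """Assign each row to 'training' or 'validation' at random.

    Assumption: each row is drawn independently, so roughly a fraction
    `validate` of the rows ends up in the validation set. The seed makes
    the assignment repeatable.
    """
    rng = random.Random(seed)
    return ["validation" if rng.random() < validate else "training"
            for _ in range(n_rows)]


def informative_missing(values):
    """Return (imputed, indicator) for a numeric column.

    Missing entries (None) are replaced by the mean of the observed
    entries, and a 0/1 dummy column records where a value was missing.
    A model can then estimate a coefficient for the dummy column.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    effect_mean = mean(observed)
    imputed = [effect_mean if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return imputed, indicator


income = [40.0, None, 60.0, 50.0, None]
imputed, missing_flag = informative_missing(income)
roles = assign_roles(len(income), validate=0.3, seed=12345)
```

In this sketch every row stays in the data: the imputed column carries the mean where a value was missing, and the dummy column lets the model fit a separate adjustment for those rows.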
The RANDOMWOODS statement generates a collection of decision trees. Each tree is built from
a bootstrap sample of the data, and each tree is based on a random
selection of variables. In general, such a collection of trees can be used
for classification and for regression. The current implementation
is for use in classification problems. You can also specify the CODE
option to generate and save SAS scoring code.
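The two sources of randomness described above (a bootstrap sample per tree and a random selection of variables per tree) can be sketched in a few lines of Python. This is an illustration only and does not reflect how the RANDOMWOODS statement is implemented. To stay short, each "tree" is a one-split decision stump, and all function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows, features):
    """Find the (feature, threshold) split with the fewest misclassified rows.

    Each row is (x, label), where x is a tuple of numeric inputs.
    Only the given feature indexes are considered.
    """
    best = None
    for f in features:
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = (Counter(right).most_common(1)[0][0]
                           if right else left_label)
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    return best[1:]


def fit_forest(rows, n_trees, n_features, seed):
    """Fit n_trees stumps, each on its own bootstrap sample and feature subset."""
    rng = random.Random(seed)
    n_inputs = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = bootstrap_sample(rows, rng)
        features = rng.sample(range(n_inputs), n_features)
        forest.append(fit_stump(sample, features))
    return forest


def predict(forest, x):
    """Classify x by majority vote over all stumps."""
    votes = Counter(left if x[f] <= t else right
                    for f, t, left, right in forest)
    return votes.most_common(1)[0][0]


rows = [((1.0, 1.0), "a"), ((2.0, 2.0), "a"), ((3.0, 3.0), "a"),
        ((7.0, 7.0), "b"), ((8.0, 8.0), "b"), ((9.0, 9.0), "b")]
forest = fit_forest(rows, n_trees=25, n_features=1, seed=42)
label = predict(forest, (0.5, 0.5))
```

Voting across many trees is what makes the collection more stable than any single tree: an individual stump fitted to an unlucky bootstrap sample can be wrong without changing the majority.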
The ASSESS statement
is used to assess the quality of one or more statistical models. For
a set of classification models, you can compute model lift, receiver-operating
characteristic (ROC), and concordance statistics. You can also apply
the ASSESS statement to regression-type models.
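As a rough illustration of two of the statistics named above, the following Python sketch computes a concordance statistic and the lift for the top-scored fraction of rows. It uses common textbook definitions, which are assumed here and are not taken from the ASSESS statement documentation. The function names and the sample scores are hypothetical.

```python
def concordance(scores, labels):
    """Fraction of (event, non-event) pairs in which the event scores higher.

    Tied pairs count as one half. For a binary target this equals the
    area under the ROC curve.
    """
    events = [s for s, y in zip(scores, labels) if y == 1]
    nonevents = [s for s, y in zip(scores, labels) if y == 0]
    pairs = len(events) * len(nonevents)
    if pairs == 0:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if e > n else 0.5 if e == n else 0.0
               for e in events for n in nonevents)
    return wins / pairs


def lift(scores, labels, fraction):
    """Event rate among the top-scored fraction divided by the overall rate."""
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("no events in the data")
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    top_rate = sum(y for _, y in ranked[:top_n]) / top_n
    return top_rate / base_rate


scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]
c_stat = concordance(scores, labels)       # 12 of 16 pairs are concordant
top_quarter_lift = lift(scores, labels, 0.25)
```

A concordance of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so these statistics give a way to compare several classification models on the same data.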
RECOMMEND Procedure
The RECOMMEND procedure
is a new procedure that enables you to develop a recommender system.
A common goal for a recommender system is to make personalized recommendations
to individuals who lack the capacity, experience, or resources to
select from a potentially overwhelming list of choices.
The procedure supports
building content-based recommender systems and collaborative filtering
systems.
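To make the collaborative filtering idea concrete, here is a minimal user-based sketch in Python. It is a generic illustration and does not describe the algorithms or syntax of the RECOMMEND procedure. The ratings, the user and item names, and the functions `cosine` and `recommend` are all hypothetical.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}
ratings = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5},
    "cat": {"A": 1, "C": 5, "D": 4},
}


def cosine(u, v):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def recommend(user, ratings, top_n=1):
    """Rank items the user has not rated by similarity-weighted ratings."""
    mine = ratings[user]
    weighted = {}
    weights = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(mine, theirs)
        if sim <= 0:
            continue
        for item, rating in theirs.items():
            if item in mine:
                continue
            weighted[item] = weighted.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {i: weighted[i] / weights[i] for i in weighted}
    # Highest predicted rating first; item name breaks ties.
    return sorted(predicted, key=lambda i: (-predicted[i], i))[:top_n]


suggestions = recommend("bob", ratings, top_n=2)
```

A content-based system would instead compare item attributes with a profile of what the user already liked. The sketch above uses only the ratings of other users, which is the defining trait of collaborative filtering.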
SAS In-Memory Statistics for Hadoop
SAS LASR Analytic Server
and the analytic statements for the IMSTAT procedure provide the core
features that are available with SAS In-Memory Statistics for Hadoop.
SAS/ACCESS Interface for Hadoop is included and enables you to access
your data that is stored in HDFS in a variety of formats.
The bundle includes other products, such as SAS Studio and SAS/STAT.
This document covers the features available in the server and the IMSTAT procedure.
Support for Text Analytics
SAS LASR Analytic Server
can work on unstructured text, such as social media feeds, news articles,
and arbitrary collections of documents. The server uses text analytics
to turn the unstructured text into numbers (counts and weights)
and into terms and topics with relationships, all of which are suitable for visualization
and exploration.
For more information, see the TEXTPARSE Statement and Text Analytics in SAS LASR Analytic Server.
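The step from unstructured text to counts and weights can be pictured with a small Python sketch. It uses plain term counts and a simple TF-IDF weighting, both chosen here as assumptions for illustration. The weighting that the server actually applies is described in the TEXTPARSE documentation, not here. The function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word-like terms."""
    return re.findall(r"[a-z']+", text.lower())


def term_counts(docs):
    """Return one term-frequency table (a Counter) per document."""
    return [Counter(tokenize(d)) for d in docs]


def tf_idf(counts):
    """Weight each count by log(number of docs / docs containing the term)."""
    n_docs = len(counts)
    doc_freq = Counter(term for c in counts for term in c)
    return [{t: n * math.log(n_docs / doc_freq[t]) for t, n in c.items()}
            for c in counts]


docs = [
    "The server parses text",
    "The server weights terms",
    "Topics relate terms",
]
counts = term_counts(docs)
weights = tf_idf(counts)
```

Terms that occur in fewer documents receive larger weights, so in the first document "parses" outweighs "the". Numeric tables of this kind are what make text usable for visualization and exploration.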
Enhancements to Data and Server Management for the IMSTAT Procedure
The statements that
were introduced in previous releases are included in IMSTAT Procedure (Data and Server Management). Some of the enhancements are as follows:
- A BATCHMODE option is added to the procedure. When the option is enabled and an error occurs, the procedure terminates and sets the SYSERR macro variable.
- The SCORE statement is enhanced as follows:
  - You can specify the names of additional in-memory tables that contain information about key-value pairs for DATA step hash objects. For more information, see the HASHDATA option.
  - The DSRETAIN option is added. When the option is specified, scoring code behaves like the DATA step with respect to retention of output symbols.
- The STORE statement is enhanced to support complex expressions. For example, you can build a string for a WHERE clause from the contents of IMSTAT procedure result tables.
- The FETCH statement is enhanced as follows:
  - You can specify formats for the variables to retrieve.
  - You can specify an ORDERBY= option to retrieve sorted rows from the server.
  - You can specify format instructions for ORDERBY= variables, and you can choose whether each variable is sorted by unformatted or formatted values. You can also specify whether the sort order is ascending or descending.
The TABLEINFO statement
is enhanced with the PARTVARS option. This option displays
the names of the partition variables and order-by variables for tables.
Documentation Enhancements
Copyright © SAS Institute Inc. All rights reserved.