What's New

What's New in SAS Enterprise Miner 6.2


Overview

SAS Enterprise Miner 6.2 delivers major new features targeted at specific business applications and complex information technology deployments. Rapid Predictive Modeling is a new feature for general business users who need to develop reliable models for predicting customer response and retention. The Interactive Decision Tree is enhanced to display more information about the nodes and leaves and to show more plots based on validation data. The credit scoring functions have been improved to provide more customization of bins and more control over scorecard points.

SAS Enterprise Miner 6.2 provides enhanced support for working with databases, specifically Teradata v.13. New Enterprise Miner in-database functions can reduce data movement between the database system and the SAS system, improving overall solution efficiency.

SAS Enterprise Miner 6.2 also enhances several general-purpose functions, both in response to user feedback and in support of the Rapid Predictive Modeling and In-Database projects.

For more information, see What's New in SAS Enterprise Miner 6.1 and What's New in SAS Enterprise Miner 6.1 Maintenance Release.


Extended Support for Teradata 13

Extended Support for Teradata 13 is targeted at users who work with large data sources that are stored in Teradata warehouses. SAS Enterprise Miner provides in-database processing during data access, summary, and sampling functions, which reduces the amount of data movement from the database system to the SAS System. Reducing data movement can improve speed, efficiency, or resource allocation.

In-database computation is provided through a combination of SAS-generated SQL queries and SAS-provided embedded functions that were written for the database. These features take advantage of the database's parallel computing architecture.

The following functions in SAS Enterprise Miner 6.2 take advantage of in-database processing:

Computing exploratory distribution statistics and creating training data samples in-database keep data movement to a minimum. Model building then takes place in the highly optimized SAS system. Model scoring can be performed on the full population in-database, which completes an efficient in-database modeling cycle.


SAS Rapid Predictive Modeler

SAS Rapid Predictive Modeler is a new feature for users who need to rapidly create prediction and classification models for common business problems. The modeling process is highly automated and the results may be integrated into SAS Enterprise Miner for scoring, analysis, and modification. SAS Rapid Predictive Modeler must be executed from either SAS Enterprise Guide or the SAS Add-In for Microsoft Office.

SAS Rapid Predictive Modeler is not installed with SAS Enterprise Miner 6.2. After you install Enterprise Miner 6.2, you can re-run the SAS Deployment Wizard if you want to install the SAS Rapid Predictive Modeler add-on.


Interactive Decision Tree Functionality

The Interactive Decision Tree now displays validation data statistics and multiple target statistics concurrently. The user interface has been revised to display more information for selected decision tree nodes. The enhanced information includes:


SAS Credit Scoring Node Changes

SAS Enterprise Miner 6.2 provides enhancements to the following credit scoring nodes:

Interactive Grouping: The Interactive Grouping node for SAS Enterprise Miner 6.2 provides the following enhancements:

  • The Interactive Grouping node provides users with the ability to sort the Variables table while in the interactive mode. Users may now sort the table by values in any column.
  • The combo box used to select variables in the Variables tab during interactive training no longer exists. Variable selection is now performed by highlighting rows in the variable table itself, and then clicking a Select button displayed on the Variables tab.
  • Additional information, such as variable labels and pre-defined grouping flags, is added to the interactive training Variables tab. The pre-defined grouping column indicates grouping definitions that were created from frozen or imported groupings.
  • The Fine Detail plot for Interactive Training has been enhanced to display the Weights of Evidence (WOE) distribution overlaid on the Event Rate plot.
  • The ability to group class input variables across different training data sets has been made more robust. When new training data introduces new class variable values that were not seen in previous training data, the values are grouped in a new _UNKNOWN_ category for nominal input variables, instead of grouping them with _MISSING_ data values. The node also retains grouping definitions for variable values that are no longer found in the training data set.
  • The ability to interactively group input interval variables with extreme values has been made more robust. Users can create cutoff values that are outside of the range of variable values in the training data.
  • A new property called Apply Saved WOE Value lets users choose how to handle WOE values in the presence of frozen or imported grouping definitions. Settings for the new property permit users to use only the calculated WOEs for the imported or frozen grouping definition, or to use only the WOE values that were manually overwritten when the frozen or imported grouping definitions were created, or to use all WOE values that were created when the frozen or imported grouping definitions were created.
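
As an illustration of two ideas in the list above, the following Python sketch assigns unseen nominal levels to an _UNKNOWN_ group and computes Weights of Evidence per group. It is not the Interactive Grouping node's implementation: the function names, the SAVED_GROUPS mapping, the sample rows, and the convention WOE = ln(share of non-events / share of events) are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical grouping definition for a nominal input, as it might be
# saved from an earlier training run (level -> group label).
SAVED_GROUPS = {"OWN": "G1", "RENT": "G2", "OTHER": "G2"}


def assign_group(value, saved_groups):
    """Map a raw class value to a group label."""
    if value is None:
        return "_MISSING_"
    # Levels not seen in earlier training data get their own category
    # instead of being pooled with missing values.
    return saved_groups.get(value, "_UNKNOWN_")


def weights_of_evidence(rows, saved_groups):
    """Return {group: WOE}, with WOE = ln(share of non-events / share of events).

    rows is an iterable of (value, target) pairs, where target 1 = event.
    Assumes every group contains at least one event and one non-event.
    """
    events, non_events = Counter(), Counter()
    for value, target in rows:
        group = assign_group(value, saved_groups)
        if target == 1:
            events[group] += 1
        else:
            non_events[group] += 1
    total_events = sum(events.values())
    total_non_events = sum(non_events.values())
    return {
        group: math.log((non_events[group] / total_non_events)
                        / (events[group] / total_events))
        for group in sorted(set(events) | set(non_events))
    }


# Made-up training rows; "LEASE" is a level the saved grouping never saw.
ROWS = [
    ("OWN", 0), ("OWN", 0), ("OWN", 0), ("OWN", 1),
    ("RENT", 0), ("RENT", 1), ("OTHER", 1),
    ("LEASE", 0), ("LEASE", 1),
    (None, 0), (None, 1),
]
WOE = weights_of_evidence(ROWS, SAVED_GROUPS)
```

A production implementation would also need a rule for groups that contain no events or no non-events, which this sketch deliberately leaves out.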

Scorecard: The Scorecard node for SAS Enterprise Miner 6.2 provides the following enhancements:

  • A new property called Freeze lets users protect scorecard points so that they will not be recalculated and overwritten if the Scorecard node is flagged to re-run. Settings for the new property permit users to use only newly calculated scorecard points, or to use frozen group definitions for variables that were defined in a previous Interactive Grouping node, or to use all frozen scorecard points from the original run.
  • A new property called Intercept Based Scorecard allows users to generate scorecards that contain points associated with the intercept term itself. This provides an easy way to adjust the entire scorecard with a single change to the intercept value. At the same time, intercept-based scorecards are also scaled in such a way that all non-intercept attributes are associated with positive scorecard points with a zero-basis value.
  • Variable labels are displayed in the generated scorecard instead of variable names.
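
The effect of an intercept-based scorecard can be sketched in Python as follows. The sketch assumes a common scaling convention (score = offset + factor * ln(odds), with a fixed number of points to double the odds) and a logistic model fitted on WOE values. The function names, parameters, and numbers are hypothetical and are not taken from the Scorecard node.

```python
import math


def scaling(points_to_double_odds, base_score, base_odds):
    """Return (factor, offset) so that score = offset + factor * ln(odds)."""
    factor = points_to_double_odds / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return factor, offset


def intercept_based_points(intercept, coefficients, woe_tables, factor, offset):
    """Build zero-based attribute points plus a separate intercept entry.

    coefficients: {variable: logistic regression coefficient on its WOE}
    woe_tables:   {variable: {attribute: WOE}}
    The model is assumed to predict the log-odds of the event, so the
    score (which rises with non-event odds) is offset - factor * logit.
    """
    intercept_points = offset - factor * intercept
    points = {}
    for variable, beta in coefficients.items():
        raw = {attr: -factor * beta * woe
               for attr, woe in woe_tables[variable].items()}
        lowest = min(raw.values())
        # Shift so the weakest attribute scores 0, and move the shift into
        # the intercept so that every total score is unchanged.
        points[variable] = {attr: value - lowest for attr, value in raw.items()}
        intercept_points += lowest
    return intercept_points, points


# Illustrative inputs only.
FACTOR, OFFSET = scaling(points_to_double_odds=20, base_score=600, base_odds=50)
INTERCEPT = -1.5
COEFFICIENTS = {"AGE": -0.8, "INCOME": -1.1}
WOE_TABLES = {
    "AGE": {"<30": -0.4, "30+": 0.3},
    "INCOME": {"low": -0.6, "high": 0.5},
}
INTERCEPT_POINTS, POINTS = intercept_based_points(
    INTERCEPT, COEFFICIENTS, WOE_TABLES, FACTOR, OFFSET)
```

Because each characteristic's shift is absorbed by the intercept entry, changing only the intercept points moves every applicant's total score by the same amount, which is the adjustment described in the list above.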

Enterprise Miner Node Changes

SAS Enterprise Miner 6.2 provides enhancements to the following nodes:

Input Data: The Input Data node for SAS Enterprise Miner 6.2 provides the following enhancements:

  • The Explore table for the Input Data node has been enhanced to take advantage of in-database processing when the source table is located on a database system. In-database processing creates the Explore table data sample on the database system side instead of using Enterprise Miner resources to create the sample.
  • Enterprise Miner 6.2 can now use the SAS option VALIDVARNAME=ANY to allow users to create Enterprise Miner Data Sources from data tables whose rows or columns use normally forbidden characters such as #, $, %, &, and so on.
  • A new Input Data node property called DropMapVariables lets users choose whether to drop remapped VALIDVARNAME=ANY variables from the exported score code.

Transform Variables: The Transform Variables node for SAS Enterprise Miner 6.2 provides the following enhancements:

  • The Transform node now supports LOG10 (base 10 logarithm) as a standard transformation for interval input and target variables.

Reporter: The Reporter node for SAS Enterprise Miner 6.2 provides the following enhancements:

  • Users can create summary-style reporting for generated models within the Reporter node. A new Summary property in the Reporter node enables or disables summary reporting. A new grouping of Summary Option properties in the property sheet enables users to select the components of the summary report. A modeling node must be present in the process flow diagram before the summary reporting feature is available.

Filter: The Filter node for SAS Enterprise Miner 6.2 provides the following enhancements:

  • The Filter node provides improved performance on large data sets.
  • A new property called Distribution Data Sets lets users control whether intermediate summary data sets for interval and class variables are created. The summary data sets are used to generate histograms and bar charts in the interval and class variable editors during interactive filtering.

Score: The Score node for SAS Enterprise Miner 6.2 provides the following enhancements:

  • A new Score node property called Graphical Reports lets users choose which graphical reports appear in the Score node Results.

SAS Enterprise Miner System Changes

SAS Enterprise Miner system changes include improvements made to the user interface and background processing algorithms, including Wizard templates, sampling strategies, metadata processing, and data visualization functions.

Data Source Wizard

You use the Data Source Wizard to configure data for use by SAS Enterprise Miner. When you use the Data Source Wizard to select data from a compatible Teradata database, the database performs all of the data access and summary functions and also creates an optional data sample.

When you select the Advanced Advisor option of the Data Source Wizard, the data summary that computes the data distribution metrics also sets data mining variable roles (such as ID, INPUT, and REJECTED) and data mining variable levels (such as INTERVAL, ORDINAL, and NOMINAL). When you select the Create A Sample option in the Data Source Wizard, Enterprise Miner creates a sample according to the proportion of the data or number of data rows that you specify. The SAS log provides notes to indicate when data manipulation, summary, and sampling work was performed by the database and not by SAS Enterprise Miner.
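
The two ways of sizing a sample, by proportion or by row count, can be sketched in Python as follows. This is only an illustration of the idea: it assumes simple random sampling without replacement, the function name, parameters, and seed are hypothetical, and it does not represent how Enterprise Miner or the database draws the sample.

```python
import random


def draw_sample(rows, proportion=None, n_rows=None, seed=12345):
    """Simple random sample without replacement, sized by share or by count."""
    if (proportion is None) == (n_rows is None):
        raise ValueError("specify exactly one of proportion or n_rows")
    if proportion is not None:
        if not 0 < proportion <= 1:
            raise ValueError("proportion must be in the interval (0, 1]")
        n_rows = round(len(rows) * proportion)
    # Never ask for more rows than the table holds.
    n_rows = min(n_rows, len(rows))
    # A fixed seed makes the sample reproducible between runs.
    return random.Random(seed).sample(rows, n_rows)


TABLE = list(range(100))                     # stand-in for 100 data rows
BY_SHARE = draw_sample(TABLE, proportion=0.1)
BY_COUNT = draw_sample(TABLE, n_rows=25)
```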

Explore Data Sample

When you use the SAS Library Explorer to select one of the Explore Data functions in SAS Enterprise Miner, the software creates a data sample. The sample is used to create interactive graphics such as histograms and scatter plots. When you select data in a compatible Teradata database, the sample function is performed by the database, which eliminates a large data transfer from the database to the SAS system.

Support for SAS Option VALIDVARNAME=ANY

Data tables that were created or saved using the SAS option VALIDVARNAME=ANY can now be processed by SAS Enterprise Miner. The SAS option VALIDVARNAME=ANY permits the use of normally forbidden characters (forbidden according to SAS V6 rules) used in naming data table columns. The Enterprise Miner Input Data node creates score code by renaming those columns. The metadata table in the Data Source Wizard and in the Input Data Source node will show the original variable names. However, subsequent nodes will show the remapped variable names in the metadata tables. If a variable has no assigned label, then a new label that corresponds to the original variable name is assigned and displayed in the metadata tables. If a label was already assigned, then that label is used.

The in-database sampling feature also supports the VALIDVARNAME=ANY option, so you can select column names that have special characters that are supported by both Teradata and SAS.
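
The renaming and labeling behavior described in this section can be pictured with the following Python sketch. The source does not specify the renaming rule that Enterprise Miner applies, so the rule used here (replace each character other than a letter, digit, or underscore with an underscore, limit names to 32 characters, and add a numeric suffix on collisions) is an assumption, as are the function name and the sample columns. The sketch also assigns the fallback label to every unlabeled column, not only to renamed ones.

```python
import re


def remap_names(columns):
    """Map arbitrary column names to conventional names and keep labels.

    columns: list of (name, label_or_None) pairs.
    Returns a list of (original_name, new_name, label) triples.
    """
    used = set()
    result = []
    for name, label in columns:
        candidate = re.sub(r"[^A-Za-z0-9_]", "_", name)
        if not re.match(r"[A-Za-z_]", candidate):
            candidate = "_" + candidate
        candidate = candidate[:32]
        # Resolve collisions such as "a#b" and "a$b" both becoming "a_b".
        new_name, suffix = candidate, 1
        while new_name.upper() in used:
            tail = f"_{suffix}"
            new_name = candidate[:32 - len(tail)] + tail
            suffix += 1
        used.add(new_name.upper())
        # An existing label is kept; otherwise the original name becomes the label.
        result.append((name, new_name, label if label else name))
    return result


REMAPPED = remap_names([
    ("Sales $", None),
    ("Sales #", "Units sold"),
    ("region", None),
])
```

Keeping the original name as the label is what lets later nodes display something recognizable even though they work with the remapped names.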

View Metadata

The Enterprise Miner View Metadata function enables you to explore the data mining content of Enterprise Miner objects that reside in the SAS Metadata Server. The View Metadata function includes the ability to explore both Enterprise Miner projects and Enterprise Miner models. You can use the View Metadata function to view, rename, and delete Enterprise Miner projects and models in the SAS Metadata Server. To start a Metadata Explorer session, from the Enterprise Miner main menu, select View, and then select Metadata.