What's New
SAS Enterprise Miner 5.2 significantly expands both the feature set and architecture of Enterprise Miner.
The feature set is expanded by the following:
The architecture is enhanced by use of the SAS Analytics Platform. Using this platform has the following benefits:
The following sections detail the changes and enhancements.
The following changes and enhancements improve the installation, administration, and configuration of SAS Enterprise Miner 5.2.
The SAS Analytics Platform provides a common client/server architecture and implementation for a family of products that includes SAS Enterprise Miner, SAS Forecast Studio, and SAS Inventory Studio. One instance of the Analytics Platform middle-tier server can serve all three applications, which makes it easier for SAS administrators to install and configure multiple SAS analytical applications.
In three-tier environments, administrators can monitor the status of the Analytics Platform server through both Web-based and client-based tools. Users can download and install the Enterprise Miner client directly from the Analytics Platform server. Administrators who configure multi-user environments must manually configure and start the Analytics Platform middle-tier services.
The SAS Analytics Platform cannot be licensed separately. The Analytics Platform installation is triggered by the installation of any of the existing SAS Analytics Platform products.
In previous releases, when you ran the application as a thin client, you could not cross a firewall to access the server. SAS Enterprise Miner 5.2 now supports thin-client access to the server across a firewall.
As a result of the integration with the SAS Analytics Platform, SAS Enterprise Miner uses a new logon screen that is shared with SAS Forecast Studio and SAS Inventory Studio. You are no longer prompted for the SAS Metadata Server authentication domain when logging on to a server. Support has also been added for anonymous logons.
Some of the functionality that was in the configuration wizard in SAS Enterprise Miner 5.1 has been moved to a SAS Management Console plug-in. The SAS Management Console plug-in gives administrators better control over Enterprise Miner configurations. The following information is now stored in the SAS Management Console plug-in:
Grid processing is available for enterprises that need to perform computing over multiple logical or physical systems. The execution of the process flow diagram in SAS Enterprise Miner is sent to a load balancing manager that distributes the jobs to a grid of systems. This is expected to benefit users who run multiple large process flow diagrams, or users who manage a large multi-user environment.
SAS Enterprise Miner 5.2 contains the following new nodes that add functionality to the environment.
A Decisions node has been added to the SAS Enterprise Miner 5.2 tools menu. The Decisions node belongs to the Assess category of the SAS SEMMA (Sample, Explore, Modify, Model, Assess) data mining process. You use the Decisions node to define or modify target profiles for a target that produces optimal decisions. The decisions are made using a user-specified decision matrix and the output from a subsequent modeling procedure.
The Decisions node contains a tabbed Decision Processing window that you use to configure your decision matrix. The Decision Processing window shares the same layout as the Data Source Wizard Decision Configuration window.
Decision matrices are no longer automatically built when you open the Decision Editor. Click the new Build button to create the decision matrix for the desired target.
After a decision matrix has been created, you can refresh the default decision by clicking the Refresh button. You might want to refresh the data when the target levels in the underlying data have changed or when the target order has changed.
See Layout of a Target Profile in the Help for the Enterprise Miner Target Profiler for more information about configuring your decision matrix.
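As an illustration of how a decision matrix can yield an optimal decision, the following Python sketch picks the decision with the highest expected profit. It is not Enterprise Miner score code: the function name, the profit values, and the assumption that the model supplies posterior probabilities for each target level are all hypothetical.

```python
def best_decision(posteriors, decision_matrix):
    """Return the (decision, expected_profit) pair with the highest expected profit.

    posteriors: dict mapping target level -> posterior probability (assumed model output)
    decision_matrix: dict mapping decision -> dict of target level -> profit
    """
    expected = {
        decision: sum(posteriors[level] * profit for level, profit in row.items())
        for decision, row in decision_matrix.items()
    }
    choice = max(expected, key=expected.get)
    return choice, expected[choice]


# Hypothetical profit matrix for a binary target (values are illustrative only).
matrix = {
    "solicit": {"respond": 10.0, "no_response": -1.0},
    "ignore": {"respond": 0.0, "no_response": 0.0},
}

decision, profit = best_decision({"respond": 0.2, "no_response": 0.8}, matrix)
print(decision, round(profit, 2))
```

With these illustrative numbers, a case is worth soliciting only when the expected gain from a response outweighs the expected cost of no response.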
A Replacement node has been added to the SAS Enterprise Miner 5.2 tools menu. The Replacement node belongs to the Modify category of the SAS SEMMA (Sample, Explore, Modify, Model, Assess) data mining process. You use the SAS Enterprise Miner 5.2 Replacement node to generate score code to process unknown levels when scoring and also to interactively specify replacement values for class levels. The Replacement node must follow a node that exports a data set, such as a Data Source, Sample, or Data Partition node.
You can use the Replacement Editor to specify the replacement values. The Replacement Editor lists the class levels for the input and target variables in the training data set.
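The general idea of replacing class levels and handling unknown levels at scoring time can be sketched in Python as follows. This is a simplified illustration, not the score code that the Replacement node generates; the function name, the `_UNKNOWN_` placeholder, and the sample levels are assumptions made for the example.

```python
def replace_level(value, replacements, known_levels, unknown_value="_UNKNOWN_"):
    """Apply a user-specified replacement, or flag a level not seen in training.

    replacements: dict mapping an original class level -> replacement level
    known_levels: set of class levels observed in the training data
    unknown_value: hypothetical placeholder assigned to unseen levels
    """
    if value in replacements:
        return replacements[value]
    if value not in known_levels:
        return unknown_value
    return value


# Levels seen in training, and one interactive replacement (C is merged into B).
known = {"A", "B", "C"}
replacements = {"C": "B"}

# "Z" never appeared in the training data, so it is treated as unknown.
scored = [replace_level(v, replacements, known) for v in ["A", "C", "Z"]]
print(scored)
```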
A SOM/Kohonen node has been added to the SAS Enterprise Miner 5.2 tools menu. The SOM/Kohonen node belongs to the Explore category of the SAS SEMMA (Sample, Explore, Modify, Model, Assess) data mining process. You use the SOM/Kohonen node to perform unsupervised learning by using Kohonen vector quantization (VQ), Kohonen self-organizing maps (SOMs), or batch SOMs with Nadaraya-Watson or local-linear smoothing. Kohonen VQ is a clustering method, whereas SOMs are primarily dimension-reduction methods.
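To make the clustering behavior of Kohonen VQ concrete, here is a minimal Python sketch of an online update rule: each observation moves its nearest seed (the winner) a fraction of the way toward itself. The learning-rate schedule, function names, and data are assumptions for illustration and do not reflect the node's actual implementation. A SOM would additionally update the winner's neighbors on the map grid, which this sketch omits.

```python
def nearest_seed(point, seeds):
    """Index of the seed closest to the point (squared Euclidean distance)."""
    return min(
        range(len(seeds)),
        key=lambda k: sum((p - s) ** 2 for p, s in zip(point, seeds[k])),
    )


def kohonen_vq(points, seeds, learning_rate=0.5, epochs=10):
    """Online vector quantization: move the winning seed toward each point."""
    seeds = [list(s) for s in seeds]
    for epoch in range(epochs):
        # Assumed schedule: the learning rate shrinks as training progresses.
        rate = learning_rate / (epoch + 1)
        for point in points:
            winner = nearest_seed(point, seeds)
            seeds[winner] = [s + rate * (p - s) for p, s in zip(point, seeds[winner])]
    return seeds


# Two well-separated groups of points and one initial seed near each group.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
trained = kohonen_vq(points, seeds=[(0, 0), (10, 10)])

# Cluster membership is the index of the nearest trained seed.
print([nearest_seed(p, trained) for p in points])
```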
The following nodes contain specific and significant enhancements in functionality.
Tree growth iteration plots have been added to the Decision Tree node results. A reference line in the plot indicates which subtree was selected.
Another enhancement to the Decision Tree output graphics is that Tree node plots now display information for all target levels.
Using the graph properties for the tree diagram, you can toggle between the detailed text (which is shown) and the summarized text.
The upper limit that existed for the Split Size and Node Sample properties has been removed.
The Filter node is enhanced to support interactive user selection of values to filter for both continuous and categorical variables.
In the Filter Node General Properties sheet, you can now do the following:
The code for renaming and replacing variables has been revised to accommodate processes such as the one shown here. In this case, both segment variables that were created by the two different Cluster nodes will be used as inputs to the model nodes, and the creator fields of the prediction variables will contain the value REG2 instead of TREE3.
Changes have been made to the predictive model comparison code to improve efficiency and consistency between reports. The Model Comparison node features a new Bin-based Kolmogorov-Smirnov statistic in the results tables.
The score code now creates the bin variable with a prefix of b_.
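One plausible way to compute a Kolmogorov-Smirnov statistic from binned scores is sketched below in Python. The exact binning and formula that the Model Comparison node uses are not described here, so the equal-width bins, the function name, and the data are assumptions: the sketch takes the largest gap between the cumulative event and non-event distributions across bins.

```python
def binned_ks(scores, targets, n_bins=10):
    """Bin-based KS: maximum gap between cumulative event and non-event shares.

    scores: predicted event probabilities, assumed to lie in [0, 1]
    targets: 1 for an event, 0 for a non-event
    """
    events = [0] * n_bins
    nonevents = [0] * n_bins
    for score, target in zip(scores, targets):
        # Assumed equal-width bins; a score of exactly 1.0 goes in the last bin.
        b = min(int(score * n_bins), n_bins - 1)
        if target == 1:
            events[b] += 1
        else:
            nonevents[b] += 1

    total_events, total_nonevents = sum(events), sum(nonevents)
    if total_events == 0 or total_nonevents == 0:
        raise ValueError("need at least one event and one non-event")

    ks = 0.0
    cum_events = cum_nonevents = 0
    for b in range(n_bins):
        cum_events += events[b]
        cum_nonevents += nonevents[b]
        gap = abs(cum_events / total_events - cum_nonevents / total_nonevents)
        ks = max(ks, gap)
    return ks


# A model that separates the classes perfectly, and one that does not separate them.
separating = binned_ks([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
uninformative = binned_ks([0.15, 0.15, 0.85, 0.85], [0, 1, 0, 1])
print(separating, uninformative)
```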
A baseline has been added to the Receiver Operating Characteristic (ROC) chart that is produced for binary targets.
The Path node has significant changes that reflect new capabilities in PROC PATH. Several new properties identify pattern-matching data sets. The scalability is improved. The Path node scoring has been greatly accelerated. New plots have been introduced to the Path node results, including the following view of rules sequences.
In the Path Node General Properties sheet, the new Training Mode property enables you to choose between Train and Score modes when the Path Analysis node runs. When the mode is set to Train, the node discovers a set of rules. When the mode is set to Score, a previously found set of rules is applied to the transaction data set. You must specify a scoring rules data set when the Training Mode property is set to Score. The default setting for the Training Mode property is Train.
The new Funnel Counts plot shows the drop-off in the number of visitors along a particular path of interest. It can be useful to see how visitor attrition occurs along a path, indicating points of interest such as where the biggest drop-off points are.
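The counts behind a funnel plot of this kind can be illustrated with a short Python sketch. It counts how many visitor sessions reach each successive step of a path of interest, in order. The page names, session data, and function name are hypothetical, and the sketch is not derived from PROC PATH.

```python
def funnel_counts(sessions, path):
    """Count the sessions that reach each step of the path, in order.

    sessions: list of page sequences, one per visitor
    path: ordered list of pages that defines the path of interest
    """
    counts = [0] * len(path)
    for pages in sessions:
        step = 0
        for page in pages:
            # Advance only when the visitor hits the next expected step.
            if step < len(path) and page == path[step]:
                counts[step] += 1
                step += 1
    return counts


# Hypothetical clickstream sessions.
sessions = [
    ["home", "search", "cart", "checkout"],
    ["home", "search", "cart"],
    ["home", "search"],
    ["home", "search", "help"],
    ["home", "about"],
    ["home"],
]
path = ["home", "search", "cart", "checkout"]

counts = funnel_counts(sessions, path)
# Attrition between consecutive steps of the path.
drop_offs = [a - b for a, b in zip(counts, counts[1:])]
print(counts, drop_offs)
```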
Major improvements have been made to the graphics output of the Principal Components node.
Of special interest is the new Principal Components Matrix plot. In the following display, the plots are color coded by target event. Two clusters are clearly seen in the PC1 and PC2 dimensions. The initial matrix contains five rows and five columns. You can determine the number of subplots that appear.
Using the Interactive Principal Components Selection window, you can now select how many principal components should be retained for analysis.
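A common way to decide how many principal components to retain is a cumulative-variance cutoff, sketched below in Python. Whether the Interactive Principal Components Selection window applies this particular criterion is an assumption; the function name, eigenvalues, and cutoff values are illustrative only.

```python
def components_to_retain(eigenvalues, cutoff=0.9):
    """Smallest number of components whose eigenvalues reach the cutoff share.

    eigenvalues: variance explained by each principal component
    cutoff: assumed target for the cumulative proportion of variance
    """
    total = sum(eigenvalues)
    cumulative = 0.0
    for count, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value
        if cumulative / total >= cutoff:
            return count
    return len(eigenvalues)


# Hypothetical eigenvalues for five principal components.
eigenvalues = [2.5, 1.5, 0.6, 0.3, 0.1]
print(components_to_retain(eigenvalues, cutoff=0.9))
```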
The following changes have been made to the plots that are generated when you specify a model selection method:
The default for the Min Resource Use property is No when modeling a nominal or ordinal target and when specifying a model selection method. To improve performance but disable the model selection method, change this value to Yes.
The SAS Code node features new training and scoring code editors.
The SAS Code editor window has Macro Variables and Macros tabs that contain lists of macro variables and macros that are used in SAS training code. You can use your mouse to drag items from the Macro Variables and Macros lists and drop them in the SAS Code editor to enhance and simplify SAS code creation.
The Score Code editor window has a Variables tab above the score code editor. The Variables tab enables you to view data set variables at the same time as the score code under composition, which makes creating new scoring code easier.
The SAS Code node is also able to use new macros:
You now have better control of the sample makeup. The Level Based value has been added as an option to the Criterion property in the Sample Node Stratified Properties. If Level Based is selected, then the sample is based on the proportion captured and sample proportion of a specific level.
This criterion is applicable only when there is one stratification variable. Use the Level Based Options properties to specify parameters for the proportion captured, sample proportion, and level of interest.
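The arithmetic behind a level-based sample can be sketched in Python under one reading of the two parameters: the proportion captured is the share of the level-of-interest rows that are drawn, and the sample proportion is the share of the final sample that those rows make up. That reading, the function names, and the data are assumptions, not the node's documented algorithm.

```python
import random


def level_based_counts(level_count, other_count, proportion_captured, sample_proportion):
    """Rows to draw from the level of interest and from all other levels."""
    if not 0 < sample_proportion <= 1:
        raise ValueError("sample_proportion must be in (0, 1]")
    n_level = round(level_count * proportion_captured)
    # Enough other rows so that the level of interest is sample_proportion of the total.
    n_other = round(n_level * (1 - sample_proportion) / sample_proportion)
    return n_level, min(n_other, other_count)


def level_based_sample(rows, key, level, proportion_captured, sample_proportion, seed=0):
    """Draw a random sample that follows the counts computed above."""
    level_rows = [r for r in rows if r[key] == level]
    other_rows = [r for r in rows if r[key] != level]
    n_level, n_other = level_based_counts(
        len(level_rows), len(other_rows), proportion_captured, sample_proportion
    )
    rng = random.Random(seed)
    return rng.sample(level_rows, n_level) + rng.sample(other_rows, n_other)


# Hypothetical data: 20 "bad" rows among 200, stratified on the "target" variable.
rows = [{"id": i, "target": "bad" if i < 20 else "good"} for i in range(200)]
sample = level_based_sample(rows, "target", "bad", proportion_captured=0.5, sample_proportion=0.25)
print(len(sample), sum(r["target"] == "bad" for r in sample))
```

Under these assumptions, capturing half of the 20 rare rows and asking for them to form a quarter of the sample produces a 40-row sample that oversamples the rare level relative to its 10 percent share of the data.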
You can now create data transformations by using the push button interfaces that are found in the Formula Builder and Expression Builder windows.
To open the Formula Builder window, select the Transform node in your process flow diagram, and then in the Transform Variables node properties panel, select the ellipsis button to the right of the Formula Builder entry.
The Formula Builder window is designed to respond to both keyboard and point-and-click mouse entries when you are building expressions. You can view distributions of both input variables and created (transformed) variables. The Formula Builder window enables you to see which variables need to be transformed as well as to see the distributions of the transformed variables that were created. A tab is available to examine and manipulate the Sample Properties.
New formulas are tested by executing the code against the preview sample and testing for errors. The preceding display shows a view of two newly created variables, including a plot of the second variable, TRANS_1.
The Expression Builder window contains a grouped list of functions, a list of variables, and an Expression Text box that has Boolean operator shortcut keys.
The assessment plots that are displayed in modeling nodes and the Model Comparison node have been enhanced to improve usability. The Score Rankings and Score Distribution plots now provide a selection control for conveniently changing the displayed statistics.
The graphics libraries in the SAS Enterprise Miner client have been significantly enhanced with improved performance and many new plot types including two-dimensional and three-dimensional graphics. The following plot shows the new parallel axis and density plots.
The Enterprise Miner wizard for adding data sources has been revised. The Data Source Wizard Decision Editor has been changed to support multiple decisions for categorical targets. The wizard leads you more clearly through the process of creating decisions and weights.
Changes to the underlying data (such as adding variables to the data source or removing variables from the data source) can affect the metadata of a data source that is defined in your project. SAS Enterprise Miner 5.2 now enables you to refresh the metadata for a defined data source rather than create a new data source. Right-click the data source in your project and select Refresh Metadata.
All SAS Enterprise Miner 5.2 nodes now have an Exported Data property. The Exported Data property is located near the top of each node's properties list, which is displayed in the SAS Enterprise Miner 5.2 properties panel. The Exported Data property makes it easier for you to find SAS data sets that were created by a node. Simply select a node in your process flow diagram, then click the ellipsis button to the right of the Exported Data property to open the Exported Data window for the selected node.
Changes that you make to the size of the Variables Editor, Data Explorer, Table Browser, Results Viewer, and Help window or to the layout of the main window are now persisted between sessions. This enables you to customize the layout of the main window to meet your needs.
The View menu in the Results Viewer has been reorganized for usability.
You can write your own XML files to create additional nodes to use in SAS Enterprise Miner. Starting in SAS Enterprise Miner 5.2, the deployment of custom nodes to clients has been simplified, so that all extension files (XML files and their icons) are managed from a central location.
The Getting Started with Enterprise Miner 5.2 book describes the core functionality of SAS Enterprise Miner 5.2 and how to perform basic data mining tasks. Using this book, you will learn the following:
This book is available from the SAS 9.1.3 OnlineDoc on the Web and is available for purchase from the SAS Publications Catalog.
SAS Credit Scoring is a new solution in SAS Enterprise Miner 5.2. This solution offers the ability to rapidly generate automated credit scoring models that rely on statistical models.
If your site has licensed SAS Credit Scoring, the credit scoring nodes will appear on the Credit Scoring tab in your Enterprise Miner session. SAS Enterprise Miner includes the following credit scoring nodes:
You can use these nodes to perform the following tasks:
The business value of statistical models can be assessed by using strategy curves, profit charts, and a reject inference process in order to produce models for scoring through-the-door populations. A credit exchange tool provides additional reporting of the credit scoring results and can exchange information with the SAS Credit Risk solution.
The Interactive Grouping and Scorecard nodes have been enhanced for this release.