This paper describes various feature extraction methods for time series data that are implemented in SAS Enterprise Miner.
This paper first explains the concepts of association discovery, sequence discovery, several centrality measures, the clustering coefficient, and item clusters. Then it shows how the Link Analysis node incorporates these concepts in analyzing transactional data. The paper also shows how you can adapt non-transactional data to the link analysis framework. Finally, examples illustrate how to use the Link Analysis node to analyze Netflix data and Fisher’s Iris data.
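As a rough illustration of two of the measures named above, the following Python sketch computes degree centrality and the local clustering coefficient on a small undirected graph. The graph, the function names, and the choice of degree centrality are assumptions made for illustration; the summary does not say which definitions the Link Analysis node uses.

```python
from itertools import combinations

def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def clustering_coefficient(adj, v):
    """Share of pairs of v's neighbors that are themselves linked."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to check
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))

# Hypothetical item graph: an edge means two items co-occur in a transaction.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

centrality = degree_centrality(adj)
coeff_a = clustering_coefficient(adj, "A")
coeff_b = clustering_coefficient(adj, "B")
```

Here item A is linked to every other item, so its degree centrality is 1, but only one of its three neighbor pairs is linked, so its clustering coefficient is 1/3.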
This paper describes three types of ensemble models: boosting, bagging, and model averaging. It discusses go-to methods, such as gradient boosting and random forest, and newer methods, such as rotational forest and fuzzy clustering. The examples section presents a quick setup that enables you to take fullest advantage of the ensemble capabilities of SAS Enterprise Miner by using existing nodes, Start Groups and End Groups nodes, and custom coding.
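Of the three ensemble types, model averaging is the simplest to sketch. The Python example below averages the predicted probabilities of several models, observation by observation. The function name and the probability values are hypothetical and are not taken from the paper or from SAS Enterprise Miner.

```python
def average_models(prob_lists):
    """Average predicted probabilities across models, one value per observation."""
    n_models = len(prob_lists)
    return [sum(ps) / n_models for ps in zip(*prob_lists)]

# Hypothetical predicted probabilities from three models for two observations.
model_a = [0.2, 0.8]
model_b = [0.4, 0.6]
model_c = [0.6, 0.7]

ensemble = average_models([model_a, model_b, model_c])
```

Boosting and bagging differ from this in how the component models are trained, not only in how their outputs are combined.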
This paper provides an overview of machine learning and presents several supervised and unsupervised machine learning examples that use SAS Enterprise Miner.
This paper reviews SAS Enterprise Miner 13.1. The release focuses on three themes and provides 10 new nodes, three new procedures, and algorithmic and technological enhancements.
This paper discusses several new methods available in Credit Scoring for SAS Enterprise Miner that help build scorecards that are based on interval targets.
The Credit Scoring add-on in SAS Enterprise Miner is widely used to build binary target (good, bad) scorecards for probability of default. The process involves grouping variables using weight of evidence, and then performing logistic regression to produce predicted probabilities. This paper demonstrates how to use the same tools to build binned variable scorecards for Loss Given Default, explains the theoretical principles behind the method, and uses actual data to demonstrate how it is done.
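The weight of evidence grouping step can be sketched in a few lines of Python. This is a minimal sketch that assumes the common definition, the natural log of a bin's share of goods divided by its share of bads. Sign conventions vary between tools, and the bin labels, data, and function name here are hypothetical.

```python
import math
from collections import Counter

def weight_of_evidence(bins, targets):
    """Return {bin: WOE}, with WOE = ln(share of goods in bin / share of bads in bin).

    targets use 0 for good and 1 for bad. Assumes every bin contains at
    least one good and one bad; otherwise the ratio is undefined.
    """
    goods, bads = Counter(), Counter()
    for b, t in zip(bins, targets):
        if t == 0:
            goods[b] += 1
        else:
            bads[b] += 1
    total_good = sum(goods.values())
    total_bad = sum(bads.values())
    return {
        b: math.log((goods[b] / total_good) / (bads[b] / total_bad))
        for b in set(bins)
    }

# Hypothetical binned variable with a binary good/bad target.
bins = ["low", "low", "low", "low", "high", "high", "high", "high"]
targets = [0, 0, 0, 1, 0, 1, 1, 1]

woe = weight_of_evidence(bins, targets)
```

In a scorecard workflow, each observation's bin would then be replaced by its WOE value before the logistic regression is fitted.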
This paper benchmarks SAS and open-source products to analyze big data by modeling four classification problems from real customers. The products that were benchmarked are SAS Rapid Predictive Modeler (a component of SAS Enterprise Miner), SAS High-Performance Analytics Server (using Hadoop), R and Apache Mahout. Results were compared in terms of model quality, modeler effort, scalability and completeness.
This paper compares the performance of the HPGENSELECT procedure with results cited for the RevoScaleR package by using data that are similar to the insurer's data. The paper also demonstrates the scalability of the HPGENSELECT procedure by using two sizes of data sets and three different computing environments.
This paper discusses the options and methods available for use in High-Performance Data Mining and uses real data for performance benchmarks.
This paper takes a quick look at how to organize and analyze textual data for extracting insightful customer intelligence from a large collection of documents and for using such information to improve business operations and performance.
SAS Text Miner 12.1 and SAS Content Categorization Studio 12.1 are used to develop a rule-based categorization model. This model is then used to automatically score a paper abstract and identify the most relevant and appropriate conference sections to submit it to, for a better chance of acceptance.
This paper demonstrates a new and powerful feature in SAS Text Miner 12.1 that helps explain the SVDs or the text cluster components. It also discusses two important methods for interpreting them.
This paper demonstrates how to use SAS Text Miner procedures to process sparse data sets and generate output data sets that are easy to store and can be readily processed by traditional SAS modeling procedures.