This paper compares the performance of Bayesian network classifiers to other popular classification methods such as classification trees, neural networks, logistic regression, and support vector machines. It also shows some real-world applications of the implemented Bayesian network classifiers and a useful visualization of the results.
This paper shows how you can use the HPSVM procedure from SAS Enterprise Miner to implement both training and scoring of these multinomial classification extensions to the traditional SVM algorithm. It also demonstrates these implementations on several data sets to illustrate the benefits of these methods.
This paper uses text mining and time series analysis techniques to explore Don Quixote de la Mancha, a two-volume master work of Western literature. The temporal text mining methods demonstrated in this paper lend themselves to business applications such as monitoring changes in customer sentiment and summarizing research and legislative trends.
This paper discusses many of the most common issues faced by machine learning practitioners and provides guidance for using these powerful algorithms to build effective models.
This paper presents supervised and unsupervised pattern recognition techniques that use Base SAS and SAS Enterprise Miner software.
This paper summarizes the theoretical background of recent ensemble techniques and presents examples of real-world applications.
This paper demonstrates, using sales information and SAS Enterprise Miner, how to uncover relative price bands where price can be increased without losing market share or decreased slightly to gain market share.
This paper shows you how to leverage SAS Asset Performance Analytics and SAS Enterprise Miner to build a model for drilling and well control anomalies, to fingerprint key well control measures of the transient fluid properties, and to operationalize these analytics on the drilling assets with SAS Event Stream Processing.
This paper shows how to implement data preparation through SAS Enterprise Miner, using different approaches.
This paper first summarizes the problems that were specified and data that were supplied by the Challenge sponsors at Cloudera. Then it outlines the techniques and technologies used to complete the Challenge, followed by sections that describe in greater detail the approaches used for data preprocessing and for completing the Challenge deliverables.
This paper describes various feature extraction methods for time series data that are implemented in SAS Enterprise Miner.
This paper first explains the concepts of association discovery, sequence discovery, multiple centrality measures and clustering coefficient measure, and item clusters. Then it shows how the Link Analysis node incorporates these concepts in analyzing transactional data. The paper also shows how you can adapt non-transactional data to the link analysis framework. Finally, examples illustrate how to use the Link Analysis node to analyze Netflix data and Fisher’s Iris data.
This paper describes three types of ensemble models: boosting, bagging, and model averaging. It discusses go-to methods, such as gradient boosting and random forest, and newer methods, such as rotational forest and fuzzy clustering. The examples section presents a quick setup that enables you to take fullest advantage of the ensemble capabilities of SAS Enterprise Miner by using existing nodes, Start Groups and End Groups nodes, and custom coding.
This paper provides an overview of machine learning and presents several supervised and unsupervised machine learning examples that use SAS Enterprise Miner.
This paper reviews SAS Enterprise Miner 13.1, which focuses on these three themes; it provides 10 new nodes, three new procedures, and algorithmic and technological enhancements.
This paper discusses several new methods available in Credit Scoring for SAS Enterprise Miner that help build scorecards that are based on interval targets.
The Credit Scoring add-on in SAS Enterprise Miner is widely used to build binary target (good, bad) scorecards for probability of default. The process involves grouping variables using weight of evidence, and then performing logistic regression to produce predicted probabilities. This paper demonstrates how to use the same tools to build binned variable scorecards for Loss Given Default, explaining the theoretical principles behind the method and using actual data to show how it is done.
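The weight-of-evidence grouping step mentioned in the abstract above can be illustrated with a minimal Python sketch for the binary (good, bad) case. This is not the paper's code and not the SAS Enterprise Miner implementation: the function name `weight_of_evidence`, the `smoothing` parameter, and the sign convention (log of good share over bad share) are assumptions made for illustration, and sign conventions vary between tools.

```python
import math
from collections import Counter


def weight_of_evidence(bins, targets, smoothing=0.5):
    """Return {bin_label: WOE} for a binary target (1 = bad, 0 = good).

    WOE(bin) = ln(share of all goods in bin / share of all bads in bin).
    `smoothing` is added to every count so that an empty cell does not
    produce log(0); it is an illustrative choice, not a SAS default.
    """
    goods, bads = Counter(), Counter()
    for label, target in zip(bins, targets):
        if target == 1:
            bads[label] += 1
        else:
            goods[label] += 1

    labels = sorted(set(goods) | set(bads))
    # Totals include the smoothing added to each bin so shares sum to 1.
    total_good = sum(goods.values()) + smoothing * len(labels)
    total_bad = sum(bads.values()) + smoothing * len(labels)

    woe = {}
    for label in labels:
        good_share = (goods[label] + smoothing) / total_good
        bad_share = (bads[label] + smoothing) / total_bad
        woe[label] = math.log(good_share / bad_share)
    return woe


# Hypothetical toy data: one binned variable and a good/bad flag.
example_bins = ["low", "low", "low", "high", "high", "high"]
example_targets = [0, 0, 1, 1, 1, 0]
example_woe = weight_of_evidence(example_bins, example_targets, smoothing=0.0)
```

In a scorecard workflow of the kind the abstract describes, each raw value would then be replaced by the WOE of its bin before the logistic regression is fit. With `smoothing=0.0`, a bin that contains no goods or no bads raises an error, which is why a small positive value is the default here.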
This paper benchmarks SAS and open-source products to analyze big data by modeling four classification problems from real customers. The products that were benchmarked are SAS Rapid Predictive Modeler (a component of SAS Enterprise Miner), SAS High-Performance Analytics Server (using Hadoop), R and Apache Mahout. Results were compared in terms of model quality, modeler effort, scalability and completeness.
This paper compares the performance of the HPGENSELECT procedure with results cited for the RevoScaleR package by using data that are similar to the insurer's data. The paper also demonstrates the scalability of the HPGENSELECT procedure by using two sizes of data sets and three different computing environments.
This paper discusses the options and methods available for use in High-Performance Data Mining and uses real data for performance benchmarks.
This paper shows you how to identify hundreds of champion models using SAS Factory Miner, while generating scoring web services using SAS Decision Manager.
This paper showcases a repeatable combination of exploratory and classification-based text analytics provided by SAS Contextual Analysis, applied to the publicly available ACLED for African states.
In this presentation, using SAS code and SAS Text Miner, we compare supervised and unsupervised models with those that are based on SVD representations of subcomponents of documents.
This paper demonstrates how to use SAS Text Miner macros and procedures to obtain effective predictive models at all hierarchy levels in a taxonomy.
This paper takes a quick look at how to organize and analyze textual data for extracting insightful customer intelligence from a large collection of documents and for using such information to improve business operations and performance.
SAS Text Miner 12.1 and SAS Content Categorization Studio 12.1 are used to develop a rule-based categorization model. This model is then used to automatically score a paper abstract to identify the most relevant and appropriate conference sections to submit to for a better chance of acceptance.
This paper demonstrates a new and powerful feature in SAS Text Miner 12.1 that helps explain the SVDs or the text cluster components. It also discusses two important methods that are useful for interpreting them.
This paper demonstrates how to use SAS Text Miner procedures to process sparse data sets and generate output data sets that are easy to store and can be readily processed by traditional SAS modeling procedures.