SAS Technical Papers » Data Mining and Text Mining

SAS Enterprise Miner

2017 Papers

Building Bayesian Network Classifiers Using the HPBNET Procedure
Liu, Ye; Shi, Weihua; Czika, Wendy; SAS Institute, Inc.; 2017
This paper compares the performance of Bayesian network classifiers to other popular classification methods such as classification tree, neural network, logistic regression, and support vector machines. It also shows some real-world applications of the implemented Bayesian network classifiers and a useful visualization of the results.
Methods of Multinomial Classification Using Support Vector Machines
Abbey, Ralph; He, Taiping; Wang, Tao; SAS Institute, Inc.; 2017
This paper shows how you can use the HPSVM procedure from SAS Enterprise Miner to implement both training and scoring of these multinomial classification extensions to the traditional SVM algorithm. It also demonstrates these implementations on several data sets to illustrate the benefits of these methods.
Temporal Text Mining: A Thematic Exploration of Don Quixote
Wright, Ray; SAS Institute, Inc.; 2017
This paper uses text mining and time series analysis techniques to explore Don Quixote de la Mancha, a two-volume master work of Western literature. The temporal text mining methods demonstrated in this paper lend themselves to business applications such as monitoring changes in customer sentiment and summarizing research and legislative trends.

2016 Papers

Best Practices for Machine Learning Applications
Wujek, Brett; Gunes, Funda; Hall, Patrick; SAS Institute, Inc.; 2016
This paper discusses many of the most common issues faced by machine learning practitioners and provides guidance for using these powerful algorithms to build effective models.
An Efficient Pattern Recognition Approach with Applications
Hall, Patrick; Chien, Alex; Kabul, Ilknur; Silva, Jorge; SAS Institute, Inc.; 2016
This paper presents supervised and unsupervised pattern recognition techniques that use Base SAS and SAS Enterprise Miner software.
Ensemble Modeling: Recent Advances and Applications
Czika, Wendy; Liu, Ye; SAS Institute, Inc.; 2016
This paper summarizes the theoretical background of recent ensemble techniques and presents examples of real-world applications.

2015 Papers

Clustering Techniques to Uncover Relative Pricing Opportunities: Relative Pricing Corridors Using SAS Enterprise Miner and SAS Visual Analytics
Carr, Ryan; SAS Institute, Inc.; Park, Charles; Lenovo; 2015
This paper demonstrates, using sales information and SAS Enterprise Miner, how to uncover relative price bands where price can be increased without losing market share or decreased slightly to gain market share.
Drilling for Deepwater Data: A Forensic Analysis of the Gulf of Mexico Deepwater Horizon Disaster
Walker, Steve; Duarte, Jim; SAS Institute, Inc.; 2015
This paper shows you how to leverage SAS Asset Performance Analytics and SAS Enterprise Miner to build a model for drilling and well control anomalies, to fingerprint key well control measures of the transient fluid properties, and to operationalize these analytics on the drilling assets with SAS Event Stream Processing.
Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner
Galante, Ricardo; SAS Institute, Inc.; 2015
This paper shows how to implement data preparation through SAS Enterprise Miner, using different approaches.
SAS Does Data Science: How to Succeed in a Data Science Competition
Hall, Patrick; SAS Institute, Inc.; 2015
This paper first summarizes the problems that were specified and data that were supplied by the Challenge sponsors at Cloudera. Then it outlines the techniques and technologies used to complete the Challenge, followed by sections that describe in greater detail the approaches used for data preprocessing and for completing the Challenge deliverables.

2014 Papers

"Extra" SAS Global Forum Papers

Feature Extraction Methods for Time Series Data in SAS Enterprise Miner
Lee, Taiyeong; Zhang, Ruiwen; Xiao, Yongqiao; Dean, Jared; SAS Institute, Inc. 2014
This paper describes various feature extraction methods for time series data that are implemented in SAS Enterprise Miner.
Link Analysis Using SAS Enterprise Miner
Liu, Ye; Zhang, Ruiwen; Dean, Jared; SAS Institute, Inc. 2014
This paper first explains the concepts of association discovery, sequence discovery, multiple centrality measures and clustering coefficient measure, and item clusters. Then it shows how the Link Analysis node incorporates these concepts in analyzing transactional data. The paper also shows how you can adapt non-transactional data to the link analysis framework. Finally, examples illustrate how to use the Link Analysis node to analyze Netflix data and Fisher’s Iris data.

SAS Global Forum Papers

Leveraging Ensemble Models in SAS Enterprise Miner
Maldonado, Miguel; Dean, Jared; Czika, Wendy; Haller, Susan; SAS Institute, Inc. 2014
This paper describes three types of ensemble models: boosting, bagging, and model averaging. It discusses go-to methods, such as gradient boosting and random forest, and newer methods, such as rotational forest and fuzzy clustering. The examples section presents a quick setup that enables you to take fullest advantage of the ensemble capabilities of SAS Enterprise Miner by using existing nodes, Start Groups and End Groups nodes, and custom coding.
An Overview of Machine Learning with SAS Enterprise Miner
Hall, Patrick; Dean, Jared; Kabul, Ilknur Kaynar; Silva, Jorge; SAS Institute, Inc. 2014
This paper provides an overview of machine learning and presents several supervised and unsupervised machine learning examples that use SAS Enterprise Miner.
What’s New in SAS Enterprise Miner 13.1
Dean, Jared; Wexler, Jonathan; SAS Institute, Inc. 2014
This paper reviews SAS Enterprise Miner 13.1, which focuses on three themes and provides 10 new nodes, three new procedures, and algorithmic and technological enhancements.

See other SAS Enterprise Miner technical papers.

SAS Credit Scoring

2013 Papers

Creating Interval Target Scorecards with Credit Scoring for SAS Enterprise Miner
Maldonado, Miguel; Haller, Susan; Czika, Wendy; Siddiqi, Naeem; SAS Institute, Inc. 2013
This paper discusses several new methods available in Credit Scoring for SAS Enterprise Miner that help build scorecards that are based on interval targets.

2012 Papers

Building Loss Given Default Scorecard Using Weight of Evidence Bins in SAS Enterprise Miner
Van Berkel, Anthony, Bank of Montreal; Siddiqi, Naeem, SAS Institute, Inc. 2012
The Credit Scoring add-on in SAS Enterprise Miner is widely used to build binary target (good, bad) scorecards for probability of default. The process involves grouping variables using weight of evidence, and then performing logistic regression to produce predicted probabilities. This paper demonstrates how to use the same tools to build binned variable scorecards for Loss Given Default, explains the theoretical principles behind the method, and uses actual data to show how it was done.

See other SAS Credit Scoring technical papers.

Data Mining

2013 Papers

Big Data Analytics: Benchmarking SAS, R, and Mahout
Ames, Allison J.; Abbey, Ralph; Thompson, Wayne; SAS Institute, Inc. 2013
This paper benchmarks SAS and open-source products to analyze big data by modeling four classification problems from real customers. The products that were benchmarked are SAS Rapid Predictive Modeler (a component of SAS Enterprise Miner), SAS High-Performance Analytics Server (using Hadoop), R and Apache Mahout. Results were compared in terms of model quality, modeler effort, scalability and completeness.
Scalability of the SAS/STAT HPGENSELECT High-Performance Analytical Procedure: A Comparison with RevoScaleR
Thompson, Wayne; Ames, Jennifer; Ho, Dright; SAS Institute, Inc. 2013
This paper compares the performance of the HPGENSELECT procedure with results cited for the RevoScaleR package by using data that are similar to the insurer's data. The paper also demonstrates the scalability of the HPGENSELECT procedure by using two sizes of data sets and three different computing environments.

2012 Papers

A New Age of Data Mining in the High-Performance World
Dean, Jared; Duling, David; Thompson, Wayne; SAS Institute, Inc. 2012
This paper discusses the options and methods available for use in High-Performance Data Mining and uses real data for performance benchmarks.

SAS Factory Miner

2016 Papers

Mass-Scale, Automated Machine Learning and Model Deployment Using SAS Factory Miner and SAS Decision Manager
Wexler, Jonathan; Sparano, Steve; SAS Institute, Inc. 2016
This paper shows you how to identify hundreds of champion models using SAS Factory Miner, while generating scoring web services using SAS Decision Manager.

Text Mining

2016 Papers

Extending the Armed Conflict Location and Event Data Project with SAS Contextual Analysis
Sabo, Tom; SAS Institute, Inc. 2016
This paper showcases a repeatable combination of exploratory and classification-based text analytics provided by SAS Contextual Analysis, applied to the publicly available ACLED for African states.
Getting More from the Singular Value Decomposition (SVD): Enhance Your Models with Document, Sentence, and Term Representations
Albright, Russell; Cox, James; Jin, Ning; SAS Institute, Inc. 2016
In this presentation, using SAS code and SAS Text Miner, we compare supervised and unsupervised models with those that are based on SVD representations of subcomponents of documents.

2015 Papers

Using Boolean Rule Extraction for Taxonomic Text Categorization for Big Data
Zhao, Zheng; Albright, Russ; Cox, James; Jin, Ning; SAS Institute, Inc. 2015
This paper demonstrates how to use SAS Text Miner macros and procedures to obtain effective predictive models at all hierarchy levels in a taxonomy.

2014 Papers

Analysis of Unstructured Data: Applications of Text Analytics and Sentiment Mining
Chakraborty, Goutam; Oklahoma State University; Pagolu, Murali Krishna; SAS Institute, Inc. 2014
This paper takes a quick look at how to organize and analyze textual data for extracting insightful customer intelligence from a large collection of documents and for using such information to improve business operations and performance.
Automatic Detection of Section Membership for SAS Conference Paper Abstract Submissions: A Case Study
Chakraborty, Goutam; Oklahoma State University; Pagolu, Murali Krishna; SAS Institute, Inc. 2014
SAS Text Miner 12.1 and SAS Content Categorization Studio 12.1 are used to develop a rule-based categorization model. This model is then used to automatically score a paper abstract and identify the most relevant and appropriate conference sections to submit to for a better chance of acceptance.
How to Interpret SVD Units in Predictive Models?
Pagolu, Murali Krishna; SAS Institute Inc.; Chakraborty, Goutam Dr.; Oklahoma State University, 2014
This paper demonstrates a new and powerful feature in SAS Text Miner 12.1 that helps explain the SVDs or the text cluster components. It also discusses two important methods for interpreting them.
Processing and Storing Sparse Data in SAS Using SAS Text Miner Procedures
Zhao, Zheng; Albright, Russell; Cox, James; SAS Institute Inc., 2014
This paper demonstrates how to use SAS Text Miner procedures to process sparse data sets and generate output data sets that are easy to store and can be readily processed by traditional SAS modeling procedures.

See other Text Mining technical papers.
