SAS High Performance Analytics Papers A-Z

B
Session SAS2140-2016:
Best Practices for Resource Management in Hadoop
SAS® solutions that run in Hadoop provide you with the best tools to transform data in Hadoop. They also provide insights to help you make the right decisions for your business. SAS products and solutions can be incorporated into your shared Hadoop cluster in a cooperative fashion, with YARN managing the resources. Best practices and customer examples are provided to show how to build and manage a shared cluster with SAS applications and products.
Read the paper (PDF) | Watch the recording
James Kochuba, SAS
D
Session 11670-2016:
Do SAS® High-Performance Statistics Procedures Really Perform Highly? A Comparison of HP and Legacy Procedures
The new SAS® High-Performance Statistics procedures were developed to respond to the growth of big data and computing capabilities. Although there is some documentation regarding how to use these new high-performance (HP) procedures, relatively little has been disseminated about the specific conditions under which users can expect performance improvements. This paper serves as a practical guide to getting started with HP procedures in SAS®. The paper describes the differences between key HP procedures (HPGENSELECT, HPLMIXED, HPLOGISTIC, HPNLMOD, HPREG, HPCORR, HPIMPUTE, and HPSUMMARY) and their legacy counterparts in terms of both capability and performance, with a particular focus on differences in the real time required to execute. To create these comparisons, simulations were conducted to generate data sets that varied in the number of observations (10,000, 50,000, 100,000, 500,000, 1,000,000, and 10,000,000) and the number of variables (50, 100, 500, and 1,000).
Read the paper (PDF) | View the e-poster or slides (PDF)
Diep Nguyen, University of South Florida
Sean Joo, University of South Florida
Anh Kellermann, University of South Florida
Jeff Kromrey, University of South Florida
Jessica Montgomery, University of South Florida
Patricia Rodríguez de Gil, University of South Florida
Yan Wang, University of South Florida
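As a rough illustration of the simulation design described in the abstract for Session 11670-2016, the sketch below enumerates the data set sizes that result if the listed observation counts and variable counts are fully crossed. The full crossing is an assumption (the abstract does not say whether every combination was run), and the function name `simulation_conditions` is hypothetical, not taken from the paper.

```python
from itertools import product

# Levels quoted in the abstract.
N_OBSERVATIONS = [10_000, 50_000, 100_000, 500_000, 1_000_000, 10_000_000]
N_VARIABLES = [50, 100, 500, 1_000]


def simulation_conditions():
    """Return every (observations, variables) pair.

    Assumption: the design is fully crossed, so each observation
    count is paired with each variable count.
    """
    return list(product(N_OBSERVATIONS, N_VARIABLES))


conditions = simulation_conditions()
print(len(conditions))  # 6 observation levels x 4 variable levels = 24

# The largest condition by cell count (rows x columns).
largest = max(conditions, key=lambda c: c[0] * c[1])
print(largest)
```

Under this assumption, the comparison between HP and legacy procedures would span 24 data set sizes, from 10,000 observations by 50 variables up to 10,000,000 observations by 1,000 variables.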
L
Session SAS4400-2016:
Leveraging Advanced Analytics in Pricing and Inventory Decisions at a Major Durable Goods Company
As a result of globalization, the durable goods market has become increasingly competitive, with market conditions that challenge profitability for manufacturers. Moreover, high material costs and the capital-intensive nature of the industry make it essential that companies understand demand signals and utilize supply chain capacity as effectively as possible. To grow and increase profitability under these challenging market conditions, a major durable goods company has partnered with SAS to streamline analysis of pricing and profitability, optimize inventory, and improve service levels to its customers. The price of a product is determined by a number of factors, such as the strategic importance of customers, supply chain costs, market conditions, and competitive prices. Offering promotions is an important part of a marketing strategy; it impacts purchasing behaviors of business customers and end consumers. This paper describes how this company developed a system to analyze product profitability and the impact of promotion on purchasing behaviors of both their business customers and end consumers. This paper also discusses how this company uses integrated demand planning and inventory optimization to manage its complex multi-echelon supply chain. The process uses historical order data to create a statistical forecast of demand, and then optimizes inventory across the supply chain to satisfy the forecast at desired service levels.
Read the paper (PDF)
Varunraj Valsaraj, SAS
Bahadir Aral, SAS Institute Inc
Baris Kacar, SAS Institute Inc
Jinxin Yi, SAS Institute Inc
M
Session SAS6344-2016:
Mass-Scale, Automated Machine Learning and Model Deployment Using SAS® Factory Miner and SAS® Decision Manager
Business problems have become more stratified and micro-segmentation is driving the need for mass-scale, automated machine learning solutions. Additionally, deployment environments include diverse ecosystems, requiring hundreds of models to be built and deployed quickly via web services to operational systems. The new SAS® automated modeling tool allows you to build and test hundreds of models across all of the segments in your data, testing a wide variety of machine learning techniques. The tool is completely customizable, allowing you transparent access to all modeling results. This paper shows you how to identify hundreds of champion models using SAS® Factory Miner, while generating scoring web services using SAS® Decision Manager. Immediate benefits include efficient model deployments, which allow you to spend more time generating insights that might reveal new opportunities, expose hidden risks, and fuel smarter, well-timed decisions.
Read the paper (PDF)
Jonathan Wexler, SAS
Steve Sparano, SAS
U
Session SAS5244-2016:
Unleashing High-Performance Risk Data with the Hadoop Custom File Reader
SAS® High-Performance Risk distributes financial risk data and big data portfolios with complex analyses across a networked Hadoop Distributed File System (HDFS) grid to support rapid in-memory queries for hundreds of simultaneous users. This data is extremely complex and must be stored in a proprietary format to guarantee data affinity for rapid access. However, customers still desire the ability to view and process this data directly. This paper demonstrates how to use the HPRISK custom file reader to directly access risk data in Hadoop MapReduce jobs, using the HPDS2 procedure and the LASR procedure.
Read the paper (PDF) | Download the data file (ZIP)
Mike Whitcher, SAS
Stacey Christian, SAS
Phil Hanna, SAS Institute
Don McAlister, SAS
W
Session 9960-2016:
Working with Big Data in Near Real Time Using SAS® Event Stream Processing
In the world of big data, real-time processing and event stream processing are becoming the norm. However, there are not many tools available today that can do this type of processing. SAS® Event Stream Processing aims to process this data. In this paper, we look at using SAS Event Stream Processing to read multiple data sets stored in big data platforms such as Hadoop and Cassandra in real time and to perform transformations on the data such as joining data sets, filtering data based on preset business rules, and creating new variables as required. We look at how we can score the data based on a machine learning algorithm. This paper shows you how to use the provided Hadoop Distributed File System (HDFS) publisher and subscriber to read and push data to Hadoop. The HDFS adapter is discussed in detail. We look at the Streamviewer to see how data flows through SAS Event Stream Processing.
View the e-poster or slides (PDF)
Krishna Sai Kishore Konudula, Kavi Associates