SAS In-Memory Statistics for Hadoop Papers A-Z

E
Session SAS2746-2016:
Everyone CAN be a Data Scientist: Using SAS® Studio to Create a Custom Task for SAS® In-Memory Statistics
SAS® In-Memory Statistics uses a powerful interactive programming interface for analytics, aimed squarely at the data scientist. We show how the custom tasks that you can create in SAS® Studio (a web-based programming interface) can make everyone a data scientist! We explain the Common Task Model of SAS Studio, and we build, step by step, a simple task that carries out the basic functionality of the IMSTAT procedure. This task can then be shared among all users, empowering everyone on their journey to becoming a data scientist. During the presentation, it will become clear that not only can shareable tasks be created, but the developer also does not need to know how to code in Java, JavaScript, or ActionScript. We also use the task that we created in the Visual Programming perspective in SAS Studio.
Read the paper (PDF)
Stephen Ludlow, SAS
H
Session 10940-2016:
How to Move Data among Client Hard Disk, the Hadoop File System, and SAS® LASR™ Analytic Server
In SAS® LASR™ Analytic Server, data can reside in three types of environments: client hard disk (for example, a laptop), the Hadoop Distributed File System (HDFS), and the memory of the SAS LASR Analytic Server. Moving data efficiently among these environments is critical for getting timely insights from the data. In this paper, we illustrate all the possible ways to move the data, including 1) moving data from client hard disk to HDFS; 2) moving data from HDFS to client hard disk; 3) moving data from HDFS to SAS LASR Analytic Server; 4) moving data from SAS LASR Analytic Server to HDFS; 5) moving data from client hard disk to SAS LASR Analytic Server; and 6) moving data from SAS LASR Analytic Server to client hard disk.
Read the paper (PDF) | Watch the recording
Yue Qi, SAS
P
Session 11780-2016:
PROC IMSTAT Boosts Knowledge Discovery in Big-Database (KDBD) in a Pharmaceutical Company
In recent years, big data has been in the limelight as a solution for business issues, and big data mining has begun to be implemented in a variety of industries. The variety of data types and the speed at which data accumulates have been astonishing, spanning structured data stored in relational databases and unstructured data (for example, text data, GPS data, image data, and so on). In the pharmaceutical industry, big data means real-world data such as electronic health records, genomics data, medical imaging data, social network data, and so on. Handling these types of big data often requires a special infrastructure environment for statistical computing. Our presentation covers case study 1: implementing IMSTAT as a large-scale parallel computation environment; converting a business issue into a data science issue in pharma; case study 2: data handling and machine learning for vertical and horizontal big data by using PROC IMSTAT; the importance of integrating analysis results; and points of caution in big data mining.
Read the paper (PDF)
Yoshitake Kitanishi, Shionogi & Co., Ltd.
Ryo Kiguchi, Shionogi & Co., Ltd.
Akio Tsuji, Shionogi & Co., Ltd.
Hideaki Watanabe, Shionogi & Co., Ltd.
U
Session 12580-2016:
Understanding the Value of SAS® Machine Learning: Using Live-Streaming Data Facilitated by SAS Streaming Analytics with Intel Technology
Machine learning is gaining popularity, fueled by the rapid advancement of computing technology. Machine learning offers the ability to continuously adjust predictive models based on real-time data, and to visualize changes and take action on the new information. Hear from two PhDs from SAS and Intel about how SAS® machine learning, working with SAS streaming analytics and Intel microprocessor technology, makes this possible.
Read the paper (PDF)