SAS® In-Memory Statistics uses a powerful interactive programming interface for analytics, aimed squarely at the data scientist. We show how the custom tasks that you can create in SAS® Studio (a web-based programming interface) can make everyone a data scientist! We explain the Common Task Model of SAS Studio, and we build, step by step, a simple task that carries out the basic functionality of the IMSTAT procedure. This task can then be shared among all users, empowering everyone on their journey to becoming a data scientist. During the presentation, it will become clear that not only can shareable tasks be created, but the developer does not need to know how to code in Java, JavaScript, or ActionScript. Finally, we use the task we created from within the Visual Programming perspective in SAS Studio.
Stephen Ludlow, SAS
In SAS® LASR™ Analytic Server, data can reside in three types of environments: the client hard disk (for example, on a laptop), the Hadoop Distributed File System (HDFS), and the memory of the SAS LASR Analytic Server. Moving data efficiently among these environments is critical for getting timely insights from the data. In this paper, we illustrate all the possible ways to move the data: 1) moving data from client hard disk to HDFS; 2) moving data from HDFS to client hard disk; 3) moving data from HDFS to SAS LASR Analytic Server; 4) moving data from SAS LASR Analytic Server to HDFS; 5) moving data from client hard disk to SAS LASR Analytic Server; and 6) moving data from SAS LASR Analytic Server to client hard disk.
Yue Qi, SAS
In recent years, big data has been in the limelight as a solution for business issues, and big data mining has begun to be implemented in a variety of industries. The variety of data types and the velocity at which data is growing have been astonishing; the data ranges from structured data stored in relational databases to unstructured data (for example, text data, GPS data, and image data). In the pharmaceutical industry, big data means real-world data such as electronic health records, genomics data, medical imaging data, and social network data. Handling these types of big data often requires a special infrastructure environment for statistical computing. Our presentation covers the following: case study 1, IMSTAT implementation as a large-scale parallel computation environment; conversion of a business issue into a data science issue in pharma; case study 2, data handling and machine learning for vertical and horizontal big data by using PROC IMSTAT; the importance of integrating analysis results; and points of caution in big data mining.
Yoshitake Kitanishi, Shionogi & Co., Ltd.
Ryo Kiguchi, Shionogi & Co., Ltd.
Akio Tsuji, Shionogi & Co., Ltd.
Hideaki Watanabe, Shionogi & Co., Ltd.
Machine learning is gaining popularity, fueled by the rapid advancement of computing technology. Machine learning offers the ability to continuously adjust predictive models based on real-time data, to visualize changes, and to take action on the new information. Hear from two PhDs from SAS and Intel about how SAS® machine learning, working with SAS streaming analytics and Intel microprocessor technology, makes this possible.