About SAS In-Memory Statistics for Hadoop

SAS In-Memory Statistics for Hadoop is an offering that provides the data scientist or analytical expert with interactive programming access to in-memory data and integrates seamlessly with Hadoop.
To use the offering, the following conditions must be met:
  • You are using a distributed SAS LASR Analytic Server only.
  • The SAS LASR Analytic Server is co-located with SAS High-Performance Deployment of Hadoop or a commercial Hadoop distribution that has been configured with the services from SAS High-Performance Deployment of Hadoop. The services enable you to use the SASHDAT file format for storing tables in HDFS.
  • SAS/ACCESS Interface to Hadoop is configured on a client machine that you use for submitting SAS programs. Be sure to install the SAS Embedded Process on the machines in the Hadoop cluster. The SAS/ACCESS engine, the embedded process, and the HDMD procedure enable you to describe your data that is in Hadoop and access it directly without an intermediate metadata repository such as Hive.
  • SAS Studio provides an interactive web-based development application that enables you to write and submit SAS programs. Make sure that your user ID is configured for passwordless SSH to the machines in the cluster. Also make sure that you have passwordless SSH access from the machine that hosts SAS Studio to the machines in the cluster. For more information, see Passwordless SSH.
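
The passwordless SSH requirement in the last item can be spot-checked before you submit any SAS programs. The following is a minimal sketch in Python, not part of the SAS offering. It assumes that the standard OpenSSH client (ssh) is on the path of the machine that hosts SAS Studio. The function names, the host name grid001.example.com, and the user ID sasdemo are hypothetical and are used for illustration only. The check relies on the OpenSSH BatchMode option, which disables password prompts, so the command succeeds only if key-based login already works.

```python
import subprocess


def ssh_check_command(host, user=None):
    """Build an ssh command that succeeds only with passwordless login.

    BatchMode=yes makes ssh fail instead of prompting for a password.
    The remote command `true` does nothing and exits with status 0.
    """
    target = f"{user}@{host}" if user else host
    return [
        "ssh",
        "-o", "BatchMode=yes",
        "-o", "ConnectTimeout=5",
        target,
        "true",
    ]


def has_passwordless_ssh(host, user=None, runner=subprocess.run):
    """Return True if `host` accepts a login without a password prompt.

    `runner` is injectable so that the check can be exercised without
    a real cluster; by default it is subprocess.run.
    """
    cmd = ssh_check_command(host, user)
    try:
        result = runner(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=15,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is not installed, or the host did not answer in time.
        return False
    return result.returncode == 0
```

For example, calling has_passwordless_ssh("grid001.example.com", user="sasdemo") from the machine that hosts SAS Studio, once for each machine in the cluster, reports whether that user ID can log on without a password. A result of False indicates only that the check failed; see Passwordless SSH for the configuration steps.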