Some solutions, such as SAS Visual Analytics, rely on a SAS data store that is co-located with the SAS High-Performance Analytics environment on the analytics cluster.
The following figure shows the analytics environment co-located on a pre-existing Hadoop cluster. In this document, the NameNode is deployed on blade 0.
The SAS Plug-ins for Hadoop component provides services to your supported, pre-existing Hadoop distribution that enable the SAS High-Performance Analytics environment to write SASHDAT file blocks evenly across HDFS. This even distribution balances the workload across the machines in the cluster and enables SAS analytic processes to read SASHDAT tables at high rates.
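To illustrate why an even spread of blocks balances the workload, here is a minimal sketch that assigns blocks to machines in round-robin order and counts the blocks on each machine. The round-robin rule and the names `assign_blocks_round_robin`, `blocks_per_node`, and `blade0` through `blade3` are assumptions made for this example. This document does not describe the placement strategy that the SAS Plug-ins for Hadoop actually use.

```python
from collections import Counter


def assign_blocks_round_robin(num_blocks, nodes):
    """Assign block i to nodes[i % len(nodes)].

    Illustrative only: round-robin is an assumed placement rule,
    not the documented behavior of the SAS Plug-ins for Hadoop.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    return [nodes[i % len(nodes)] for i in range(num_blocks)]


def blocks_per_node(assignment):
    """Count how many blocks landed on each node."""
    return Counter(assignment)


if __name__ == "__main__":
    # Hypothetical four-machine cluster.
    nodes = ["blade0", "blade1", "blade2", "blade3"]
    counts = blocks_per_node(assign_blocks_round_robin(10, nodes))
    for node in nodes:
        print(node, counts[node])
```

With this rule, the block counts on any two machines differ by at most one, so a parallel read of the whole table asks each machine for roughly the same amount of data.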
With the exception of MapR Hadoop, modifying your supported Hadoop cluster for the analytics environment consists of the following steps:
- Make sure that your Hadoop distribution meets the requirements.
- Copy the SAS Plug-ins for Hadoop package to a temporary location and untar it.
- Propagate several files from the package to specific Hadoop directories on every machine in your cluster.
- Restart the HDFS service and any dependencies.
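The copy, untar, and propagate steps above can be sketched as a dry-run planner that only builds the command strings and runs nothing. This is a sketch under stated assumptions: the function name `plan_plugin_rollout`, the package name, the file names, the target directories, and the host names are all hypothetical placeholders, and the actual files and Hadoop directories depend on your distribution. Checking the requirements and restarting HDFS are left as manual steps because they are distribution-specific.

```python
import shlex


def plan_plugin_rollout(package, staging_dir, files_to_dirs, hosts):
    """Return the shell commands for a rollout, in order, without running them.

    All arguments are placeholders supplied by the caller; nothing here
    reflects the real contents of the SAS Plug-ins for Hadoop package.
    """
    commands = []

    # Unpack the package in a temporary location.
    commands.append(f"mkdir -p {shlex.quote(staging_dir)}")
    commands.append(
        f"tar -xf {shlex.quote(package)} -C {shlex.quote(staging_dir)}"
    )

    # Copy each file to its target directory on every machine.
    for host in hosts:
        for filename, target_dir in files_to_dirs.items():
            source = f"{staging_dir}/{filename}"
            destination = f"{host}:{target_dir}/"
            commands.append(
                f"scp {shlex.quote(source)} {shlex.quote(destination)}"
            )

    # The restart is distribution-specific, so only a reminder is recorded.
    commands.append("# restart the HDFS service and any dependencies")
    return commands


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    plan = plan_plugin_rollout(
        package="plugins-package.tar",
        staging_dir="/tmp/plugins-staging",
        files_to_dirs={"example-plugin.jar": "/opt/hadoop/lib"},
        hosts=["blade0", "blade1"],
    )
    print("\n".join(plan))
```

Reviewing the printed plan before running anything makes it easier to confirm that every machine in the cluster receives the same files.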
The SAS High-Performance Analytics infrastructure supports the following Hadoop distributions:
For more information about SAS and Hadoop, see: