If you want to co-locate the SAS High-Performance Analytics environment with a pre-existing IBM BigInsights Hadoop cluster, you can modify your cluster with files from the SAS Plug-ins for Hadoop package. BigInsights modified with SAS Plug-ins for Hadoop enables the SAS High-Performance Analytics environment to write SASHDAT file blocks evenly across the HDFS file system.
- Untar the SAS Plug-ins for Hadoop tarball, and propagate five files (identified in the following steps) to every machine in your BigInsights Hadoop cluster:
- Navigate to the SAS Plug-ins for Hadoop tarball in your SAS Software depot:
  cd depot-installation-location/standalone_installs/SAS_Plug-ins_for_Hadoop/1_0/Linux_for_x64/
- Copy hdatplugins.tar.gz to a temporary location where you have write access.
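  For example, assuming /tmp as the temporary location (this matches the paths used in the simcp examples below):
  cp hdatplugins.tar.gz /tmp
  cd /tmp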
- Untar hdatplugins.tar.gz:
  tar xzf hdatplugins.tar.gz
  The hdatplugins directory is created.
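  As a quick check, the directory should contain, at minimum, the five files propagated in the steps below:
  ls hdatplugins
  # sas.lasr.jar  sas.lasr.hadoop.jar  sas.grid.provider.yarn.jar
  # saslasrfd  SAS_VERSION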
- Propagate the following three JAR files in hdatplugins into the library path on every machine in the BigInsights cluster (if simcp is not available, see the scp sketch after the tip below):
  - sas.lasr.jar
  - sas.lasr.hadoop.jar
  - sas.grid.provider.yarn.jar
Note: The default location for HADOOP_HOME is /opt/ibm/biginsights/IHC. The default location for BIGINSIGHTS_HOME is /opt/ibm/biginsights.
Tip: If you have already installed the SAS High-Performance Computing Management Console or the SAS High-Performance Analytics environment, you can issue a single simcp command per file to propagate the JAR files across all machines in the cluster. For example:
  /opt/TKGrid/bin/simcp /tmp/hdatplugins/sas.lasr.jar $HADOOP_HOME/share/hadoop/hdfs/lib
  /opt/TKGrid/bin/simcp /tmp/hdatplugins/sas.lasr.hadoop.jar $HADOOP_HOME/share/hadoop/hdfs/lib
  /opt/TKGrid/bin/simcp /tmp/hdatplugins/sas.grid.provider.yarn.jar $HADOOP_HOME/share/hadoop/hdfs/lib
For more information, see Simultaneous Utilities Commands.
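If neither product is installed yet, an ordinary scp loop accomplishes the same propagation. This is a minimal sketch, assuming a hypothetical hosts.txt file that lists one cluster host name per line and that HADOOP_HOME resolves to the same path on every machine:
  # hosts.txt is an assumption for illustration, not part of the SAS tooling
  for host in $(cat hosts.txt); do
      for jar in sas.lasr.jar sas.lasr.hadoop.jar sas.grid.provider.yarn.jar; do
          scp /tmp/hdatplugins/$jar $host:$HADOOP_HOME/share/hadoop/hdfs/lib/
      done
  done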
- Propagate saslasrfd in hdatplugins into the $HADOOP_HOME/bin directory on every machine in the BigInsights cluster. For example:
  /opt/TKGrid/bin/simcp saslasrfd $HADOOP_HOME/bin
- Propagate SAS_VERSION in hdatplugins to the $HADOOP_HOME directory on every machine in the BigInsights cluster.
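  For example, following the simcp pattern shown above (again assuming the files were staged in /tmp):
  /opt/TKGrid/bin/simcp /tmp/hdatplugins/SAS_VERSION $HADOOP_HOME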
- On the machine where you initially installed BigInsights, add the following SAS properties to the HDFS configuration file $BIGINSIGHTS_HOME/hdm/hadoop-conf-staging/hdfs-site.xml. Adjust the values appropriately for your deployment:
<property>
<name>dfs.namenode.plugins</name>
<value>com.sas.lasr.hadoop.NameNodeService</value>
</property>
<property>
<name>dfs.datanode.plugins</name>
<value>com.sas.lasr.hadoop.DataNodeService</value>
</property>
<property>
<name>com.sas.lasr.service.allow.put</name>
<value>true</value>
</property>
<property>
<name>com.sas.lasr.hadoop.service.namenode.port</name>
<value>15452</value>
</property>
<property>
<name>com.sas.lasr.hadoop.service.datanode.port</name>
<value>15453</value>
</property>
<property>
<name>dfs.namenode.fs-limits.min-block-size</name>
<value>0</value>
</property>
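Before synchronizing, you can optionally confirm that the edited file is still well-formed XML (xmllint ships with libxml2 but might not be installed on every system):
  xmllint --noout $BIGINSIGHTS_HOME/hdm/hadoop-conf-staging/hdfs-site.xml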
- Synchronize this new configuration by running the following command on the machine where you initially deployed BigInsights:
  $BIGINSIGHTS_HOME/bin/syncconf.sh
- On the machine where you initially deployed BigInsights, log on as the biadmin user and run the following commands to restart the cluster with the new configuration:
  stop-all.sh
  start-all.sh
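  After the restart, one optional way to confirm that the SAS plug-in services started is to check for listeners on the ports configured in hdfs-site.xml above (run on the NameNode host for 15452 and on a DataNode host for 15453):
  netstat -ln | grep -E ':1545[23]'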
- Note the location of HADOOP_HOME. You will need to refer to this value when you install the SAS High-Performance Analytics environment.