Modify the Existing IBM BigInsights Hadoop Cluster

If you want to co-locate the SAS High-Performance Analytics environment with a pre-existing IBM BigInsights Hadoop cluster, you can modify your cluster with files from the SAS Plug-ins for Hadoop package. A BigInsights cluster modified with SAS Plug-ins for Hadoop enables the SAS High-Performance Analytics environment to distribute SASHDAT file blocks evenly across HDFS.
  1. Untar the SAS Plug-ins for Hadoop tarball, and propagate five files (identified in the following steps) to every machine in your BigInsights Hadoop cluster:
    1. Navigate to the SAS Plug-ins for Hadoop tarball in your SAS Software depot:
      cd depot-installation-location/standalone_installs/SAS_Plug-ins_for_Hadoop/1_0/Linux_for_x64/
    2. Copy hdatplugins.tar.gz to a temporary location where you have Write access.
    3. Untar hdatplugins.tar.gz:
      tar xzf hdatplugins.tar.gz
      The hdatplugins directory is created.
    4. Propagate the following three JAR files in hdatplugins into the library path on every machine in the BigInsights cluster:
      • sas.lasr.jar
      • sas.lasr.hadoop.jar
      • sas.grid.provider.yarn.jar
      Note: The default location for HADOOP_HOME is /opt/ibm/biginsights/IHC. The default location for BIGINSIGHTS_HOME is /opt/ibm/biginsights.
      Tip
      If you have already installed the SAS High-Performance Computing Management Console or the SAS High-Performance Analytics environment, you can issue a single simcp command to propagate JAR files across all machines in the cluster. For example:
           /opt/TKGrid/bin/simcp /tmp/hdatplugins/sas.lasr.jar $HADOOP_HOME/share/hadoop/hdfs/lib
           /opt/TKGrid/bin/simcp /tmp/hdatplugins/sas.lasr.hadoop.jar $HADOOP_HOME/share/hadoop/hdfs/lib
           /opt/TKGrid/bin/simcp /tmp/hdatplugins/sas.grid.provider.yarn.jar $HADOOP_HOME/share/hadoop/hdfs/lib
       
      For more information, see Simultaneous Utilities Commands.
    5. Propagate saslasrfd in hdatplugins into the $HADOOP_HOME/bin directory on every machine in the BigInsights cluster. For example:
      /opt/TKGrid/bin/simcp saslasrfd $HADOOP_HOME/bin
  2. Propagate SAS_VERSION in hdatplugins to the $HADOOP_HOME directory on every machine in the BigInsights cluster.
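The propagation in steps 1.4, 1.5, and 2 can be sketched as a single script. The staging path /tmp/hdatplugins, the TKGrid location, and the dry-run fallback are illustrative assumptions; adjust them for your site:

```shell
# Sketch: propagate all five SAS Plug-ins for Hadoop files with simcp.
# Assumes the tarball was extracted to /tmp/hdatplugins (step 1.3) and that
# TKGrid is installed at /opt/TKGrid. When simcp is not present on this
# machine, the script falls back to a dry run that only echoes the commands.
SIMCP=/opt/TKGrid/bin/simcp
[ -x "$SIMCP" ] || SIMCP="echo simcp"       # dry-run fallback (assumption)

SRC=/tmp/hdatplugins                        # staging directory (assumption)
HH=${HADOOP_HOME:-/opt/ibm/biginsights/IHC} # default BigInsights HADOOP_HOME

# Step 1.4: the three JAR files go into the HDFS library path.
for jar in sas.lasr.jar sas.lasr.hadoop.jar sas.grid.provider.yarn.jar; do
  $SIMCP "$SRC/$jar" "$HH/share/hadoop/hdfs/lib"
done

# Step 1.5: saslasrfd goes into $HADOOP_HOME/bin.
$SIMCP "$SRC/saslasrfd" "$HH/bin"

# Step 2: SAS_VERSION goes into $HADOOP_HOME itself.
$SIMCP "$SRC/SAS_VERSION" "$HH"
```

Running the script on a machine without TKGrid prints the commands it would issue, which is a convenient way to review paths before propagating.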
  3. On the machine where you initially installed BigInsights, add the following properties for SAS for the HDFS configuration to the file $BIGINSIGHTS_HOME/hdm/hadoop-conf-staging/hdfs-site.xml. Adjust values appropriately for your deployment:
    <property>
    <name>dfs.namenode.plugins</name>
    <value>com.sas.lasr.hadoop.NameNodeService</value>
    </property>
    <property>
    <name>dfs.datanode.plugins</name>
    <value>com.sas.lasr.hadoop.DataNodeService</value>
    </property>
    <property>
    <name>com.sas.lasr.service.allow.put</name>
    <value>true</value>
    </property>
    <property>
    <name>com.sas.lasr.hadoop.service.namenode.port</name>
    <value>15452</value>
    </property>
    <property>
    <name>com.sas.lasr.hadoop.service.datanode.port</name>
    <value>15453</value>
    </property>
    <property>
    <name>dfs.namenode.fs-limits.min-block-size</name>
    <value>0</value>
    </property>
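After editing hdfs-site.xml, you can read the values back with standard tools as a quick check. The following sketch works on a scratch copy of the two port properties; on the cluster you would point CONF at $BIGINSIGHTS_HOME/hdm/hadoop-conf-staging/hdfs-site.xml. The grep/sed approach is an illustration, not a SAS-supplied utility:

```shell
# Sanity-check sketch: extract the SAS service ports from hdfs-site.xml.
# CONF is a scratch copy here; on the cluster, point it at the real file.
# Note: a stray space inside a <name> element prevents the property from
# matching (and from being recognized by Hadoop), so this also catches typos.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<property>
<name>com.sas.lasr.hadoop.service.namenode.port</name>
<value>15452</value>
</property>
<property>
<name>com.sas.lasr.hadoop.service.datanode.port</name>
<value>15453</value>
</property>
EOF

get_prop() {
  # Print the <value> on the line following the matching <name> element.
  grep -A1 "<name>$1</name>" "$CONF" | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

nn_port=$(get_prop com.sas.lasr.hadoop.service.namenode.port)
dn_port=$(get_prop com.sas.lasr.hadoop.service.datanode.port)
echo "NameNode service port: $nn_port"   # 15452
echo "DataNode service port: $dn_port"   # 15453
```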
    
  4. Synchronize this new configuration by running the following command on the machine where you initially deployed BigInsights:
    $BIGINSIGHTS_HOME/bin/syncconf.sh
  5. On the machine where you initially deployed BigInsights, log on as the biadmin user and run the following commands to restart the cluster with the new configuration:
    stop-all.sh
    start-all.sh
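Once the cluster is back up, you can check that the SAS NameNode and DataNode services are listening on the ports configured in step 3 (15452 and 15453 in this example). This sketch uses bash's /dev/tcp redirection, which is an assumption about your shell; netstat or ss would work equally well:

```shell
# Post-restart sanity-check sketch: probe the SAS service ports from step 3.
# Uses bash's /dev/tcp virtual paths (assumption: bash is available). On a
# machine where the services are not running, each port reports "not listening".
check_port() {
  # Succeeds only if something accepts a TCP connection on host $1, port $2.
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

for port in 15452 15453; do
  if check_port localhost "$port"; then
    echo "port $port: listening"
  else
    echo "port $port: not listening"
  fi
done
```

Run the check on each machine in the cluster (or through a simultaneous utility such as simsh) since the DataNode service must be up on every node.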
  6. Note the location of HADOOP_HOME. You will need to refer to this value when installing the SAS High-Performance Analytics environment.
  7. If you are deploying SAS Visual Analytics, see Hadoop Configuration Step for SAS Visual Analytics.
Last updated: June 19, 2017