Modify the Existing Hortonworks Data Platform Hadoop Cluster

If you want to co-locate the SAS High-Performance Analytics environment with an existing Hortonworks Data Platform (HDP) cluster, you can modify the cluster with files from the SAS Plug-ins for Hadoop package. Modifying HDP with SAS Plug-ins for Hadoop enables the SAS High-Performance Analytics environment to write SASHDAT file blocks evenly across HDFS.
  1. Log on to Ambari as an administrator, and stop all HDP services.
  2. Untar the SAS Plug-ins for Hadoop tarball, and propagate five files (identified in this step and the next) to every machine in your HDP cluster:
    1. Navigate to the SAS Plug-ins for Hadoop tarball in your SAS Software depot:
      cd depot-installation-location/standalone_installs/SAS_Plug-ins_for_Hadoop/1_0/Linux_for_x64/
    2. Copy hdatplugins.tar.gz to a temporary location where you have write access.
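      For example, assuming /tmp is the temporary location:
        cp depot-installation-location/standalone_installs/SAS_Plug-ins_for_Hadoop/1_0/Linux_for_x64/hdatplugins.tar.gz /tmp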
    3. Untar hdatplugins.tar.gz:
      tar xzf hdatplugins.tar.gz
      The hdatplugins directory is created.
    4. Propagate the following three JAR files from the hdatplugins directory into the HDP library path on every machine in the HDP cluster:
      • sas.lasr.jar
      • sas.lasr.hadoop.jar
      • sas.grid.provider.yarn.jar
      Tip
      If you have already installed the SAS High-Performance Computing Management Console or the SAS High-Performance Analytics environment, you can use the simcp command to propagate each JAR file across all machines in the cluster. For example:
           /opt/TKGrid/bin/simcp /tmp/hdatplugins/sas.lasr.jar /usr/hdp/2.2.0.0-2041/hadoop/lib
           /opt/TKGrid/bin/simcp /tmp/hdatplugins/sas.lasr.hadoop.jar /usr/hdp/2.2.0.0-2041/hadoop/lib
           /opt/TKGrid/bin/simcp /tmp/hdatplugins/sas.grid.provider.yarn.jar /usr/hdp/2.2.0.0-2041/hadoop/lib
      For more information, see Simultaneous Utilities Commands.
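      If simcp is not available, a minimal alternative sketch is shown below (it assumes passwordless SSH and a hosts.txt file, hypothetical here, that lists one cluster host name per line):
           # copy the three JAR files to the HDP library path on each host in hosts.txt
           while read host; do
               scp /tmp/hdatplugins/sas.lasr.jar \
                   /tmp/hdatplugins/sas.lasr.hadoop.jar \
                   /tmp/hdatplugins/sas.grid.provider.yarn.jar \
                   "$host":/usr/hdp/2.2.0.0-2041/hadoop/lib/
           done < hosts.txt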
    5. Propagate saslasrfd from the hdatplugins directory into the HDP bin directory on every machine in the HDP cluster. For example:
      /opt/TKGrid/bin/simcp /tmp/hdatplugins/saslasrfd /usr/hdp/2.2.0.0-2041/hadoop/bin/
  3. Propagate SAS_VERSION in hdatplugins to the $HADOOP_HOME directory on each machine in the HDP cluster.
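    For example, using simcp (assuming that $HADOOP_HOME resolves to /usr/hdp/2.2.0.0-2041/hadoop on every machine, as in the earlier examples):
      /opt/TKGrid/bin/simcp /tmp/hdatplugins/SAS_VERSION /usr/hdp/2.2.0.0-2041/hadoop/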
  4. In the Ambari interface, create a custom hdfs-site.xml and add the following properties (listed here as property = value):
    dfs.namenode.plugins = com.sas.lasr.hadoop.NameNodeService
    dfs.datanode.plugins = com.sas.lasr.hadoop.DataNodeService
    com.sas.lasr.service.allow.put = true
    com.sas.lasr.hadoop.service.namenode.port = 15452
    com.sas.lasr.hadoop.service.datanode.port = 15453
    dfs.namenode.fs-limits.min-block-size = 0
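    In the generated hdfs-site.xml, each of these settings appears as a standard Hadoop name/value entry; for example, the first property is written as:
      <property>
        <name>dfs.namenode.plugins</name>
        <value>com.sas.lasr.hadoop.NameNodeService</value>
      </property>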
  5. Save the properties and restart all HDP services.
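    After the services restart, you can spot-check that the SAS plug-in services are listening on the configured ports (a quick check, assuming the default ports above and that netstat is available) by running the following on a NameNode or DataNode machine:
      netstat -an | grep -e 15452 -e 15453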
  6. If you are deploying SAS Visual Analytics, see Hadoop Configuration Step for SAS Visual Analytics.