Deploying the SAS Embedded Process Parcel on Cloudera

After you run the SAS Deployment Manager to create the SAS Embedded Process parcel, you must distribute and activate the parcel on the cluster. Follow these steps:
CAUTION:
The SAS Embedded Process must be installed on all nodes that are capable of running a MapReduce task (MapReduce 1) or on all nodes that are capable of running a YARN container (MapReduce 2). The SAS Embedded Process must also be installed on the host node from which you run the script (the Hadoop master NameNode). Hive and HCatalog must be available on all nodes where the SAS Embedded Process is installed.
Otherwise, the SAS Embedded Process does not function properly.
Note: More than one SAS Embedded Process parcel can be deployed on your cluster, but only one parcel can be activated at one time. Before activating a new parcel, deactivate the old one.
Note: If you have licensed and downloaded SAS Data Loader for Hadoop or SAS Contextual Analysis In-Database Scoring for Hadoop, other SAS components are silently deployed at the same time as the SAS Embedded Process for Hadoop. Additional configuration is required, as noted in step 8. For more information about which components are also deployed, see Overview of the In-Database Deployment Package for Hadoop.
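The rule that only one SAS Embedded Process parcel can be activated at a time can be checked programmatically. The following is a minimal sketch, not part of the SAS tooling. It assumes that you have already retrieved the parcel list for your cluster from the Cloudera Manager REST API (the parcels resource under a cluster) and parsed it as JSON, and that each entry carries `product`, `version`, and `stage` fields. The product name `SASEP` is taken from the parcel name used on this page, the version strings in the sample are hypothetical, and the field names should be confirmed against the API documentation for your Cloudera Manager release.

```python
import json


def activated_parcels(payload, product="SASEP"):
    """Return the versions of `product` parcels whose stage is ACTIVATED.

    `payload` is the parsed JSON parcel list (assumed shape: {"items": [...]}).
    """
    return [
        item.get("version")
        for item in payload.get("items", [])
        if item.get("product") == product and item.get("stage") == "ACTIVATED"
    ]


def check_single_activation(payload, product="SASEP"):
    """Return the single activated version, or None if nothing is activated.

    Raises ValueError if more than one parcel of `product` is activated.
    """
    active = activated_parcels(payload, product)
    if len(active) > 1:
        raise ValueError(
            f"More than one {product} parcel is activated: {', '.join(active)}"
        )
    return active[0] if active else None


if __name__ == "__main__":
    # Hypothetical sample payload; version strings are made up for illustration.
    sample = json.loads(
        """
        {"items": [
            {"product": "SASEP", "version": "old-demo", "stage": "DISTRIBUTED"},
            {"product": "SASEP", "version": "new-demo", "stage": "ACTIVATED"},
            {"product": "CDH",   "version": "cdh-demo", "stage": "ACTIVATED"}
        ]}
        """
    )
    print(check_single_activation(sample))
```

If `check_single_activation` raises an error, deactivate the older parcel in Cloudera Manager before you continue.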
  1. Log on to Cloudera Manager.
  2. Distribute the parcel to all nodes and create the SASEPHome directory.
    1. From the menu bar, choose Hosts and then select Parcels.
      The SASEP parcel is located under your cluster. An example name is p0.1.
    2. On the row for the SASEP parcel, click Distribute to copy the parcel to all nodes and create the SASEPHome directory.
      To confirm that the parcel was copied, you can log on to a node and list the contents of the /opt/cloudera/parcel directory.
  3. Click Activate.
    This step creates a symbolic link to the SAS Hadoop JAR file.
    When prompted, click Close.
  4. Add the SASEP service and create the SAS Embedded Process configuration file in HDFS.
    1. Navigate to the Cloudera Manager Home.
    2. In Cloudera Manager, select the drop-down arrow next to the name of the cluster, and then select Add a Service.
      The Add Service Wizard page appears.
    3. Select the SASEP service and click Continue.
    4. On the Add Service Wizard page titled "Select the set of dependencies for your new service", select the dependencies for the service. Click Continue.
      Note: The dependencies are automatically selected for this service.
    5. On the Add Service Wizard page titled "Customize Role Assignments", select a node for the service.
      Choose any node that is part of your cluster and that is configured as an HDFS client.
      Click OK and then click Continue. The Add a SASEP to Cluster cluster-name page appears.
    6. Enter the name of the HDFS user. Click Continue.
      Note: The default HDFS user name is hdfs. However, you can enter a custom HDFS user name.
      Note: If your cluster is secured with Kerberos, the host that you select must have a valid ticket for the HDFS user.
      The ep-config.xml file is created and added to the HDFS /sas/ep/config directory. This task runs on the host that you selected.
    7. After the SAS Embedded Process ep-config.xml file is created, Cloudera Manager automatically starts the SAS Embedded Process service. The running service is not required: MapReduce is the only service that the SAS Embedded Process needs. Stop the SAS Embedded Process service as soon as the task that adds the SAS Embedded Process finishes. After that, the service does not need to be started or stopped again.
  5. Verify that the ep-config.xml file exists in the HDFS /sas/ep/config directory. You can run the check from the host that you selected in step 4e.
  6. Review any additional configuration that might be needed depending on your Hadoop distribution.
  7. Validate the deployment of the SAS Embedded Process by running a program that uses the SAS Embedded Process and the MapReduce service. An example is a scoring program.
  8. If you have licensed and downloaded the following SAS software, additional configuration is required:
    • SAS Contextual Analysis In-Database Scoring for Hadoop
      For more information, see SAS Contextual Analysis In-Database Scoring for Hadoop: Administrator’s Guide.
    • SAS Data Loader for Hadoop
      For more information, see SAS Data Loader for Hadoop: Installation and Configuration Guide.
    • SAS High-Performance Analytics
      For more information, see SAS High-Performance Analytics Infrastructure: Installation and Configuration Guide.
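The verification in step 5 can be scripted. The following is a minimal sketch, assuming that the `hdfs` command-line client is on the PATH of the host and that the user who runs it can read the /sas/ep/config directory (on a Kerberos-secured cluster, that user needs a valid ticket). It relies on `hdfs dfs -test -e`, which exits with status 0 when the path exists. The function names are hypothetical and are not part of any SAS or Cloudera tool.

```python
import subprocess

# Location of the configuration file in HDFS, as stated in step 4f.
EP_CONFIG_PATH = "/sas/ep/config/ep-config.xml"


def build_test_command(path=EP_CONFIG_PATH):
    """Build the HDFS shell command that tests whether `path` exists."""
    return ["hdfs", "dfs", "-test", "-e", path]


def ep_config_exists(run=subprocess.run, path=EP_CONFIG_PATH):
    """Return True if the HDFS existence test exits with status 0.

    `run` is injectable so that the check can be exercised without a cluster.
    """
    result = run(build_test_command(path), check=False)
    return result.returncode == 0
```

Calling `ep_config_exists()` on a cluster host returns True when the file is present. A False result means that the test command exited with a nonzero status, which can indicate a missing file or a permissions problem, so review the output of the command before you repeat step 4.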
Last updated: February 9, 2017