After you run the SAS
Deployment Manager to create the SAS Embedded Process parcel, you
must distribute and activate the parcel on the cluster. Follow these
steps:
CAUTION:
The SAS
Embedded Process must be installed on all nodes that are capable of
running a MapReduce task (MapReduce 1) or on all nodes that are capable
of running a YARN container (MapReduce 2). The SAS Embedded Process
must also be installed on the host node from which you run the script
(the Hadoop master NameNode). Hive and HCatalog must be available
on all nodes where the SAS Embedded Process is installed.
Otherwise, the SAS
Embedded Process does not function properly.
Note: More than one SAS Embedded
Process parcel can be deployed on your cluster, but only one parcel
can be activated at one time. Before activating a new parcel, deactivate
the old one.
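As a rough illustration of the one-active-parcel rule, the sketch below inspects a list of parcel records and reports which SAS Embedded Process parcel versions are currently activated. It assumes that each record carries `product`, `version`, and `stage` fields, as in the parcel listing returned by the Cloudera Manager REST API. The product name `SASEP` and the sample version strings are placeholders for illustration, not values taken from this guide.

```python
from typing import Dict, List


def activated_versions(parcels: List[Dict[str, str]],
                       product: str = "SASEP") -> List[str]:
    """Return the versions of `product` whose stage is ACTIVATED."""
    return [
        p["version"]
        for p in parcels
        if p.get("product") == product and p.get("stage") == "ACTIVATED"
    ]


def safe_to_activate(parcels: List[Dict[str, str]],
                     product: str = "SASEP") -> bool:
    """True when no parcel of `product` is active, so a new one can be activated."""
    return not activated_versions(parcels, product)


# Hypothetical sample records; real data would come from Cloudera Manager.
sample = [
    {"product": "SASEP", "version": "old-version", "stage": "ACTIVATED"},
    {"product": "SASEP", "version": "new-version", "stage": "DISTRIBUTED"},
    {"product": "CDH", "version": "some-version", "stage": "ACTIVATED"},
]

if __name__ == "__main__":
    print("Activated SASEP versions:", activated_versions(sample))
    print("Safe to activate a new parcel:", safe_to_activate(sample))
```

If `safe_to_activate` returns False, deactivate the listed version in Cloudera Manager before you activate the new parcel.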
Note: If you have licensed and
downloaded SAS Data Loader for Hadoop or SAS Contextual Analysis In-Database
Scoring for Hadoop, other SAS components are silently deployed at
the same time as the SAS Embedded Process for Hadoop. Additional configuration
is required, as noted in the final step of this procedure.
For more information
about what components are also deployed, see Overview of the In-Database Deployment Package for Hadoop.
-
Log on to Cloudera Manager.
-
Distribute the parcel
to all nodes and create the SASEPHome directory.
-
From the menu bar, choose
Hosts > Parcels.
The SASEP parcel
is located under your cluster. An example name is p0.1.
-
On the row for the SASEP parcel,
click Distribute to copy the parcel to all
nodes and create the SASEPHome directory.
To confirm the distribution, you can log on to a
node and list the contents of the /opt/cloudera/parcels
directory.
-
Click Activate.
This step creates a
symbolic link to the SAS Hadoop JAR file.
When prompted, click Close.
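The check on the parcel directory can also be scripted. The following is a minimal sketch, run on a cluster node, that lists entries in the parcel directory whose names begin with a given prefix. It assumes that the parcel directory is /opt/cloudera/parcels (the usual Cloudera default) and that the SAS Embedded Process entries start with `SASEP`. Both are assumptions; adjust them to match your cluster.

```python
from pathlib import Path
from typing import List

# Assumed default parcel directory; change this if your cluster uses another path.
PARCEL_ROOT = Path("/opt/cloudera/parcels")


def find_parcel_entries(root: Path, prefix: str = "SASEP") -> List[str]:
    """Return the sorted names of entries under `root` that start with `prefix`.

    Returns an empty list when `root` is not an existing directory.
    """
    if not root.is_dir():
        return []
    return sorted(e.name for e in root.iterdir() if e.name.startswith(prefix))


if __name__ == "__main__":
    entries = find_parcel_entries(PARCEL_ROOT)
    if entries:
        for name in entries:
            print(name)
    else:
        print("No matching parcel entries found under", PARCEL_ROOT)
```

An empty result on a node suggests that the parcel was not distributed to that node or that the directory or prefix assumption does not hold.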
-
Add the SASEP service
and create the SAS Embedded Process configuration file in HDFS.
-
Navigate to the Cloudera
Manager Home.
-
In Cloudera Manager,
select the drop-down arrow next to the name of the cluster, and then
select Add a Service.
The Add
Service Wizard page appears.
-
Select the SASEP service
and click Continue.
-
On the
Add Service Wizard > Select the set
of dependencies for your new service page,
select the dependencies for the service. Click
Continue.
Note: The dependencies are automatically
selected for this service.
-
On the
Add Service Wizard > Customize Role
Assignments page, select a node for the
service.
Choose any node that
is part of your cluster and that is an HDFS client.
Click OK and
then click Continue. The Add
a SASEP to Cluster cluster-name page
appears.
-
Enter the name of the
HDFS user. Click Continue.
Note: The default HDFS user name
is hdfs. However, you can enter
a custom HDFS user name.
Note: If your cluster is secured
with Kerberos, the host that you select must have a valid ticket for
the HDFS user.
The ep-config.xml file
is created and added to the HDFS /sas/ep/config
directory.
This task runs on the host that you selected.
-
After the SAS Embedded
Process ep-config.xml file is created, Cloudera Manager starts the
SAS Embedded Process service. The SAS Embedded Process does not require
this service to be running. MapReduce is the only service that it
requires. Stop the SAS Embedded Process service as soon as the task
that adds the SAS Embedded Process is finished. After that, the
SAS Embedded Process service does not need to be started or stopped again.
-
Verify that the ep-config.xml
file exists in the /sas/ep/config
directory
of the host that you selected on the Customize Role Assignments page.
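One way to script this verification is sketched below. It wraps the standard `hdfs dfs -test -e` command, which exits with status 0 when the path exists in HDFS. The sketch assumes that the `hdfs` command is on the PATH of the node where you run it and that the current user can read /sas/ep/config. The function names are hypothetical helpers, not part of any SAS or Cloudera tool.

```python
import subprocess
from typing import Callable, List

# HDFS location of the configuration file, as stated in this procedure.
EP_CONFIG = "/sas/ep/config/ep-config.xml"


def build_test_command(path: str = EP_CONFIG) -> List[str]:
    """Build the `hdfs dfs -test -e` command that checks whether `path` exists."""
    return ["hdfs", "dfs", "-test", "-e", path]


def ep_config_exists(run: Callable[[List[str]], int] = subprocess.call,
                     path: str = EP_CONFIG) -> bool:
    """Return True when the existence test exits with status 0.

    `run` takes the command as a list and returns its exit status. It defaults
    to subprocess.call and can be replaced for dry runs or testing.
    """
    return run(build_test_command(path)) == 0


if __name__ == "__main__":
    try:
        found = ep_config_exists()
    except FileNotFoundError:
        # The hdfs executable is not available on this machine.
        found = False
    print("ep-config.xml present in HDFS:", found)
```

On a cluster secured with Kerberos, the user who runs the check needs a valid ticket, just as the HDFS user does when the file is created.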
-
Review any additional
configuration that might be needed depending on your Hadoop distribution.
-
Validate the deployment
of the SAS Embedded Process by running a program that uses the SAS
Embedded Process and the MapReduce service. An example is a scoring
program.
-
If you have licensed
and downloaded the following SAS software, additional configuration
is required:
-
SAS Contextual Analysis In-Database
Scoring for Hadoop
For more information,
see SAS Contextual Analysis In-Database Scoring for Hadoop:
Administrator’s Guide.
-
SAS Data Loader for Hadoop
For more information,
see SAS Data Loader for Hadoop: Installation and Configuration
Guide.
-
SAS High-Performance Analytics
For more information,
see SAS High-Performance Analytics Infrastructure: Installation
and Configuration Guide.