To install the SAS Embedded Process and SAS Hadoop MapReduce JAR files, follow these steps:
Note: Permissions are needed to install the SAS Embedded Process and SAS Hadoop MapReduce JAR files. For more information, see Hadoop Permissions.
-
Navigate to the location on your Hadoop master node where you copied the en_sasexe.zip file.
cd /EPInstallDir
-
Ensure that both the EPInstallDir folder and the en_sasexe.zip file have Read, Write, and Execute permissions (chmod -R 777).
-
Unzip the en_sasexe.zip file.
unzip en_sasexe.zip
After the file is unzipped, a sasexe directory is created in the same location as the en_sasexe.zip file. The sepcorehadp-9.43000-1.sh file is in the sasexe directory.
EPInstallDir/sasexe/sepcorehadp-9.43000-1.sh
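The first three steps can be combined into one guarded sequence. The following is a minimal sketch, not part of the SAS deployment scripts: the function name prepare_ep_archive is hypothetical, its argument stands in for your EPInstallDir location, and the unzip flags -o (overwrite without prompting) and -q (quiet) are choices made for this example.

```shell
#!/bin/sh
# Hypothetical helper: prepares the archive as described in steps 1 to 3.
# $1 is the EPInstallDir location where en_sasexe.zip was copied.
prepare_ep_archive() {
    ep_install_dir=$1
    archive="$ep_install_dir/en_sasexe.zip"

    # Stop early if the archive is not where the procedure expects it.
    if [ ! -f "$archive" ]; then
        echo "missing: $archive" >&2
        return 1
    fi

    # Step 2: grant Read, Write, and Execute permissions recursively.
    chmod -R 777 "$ep_install_dir" || return 1

    # Step 3: unzip in place; this should create the sasexe directory.
    (cd "$ep_install_dir" && unzip -o -q en_sasexe.zip) || return 1

    # Succeed only if the self-extracting script is now present.
    [ -f "$ep_install_dir/sasexe/sepcorehadp-9.43000-1.sh" ]
}

# Example call (the path is a placeholder):
#   prepare_ep_archive /EPInstallDir
```

The function returns a nonzero status at the first problem it finds, so it can be used in a condition before continuing with the unpack step.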
-
Use the following command to unpack the sepcorehadp-9.43000-1.sh file.
./sepcorehadp-9.43000-1.sh
After this script is run and the files are unpacked, the script creates the following directory structure, where EPInstallDir is the location on the master node from Step 2.
EPInstallDir/sasexe/SASEPHome
EPInstallDir/sasexe/sepcorehadp-9.43000-1.sh
Note: During the install process, the sepcorehadp-9.43000-1.sh file is copied to all data nodes. Do not remove or move this file from the EPInstallDir/sasexe directory.
The SASEPHome directory structure should look like this.
EPInstallDir/sasexe/SASEPHome/bin
EPInstallDir/sasexe/SASEPHome/misc
EPInstallDir/sasexe/SASEPHome/sasexe
EPInstallDir/sasexe/SASEPHome/utilities
EPInstallDir/sasexe/SASEPHome/jars
The EPInstallDir/sasexe/SASEPHome/jars directory contains the SAS Hadoop MapReduce JAR files.
EPInstallDir/sasexe/SASEPHome/jars/sas.hadoop.ep.apache023.jar
EPInstallDir/sasexe/SASEPHome/jars/sas.hadoop.ep.apache023.nls.jar
EPInstallDir/sasexe/SASEPHome/jars/sas.hadoop.ep.apache121.jar
EPInstallDir/sasexe/SASEPHome/jars/sas.hadoop.ep.apache121.nls.jar
EPInstallDir/sasexe/SASEPHome/jars/sas.hadoop.ep.apache205.jar
EPInstallDir/sasexe/SASEPHome/jars/sas.hadoop.ep.apache205.nls.jar
The EPInstallDir/sasexe/SASEPHome/bin directory should look similar to this.
EPInstallDir/sasexe/SASEPHome/bin/sasep-admin.sh
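The directory layout listed above can be checked before you continue. This is a sketch under stated assumptions: check_sasephome_layout is a hypothetical helper name, and it tests only for the five subdirectories and the sasep-admin.sh script that this step names.

```shell
#!/bin/sh
# Hypothetical helper: confirms that the unpacked SASEPHome directory
# contains the subdirectories and the admin script listed in this step.
# $1 is the SASEPHome path, for example EPInstallDir/sasexe/SASEPHome.
check_sasephome_layout() {
    sasep_home=$1
    status=0

    for sub in bin misc sasexe utilities jars; do
        if [ ! -d "$sasep_home/$sub" ]; then
            echo "missing directory: $sasep_home/$sub" >&2
            status=1
        fi
    done

    if [ ! -f "$sasep_home/bin/sasep-admin.sh" ]; then
        echo "missing file: $sasep_home/bin/sasep-admin.sh" >&2
        status=1
    fi

    # 0 when everything was found, 1 when anything is missing.
    return "$status"
}

# Example call (the path is a placeholder):
#   check_sasephome_layout /EPInstallDir/sasexe/SASEPHome
```

Every missing item is reported rather than only the first, which makes a partial unpack easier to diagnose.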
-
Use the sasep-admin.sh script to deploy the SAS Embedded Process installation across all nodes. This is when the sepcorehadp-9.43000-1.sh file is copied to all data nodes.
Tip: Many options are available for installing the SAS Embedded Process. We recommend that you review the script syntax before running it. For more information, see SASEP-ADMIN.SH Script.
Note: If your cluster is secured with Kerberos, complete both steps a and b. If your cluster is not secured with Kerberos, complete only step b.
-
If your cluster is secured with Kerberos, the HDFS user must have a valid Kerberos ticket to access HDFS. You can obtain a ticket with kinit.
sudo su - root
su - hdfs | hdfs-userid
kinit -kt <location of keytab file> <user for which you are requesting a ticket>
exit
Note: For all Hadoop distributions except MapR, the default HDFS user is hdfs. For MapR distributions, the default HDFS user is mapr. You can specify a different user ID with the -hdfsuser argument when you run the sasep-admin.sh -add script.
Note: To check the status of your Kerberos ticket on the server, run klist while you are running as the -hdfsuser user. Here is an example:
klist
Ticket cache: FILE:/tmp/krb5cc_493
Default principal: hdfs@HOST.COMPANY.COM
Valid starting Expires Service principal
06/20/15 09:51:26 06/27/15 09:51:26 krbtgt/HOST.COMPANY.COM@HOST.COMPANY.COM
renew until 06/22/15 09:51:26
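Reading the klist output by eye works for a one-time check. For a scripted check, the sketch below relies on the -s option of klist, assuming the MIT Kerberos client, where klist -s prints nothing and exits with a nonzero status when the credentials cache cannot be read or the ticket has expired. The function name require_kerberos_ticket is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the current user holds a
# readable, unexpired Kerberos ticket. Run it as the HDFS user.
require_kerberos_ticket() {
    # "klist -s" is silent; only its exit status is used here.
    if klist -s; then
        echo "valid Kerberos ticket found"
    else
        echo "no valid Kerberos ticket; run kinit first" >&2
        return 1
    fi
}

# Example use before running the install script:
#   require_kerberos_ticket || exit 1
```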
-
Run the sasep-admin.sh script. Review all of the information in this step before running the script.
cd EPInstallDir/sasexe/SASEPHome/bin/
./sasep-admin.sh -add
Note: The sasep-admin.sh script must be run from the EPInstallDir/sasexe/SASEPHome/bin/ location.
Tip: There are many options available when installing the SAS Embedded Process. We recommend that you review the script syntax before running it. For more information, see SASEP-ADMIN.SH Script.
Note: By default, the SAS Embedded Process install script (sasep-admin.sh) discovers the cluster topology and installs the SAS Embedded Process on all DataNode nodes, including the host node from where you run the script (the Hadoop master NameNode). This occurs even if a DataNode is not present on that host. If you want to add the SAS Embedded Process to new nodes at a later time, run the sasep-admin.sh script with the -host <hosts> option.
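Because the script must be run from its own bin directory, a small wrapper can make that requirement hard to forget. The following is a sketch only: run_sasep_admin is a hypothetical name, and the wrapper passes its remaining arguments to sasep-admin.sh unchanged without interpreting them.

```shell
#!/bin/sh
# Hypothetical wrapper: runs sasep-admin.sh from SASEPHome/bin, as the
# note in this step requires.
# $1 is the SASEPHome path; all other arguments go to sasep-admin.sh.
run_sasep_admin() {
    sasep_home=$1
    shift
    # The subshell keeps the caller's working directory unchanged.
    (cd "$sasep_home/bin" && ./sasep-admin.sh "$@")
}

# Example calls (the path is a placeholder):
#   run_sasep_admin /EPInstallDir/sasexe/SASEPHome -add
#   run_sasep_admin /EPInstallDir/sasexe/SASEPHome -check
```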
-
Verify that the SAS Embedded Process is installed by running the sasep-admin.sh script with the -check option.
cd EPInstallDir/sasexe/SASEPHome/bin/
./sasep-admin.sh -check
This command checks whether the SAS Embedded Process is installed on all data nodes.
Note: The sasep-admin.sh -check script does not run successfully if the SAS Embedded Process is not installed.
-
If your distribution is running MapReduce 1 or your SAS client is running on the second maintenance release for SAS 9.4, follow these steps. Otherwise, skip to Step 8.
-
Verify that the sas.hadoop.ep.apache*.jar files are now in the hadoop/lib directory.
For Cloudera, the JAR files are typically located here:
/opt/cloudera/parcels/CDH/lib/hadoop/lib
For Hortonworks, the JAR files are typically located here:
/usr/lib/hadoop/lib
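The JAR check can be scripted so that an empty result is reported as a failure. This is a minimal sketch: list_ep_jars is a hypothetical helper, and you pass it the hadoop/lib directory that applies to your distribution.

```shell
#!/bin/sh
# Hypothetical helper: prints every sas.hadoop.ep.apache*.jar file found
# in the given hadoop/lib directory and fails if there are none.
list_ep_jars() {
    lib_dir=$1
    status=1

    for jar in "$lib_dir"/sas.hadoop.ep.apache*.jar; do
        # When nothing matches, the shell leaves the pattern unexpanded,
        # so skip any entry that is not an existing file.
        [ -e "$jar" ] || continue
        echo "$jar"
        status=0
    done

    return "$status"
}

# Example calls using the typical locations named in this step:
#   list_ep_jars /opt/cloudera/parcels/CDH/lib/hadoop/lib
#   list_ep_jars /usr/lib/hadoop/lib
```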
-
Restart the Hadoop MapReduce service.
This enables the cluster to load the SAS Hadoop MapReduce JAR files (sas.hadoop.ep.*.jar).
Note: It is preferable to restart the service by using Cloudera Manager or Ambari (for Hortonworks), if available.
-
Verify that the configuration file, ep-config.xml, was written to the HDFS file system.
hadoop fs -ls /sas/ep/config
Note: If your cluster is secured with Kerberos, you need a valid Kerberos ticket to access HDFS. If not, you can use the WebHDFS browser.
Note: The /sas/ep/config directory is created automatically when you run the install script. If you used the -epconfig or -genconfig option to specify a non-default location, use that location to find the ep-config.xml file.
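As an alternative to reading the directory listing, the presence of the file can be tested directly. The sketch below uses hadoop fs -test -e, which exits with status 0 when the path exists. The function name ep_config_present is hypothetical, and its optional argument is the configuration directory, which defaults to /sas/ep/config.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when ep-config.xml exists in HDFS.
# $1 is the HDFS directory to check; it defaults to /sas/ep/config.
ep_config_present() {
    config_dir=${1:-/sas/ep/config}
    # "hadoop fs -test -e" prints nothing and reports through its exit status.
    hadoop fs -test -e "$config_dir/ep-config.xml"
}

# Example use:
#   if ep_config_present; then
#       echo "ep-config.xml found"
#   else
#       echo "ep-config.xml not found" >&2
#   fi
```

On a cluster secured with Kerberos, the same ticket requirement applies to this command as to hadoop fs -ls.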