The in-database deployment package for Hadoop includes the SAS Embedded
Process and the SAS Hadoop MapReduce JAR files. The SAS Embedded Process
runs within MapReduce, on the Hadoop system where the data lives, to read
and write data.

By default, the SAS Embedded Process install script (sasep-admin.sh)
discovers the cluster topology and installs the SAS Embedded Process on
all DataNode nodes. It also installs on the host node from which you run
the script (the Hadoop master NameNode), even if that node does not run a
DataNode. To add the SAS Embedded Process to new nodes at a later time,
run the sasep-admin.sh script with the -host <hosts> option.

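
As an illustration of the -host option described above, the following Python sketch assembles an argument list for sasep-admin.sh. It only builds the command and does not run it. The function name, the script path, and the host names are hypothetical. Passing several hosts as one space-separated argument is an assumption, and any other options that your release of the script requires are not shown.

```python
def build_add_hosts_command(hosts, script="./sasep-admin.sh"):
    """Return an argument list that calls the install script with -host.

    Assumptions (not taken from the SAS documentation):
    - the script is invoked from its own directory as ./sasep-admin.sh
    - several hosts are passed as a single space-separated argument
    """
    if not hosts:
        raise ValueError("at least one host name is required")
    return [script, "-host", " ".join(hosts)]


# Hypothetical host names for two nodes added after the initial install.
command = build_add_hosts_command(["node5.example.com", "node6.example.com"])
print(command)
```

The resulting list could be passed to subprocess.run on the node where the script lives, after you confirm the exact syntax against the script's own usage text.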
For distributions that are running MapReduce 1, the SAS Hadoop MapReduce
JAR files are required in the hadoop/lib directory. For distributions
that are running MapReduce 2, the SAS Hadoop MapReduce JAR files are in
the EPInstallDir/SASEPHome/jars/ directory.
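
The version rule above can be summarized in a short Python sketch. The function name is hypothetical, and EPInstallDir stands in for whatever directory the SAS Embedded Process was installed into at your site. The hadoop/lib path is returned exactly as written above, relative to your Hadoop installation.

```python
import posixpath


def sas_mapreduce_jar_dir(mapreduce_version, ep_install_dir="EPInstallDir"):
    """Return the directory that holds the SAS Hadoop MapReduce JAR files.

    mapreduce_version: 1 or 2.
    ep_install_dir: placeholder for the site-specific install directory.
    """
    if mapreduce_version == 1:
        # MapReduce 1: the JAR files are required in hadoop/lib.
        return "hadoop/lib"
    if mapreduce_version == 2:
        # MapReduce 2: the JAR files are under the install directory.
        return posixpath.join(ep_install_dir, "SASEPHome", "jars") + "/"
    raise ValueError("mapreduce_version must be 1 or 2")


print(sas_mapreduce_jar_dir(1))
print(sas_mapreduce_jar_dir(2))
```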