Overview of the SAS Embedded Process

The in-database deployment package for Hadoop includes the SAS Embedded Process and the SAS Hadoop MapReduce JAR files. The SAS Embedded Process runs within MapReduce, on the Hadoop cluster where the data resides, to read and write data.
By default, the SAS Embedded Process install script (sasep-admin.sh) discovers the cluster topology and installs the SAS Embedded Process on all DataNodes and on the host node from which you run the script (the Hadoop master NameNode). The host node is included even if no DataNode is present on it. To add the SAS Embedded Process to new nodes at a later time, run the sasep-admin.sh script with the -host <hosts> option.
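As an illustration of adding nodes later, the following sketch builds a sasep-admin.sh command line for two new nodes and prints it for review without running it. The host names node5 and node6 are hypothetical, and combining -host with the -add action is an assumption here; confirm the exact syntax for your release before running the command on the master node.

```shell
# Hypothetical names of the nodes added after the initial installation.
new_hosts="node5 node6"

# Assumption: -add is the install action and -host takes a quoted,
# space-separated host list. Verify both against your release.
cmd="./sasep-admin.sh -add -host \"$new_hosts\""

# Print the command for review instead of executing it.
echo "$cmd"
```

Quoting the host list keeps it a single argument to the -host option when the command is eventually run.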
For distributions that are running MapReduce 1, the SAS Hadoop MapReduce JAR files are required in the hadoop/lib directory. For distributions that are running MapReduce 2, the SAS Hadoop MapReduce JAR files are in the EPInstallDir/SASEPHome/jars/ directory.
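The two JAR locations can be summarized in a small helper such as the sketch below. The function name sas_mr_jar_dir and the example paths are hypothetical; the sketch also assumes that hadoop/lib is the lib directory under your Hadoop installation directory and that you pass the actual EPInstallDir value for your site.

```shell
# Print the directory expected to hold the SAS Hadoop MapReduce JAR files.
#   $1  MapReduce major version (1 or 2)
#   $2  Hadoop installation directory (assumed parent of lib)
#   $3  EPInstallDir, the SAS Embedded Process installation directory
sas_mr_jar_dir() {
  case "$1" in
    1) printf '%s\n' "$2/lib" ;;             # MapReduce 1: hadoop/lib
    2) printf '%s\n' "$3/SASEPHome/jars" ;;  # MapReduce 2: EPInstallDir/SASEPHome/jars
    *) echo "unknown MapReduce version: $1" >&2; return 1 ;;
  esac
}
```

For example, sas_mr_jar_dir 2 /usr/lib/hadoop /opt/sasep prints /opt/sasep/SASEPHome/jars, which you could then pass to ls to confirm that the JAR files are present.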