Configure SAS Plug-ins for Hadoop

Install Hadoop on the Additional Machines

Install Hadoop on the additional machines according to the vendor's documentation. Then install the SAS Plug-ins for Hadoop on the same machines as described in SAS High-Performance Analytics Infrastructure: Installation and Configuration Guide.

(Optional) Copy a File to HDFS

If you put HDFS in safe mode at the beginning of this procedure, leave safe mode with a command that is similar to the following:
$HADOOP_HOME/bin/hdfs dfsadmin -safemode leave
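Before copying a file, it can help to confirm that HDFS has actually left safe mode. The following is a minimal sketch, not part of the SAS procedure: it assumes that the status text comes from hdfs dfsadmin -safemode get, which reports "Safe mode is ON" or "Safe mode is OFF". The function name safemode_is_off is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: interpret the text printed by
#   $HADOOP_HOME/bin/hdfs dfsadmin -safemode get
# Returns 0 if safe mode is OFF, 1 if any NameNode reports ON,
# and 2 if the text is not recognized.
safemode_is_off() {
    case "$1" in
        *"Safe mode is ON"*)  return 1 ;;  # checked first so a mixed report counts as ON
        *"Safe mode is OFF"*) return 0 ;;
        *)                    return 2 ;;
    esac
}

# Example use on a running cluster (left commented out so that the
# sketch can be read and tested without Hadoop installed):
#   status=$("$HADOOP_HOME/bin/hdfs" dfsadmin -safemode get)
#   safemode_is_off "$status" && echo "HDFS is accepting writes"
```

Checking for ON before OFF is a deliberately cautious choice: if more than one NameNode reports its status, a single ON is treated as "still in safe mode".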
To confirm that the additional machines are used, you can copy a file to HDFS and then list the locations of the blocks. Use a command that is similar to the following:
$HADOOP_HOME/bin/hadoop fs -D dfs.blocksize=512 -put /etc/fstab /hps
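Note: The very small block size in the example increases the number of blocks that are written, which increases the likelihood that the new machines are used. Some Hadoop versions enforce a minimum block size through the dfs.namenode.fs-limits.min-block-size property. If the command is rejected for that reason, use a block size at or above the minimum together with a larger sample file.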
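As a rough illustration of how the block size determines the number of blocks, the sketch below computes the block count and the length of the final block for a given file size. The function name block_layout is hypothetical, and the calculation assumes a file size greater than zero. The values 2093 and 512 are taken from the sample fsck output in this section.

```shell
#!/bin/sh
# Hypothetical helper: print the number of HDFS blocks and the length of
# the last block for a file of SIZE bytes written with BLOCK-byte blocks.
# Assumes SIZE > 0 and BLOCK > 0.
block_layout() {
    size=$1
    block=$2
    count=$(( (size + block - 1) / block ))      # ceiling division
    last=$(( size - (count - 1) * block ))       # bytes in the final block
    echo "$count $last"
}

block_layout 2093 512   # prints "5 45"
```

This matches the sample output, where a 2093-byte file is stored as four 512-byte blocks and one 45-byte block.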
You can list the block locations with a command that is similar to the following:
$HADOOP_HOME/bin/hdfs fsck /hps/fstab -files -locations -blocks
Review the output and confirm that the IP addresses of the new machines appear in the block locations. The output is similar to the following:
Connecting to namenode via
FSCK started by hdfs (auth:SIMPLE) from / for path /hps/fstab at Wed Jan 30 09:45:24 EST 2013
/hps/fstab 2093 bytes, 5 block(s): OK
0. BP-1250061202- len=512 repl=2 [,]
1. BP-1250061202- len=512 repl=2 [,]
2. BP-1250061202- len=512 repl=2 [,]
3. BP-1250061202- len=512 repl=2 [,]
4. BP-1250061202- len=45 repl=2 [,]
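For a file with many blocks, scanning the locations by eye is error-prone. The sketch below counts how many block replicas are reported on each IP address that you pass in. It is an illustration only: the function name count_replicas is hypothetical, the addresses in the usage comment are placeholders, and it assumes that each location in the fsck output appears as an IP address followed by a colon and a port.

```shell
#!/bin/sh
# Hypothetical helper: read `hdfs fsck ... -files -locations -blocks`
# output on standard input and print, for each IP address given as an
# argument, the number of block replicas reported on that address.
count_replicas() {
    fsck_output=$(cat)
    for ip in "$@"; do
        # Escape the dots so that they match literally.
        pattern=$(printf '%s' "$ip" | sed 's/\./\\./g')
        # Require a non-address character before the IP and a colon after
        # it, so that 192.0.2.21 does not also match 192.0.2.210.
        n=$(printf '%s\n' "$fsck_output" \
            | grep -oE "(^|[^0-9.])${pattern}:" \
            | wc -l | tr -d ' ')
        printf '%s %s\n' "$ip" "$n"
    done
}

# Example use on a running cluster (placeholder addresses):
#   "$HADOOP_HOME/bin/hdfs" fsck /hps/fstab -files -locations -blocks \
#       | count_replicas 192.0.2.21 192.0.2.22
```

A count of 0 for a new machine does not by itself indicate a problem, because block placement is not guaranteed to use every DataNode for a small file.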
Delete the sample file:
$HADOOP_HOME/bin/hadoop fs -rm /hps/fstab