simcp command
as shown in the following example:
/opt/webmin/utilbin/simsh -g newnodes mkdir -p /data/java/
/opt/webmin/utilbin/simcp -g newnodes /data/java/jdk_1.6.0_29

Also copy the Hadoop package, /data/java/sashadoop.tar.gz, that is
installed on the existing machines. Install the software with the
same user account and use the same installation path as on the
existing machines.
~/addhosts.txt. When you run
the installation program, hadoopInstall,
supply the fully qualified path to the addhosts.txt file.
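For illustration, assuming addhosts.txt lists one host name per line (the host names below are hypothetical), the file might look like this:

```
grid105.example.com
grid106.example.com
```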
$HADOOP_HOME/etc/hadoop/hdfs-site.xml: the dfs.name.dir and dfs.data.dir properties.
$HADOOP_HOME/etc/hadoop/mapred-site.xml: the mapred.system.dir and mapred.local.dir properties.
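For reference, these properties use the standard Hadoop property layout; the directory values in the sketch below are illustrative assumptions, not required paths:

```xml
<configuration>
  <!-- Illustrative paths; use the same values as on your existing nodes. -->
  <property>
    <name>dfs.name.dir</name>
    <value>/data/hadoop/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/data/hadoop/data</value>
  </property>
</configuration>
```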
$HADOOP_HOME/etc/hadoop/slaves file
on the existing machine that is used for the NameNode. Add the host
names of the additional machines to the file.
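The slaves file lists one DataNode host name per line, so after the edit it might read as follows (the host names are illustrative; the last two represent the newly added machines):

```
grid099.example.com
grid100.example.com
grid101.example.com
grid105.example.com
grid106.example.com
```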
simcp command
to copy the file and other configuration files to the new machines:
/opt/webmin/utilbin/simcp -g newnodes $HADOOP_HOME/etc/hadoop/slaves $HADOOP_HOME/etc/hadoop/
/opt/webmin/utilbin/simcp -g newnodes $HADOOP_HOME/etc/hadoop/master $HADOOP_HOME/etc/hadoop/
/opt/webmin/utilbin/simcp -g newnodes $HADOOP_HOME/etc/hadoop/core-site.xml $HADOOP_HOME/etc/hadoop/
/opt/webmin/utilbin/simcp -g newnodes $HADOOP_HOME/etc/hadoop/hdfs-site.xml $HADOOP_HOME/etc/hadoop/
ssh hostname /data/hadoop/hadoop-0.23.1/sbin/hadoop-daemon.sh start datanode
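With several new machines, this step can be scripted. A minimal sketch, assuming hypothetical host names (replace them with your own); the echo makes this a dry run that only prints each command:

```shell
# Dry-run sketch: print the DataNode start command for each new host.
# Remove "echo" to actually run the commands over ssh.
HADOOP_SBIN=/data/hadoop/hadoop-0.23.1/sbin
for host in grid103.example.com grid104.example.com; do
  echo ssh "$host" "$HADOOP_SBIN/hadoop-daemon.sh start datanode"
done
```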
Once you have started the DataNode process on each new machine, open
the http://namenode-machine:50070/dfshealth.jsp page to view the
number of live nodes.
$HADOOP_HOME/bin/hdfs dfsadmin -printTopology to confirm that the
new machines are part of the cluster. The following listing shows
a sample of the command output:
Rack: /default-rack
192.168.8.148:50010 (grid103.example.com)
192.168.8.153:50010 (grid104.example.com)
192.168.8.217:50010 (grid106.example.com)
192.168.8.230:50010 (grid105.example.com)
192.168.9.158:50010 (grid099.example.com)
192.168.9.159:50010 (grid100.example.com)
192.168.9.160:50010 (grid101.example.com)

$HADOOP_HOME/bin/hdfs dfsadmin -safemode leave
$HADOOP_HOME/bin/hadoop fs -D dfs.blocksize=512 -put /etc/fstab /hps

The small 512-byte block size splits the 2093-byte /etc/fstab file into
five blocks, so the blocks are distributed across several DataNodes,
including the new ones.
$HADOOP_HOME/bin/hdfs fsck /hps/fstab -files -locations -blocks
Connecting to namenode via http://0.0.0.0:50070
FSCK started by hdfs (auth:SIMPLE) from /192.168.9.156 for path /hps/fstab at Wed Jan 30 09:45:24 EST 2013
/hps/fstab 2093 bytes, 5 block(s): OK
0. BP-1250061202-192.168.9.156-1358965928729:blk_-2796832940080983787_1074 len=512 repl=2 [192.168.8.217:50010, 192.168.8.230:50010]
1. BP-1250061202-192.168.9.156-1358965928729:blk_-7759726019690621913_1074 len=512 repl=2 [192.168.8.230:50010, 192.168.8.153:50010]
2. BP-1250061202-192.168.9.156-1358965928729:blk_-6783529658608270535_1074 len=512 repl=2 [192.168.9.159:50010, 192.168.9.158:50010]
3. BP-1250061202-192.168.9.156-1358965928729:blk_1083456124028341178_1074 len=512 repl=2 [192.168.9.158:50010, 192.168.9.160:50010]
4. BP-1250061202-192.168.9.156-1358965928729:blk_-4083651737452524600_1074 len=45 repl=2 [192.168.8.230:50010, 192.168.8.153:50010]
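To double-check which DataNodes received replicas, the bracketed replica lists in the fsck report can be extracted with standard tools. A sketch, assuming the report was saved to a file named fsck.out (an assumed name):

```shell
# List the unique DataNode addresses that hold replicas of /hps/fstab,
# reading the saved fsck report from fsck.out.
grep -o '\[[^]]*\]' fsck.out |  # pull out the bracketed replica lists
  tr -d '[]' |                  # drop the brackets
  tr ',' '\n' |                 # one address per line
  awk '{print $1}' |            # trim leading spaces
  sort -u
```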
$HADOOP_HOME/bin/hadoop fs -rm /hps/fstab