Use the simsh and simcp commands to create the Java directory on the new machines and to copy the JDK to them, as shown in the following example:

/opt/webmin/utilbin/simsh -g newnodes mkdir -p /data/java/
/opt/webmin/utilbin/simcp -g newnodes /data/java/jdk_1.6.0_29 /data/java/
On the new machines, install the same version of the Hadoop software, sashadoop.tar.gz, that is installed on the existing machines. Install the software with the same user account and use the same installation path as the existing machines.
Create a text file that lists the host names of the machines to add, such as ~/addhosts.txt. When you run the installation program, hadoopInstall, supply the fully qualified path to the addhosts.txt file.
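For example, an addhosts.txt file for two new machines might contain nothing but the host names, one per line (the host names shown here are hypothetical):

grid107.example.com
grid108.example.com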
Review the $HADOOP_HOME/etc/hadoop/hdfs-site.xml file and note the values of the dfs.name.dir and dfs.data.dir properties.
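These properties are standard Hadoop XML configuration entries. The following snippet shows their form; the path values are hypothetical and are used for illustration only:

<property>
  <name>dfs.name.dir</name>
  <value>/data/hadoop/name</value> <!-- hypothetical path -->
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/data/hadoop/data</value> <!-- hypothetical path -->
</property>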
Review the $HADOOP_HOME/etc/hadoop/mapred-site.xml file and note the values of the mapred.system.dir and mapred.local.dir properties.
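As with hdfs-site.xml, these are ordinary Hadoop property entries; the values in this sketch are hypothetical:

<property>
  <name>mapred.system.dir</name>
  <value>/mapred/system</value> <!-- hypothetical HDFS path -->
</property>
<property>
  <name>mapred.local.dir</name>
  <value>/data/hadoop/mapred/local</value> <!-- hypothetical local path -->
</property>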
Edit the $HADOOP_HOME/etc/hadoop/slaves file on the existing machine that is used for the NameNode. Add the host names of the additional machines to the file.
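For example, after the edit the slaves file might list the existing DataNode hosts followed by the new ones, one host name per line (grid107 and grid108 are hypothetical additions; the other names are taken from the sample output later in this procedure):

grid099.example.com
grid100.example.com
grid101.example.com
grid103.example.com
grid104.example.com
grid105.example.com
grid106.example.com
grid107.example.com
grid108.example.com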
Use the simcp command to copy the slaves file and the other configuration files to the new machines:

/opt/webmin/utilbin/simcp -g newnodes $HADOOP_HOME/etc/hadoop/slaves $HADOOP_HOME/etc/hadoop/
/opt/webmin/utilbin/simcp -g newnodes $HADOOP_HOME/etc/hadoop/master $HADOOP_HOME/etc/hadoop/
/opt/webmin/utilbin/simcp -g newnodes $HADOOP_HOME/etc/hadoop/core-site.xml $HADOOP_HOME/etc/hadoop/
/opt/webmin/utilbin/simcp -g newnodes $HADOOP_HOME/etc/hadoop/hdfs-site.xml $HADOOP_HOME/etc/hadoop/
Start the DataNode process on each new machine by running the hadoop-daemon.sh script, as shown in the following example. Replace hostname with the host name of each new machine:

ssh hostname /data/hadoop/hadoop-0.23.1/sbin/hadoop-daemon.sh start datanode
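Alternatively, if the newnodes group that was used with simsh and simcp still lists the new machines, a single command can start the process on all of them at once. This is a sketch that assumes that group definition and the installation path shown above:

/opt/webmin/utilbin/simsh -g newnodes /data/hadoop/hadoop-0.23.1/sbin/hadoop-daemon.sh start datanode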
After you start the DataNode process on each new machine, open the http://namenode-machine:50070/dfshealth.jsp page to check the number of live nodes.
Run $HADOOP_HOME/bin/hdfs dfsadmin -printTopology to confirm that the new machines are part of the cluster. The following listing shows a sample of the command output:
Rack: /default-rack
   192.168.8.148:50010 (grid103.example.com)
   192.168.8.153:50010 (grid104.example.com)
   192.168.8.217:50010 (grid106.example.com)
   192.168.8.230:50010 (grid105.example.com)
   192.168.9.158:50010 (grid099.example.com)
   192.168.9.159:50010 (grid100.example.com)
   192.168.9.160:50010 (grid101.example.com)
Make sure that the cluster is not in safe mode:

$HADOOP_HOME/bin/hdfs dfsadmin -safemode leave
Copy a file to HDFS with a small block size so that the file is split into several blocks and some of them are written to the new machines:

$HADOOP_HOME/bin/hadoop fs -D dfs.blocksize=512 -put /etc/fstab /hps
Run fsck on the file to confirm where the blocks are stored:

$HADOOP_HOME/bin/hdfs fsck /hps/fstab -files -locations -blocks
Connecting to namenode via http://0.0.0.0:50070
FSCK started by hdfs (auth:SIMPLE) from /192.168.9.156 for path /hps/fstab at Wed Jan 30 09:45:24 EST 2013
/hps/fstab 2093 bytes, 5 block(s): OK
0. BP-1250061202-192.168.9.156-1358965928729:blk_-2796832940080983787_1074 len=512 repl=2 [192.168.8.217:50010, 192.168.8.230:50010]
1. BP-1250061202-192.168.9.156-1358965928729:blk_-7759726019690621913_1074 len=512 repl=2 [192.168.8.230:50010, 192.168.8.153:50010]
2. BP-1250061202-192.168.9.156-1358965928729:blk_-6783529658608270535_1074 len=512 repl=2 [192.168.9.159:50010, 192.168.9.158:50010]
3. BP-1250061202-192.168.9.156-1358965928729:blk_1083456124028341178_1074 len=512 repl=2 [192.168.9.158:50010, 192.168.9.160:50010]
4. BP-1250061202-192.168.9.156-1358965928729:blk_-4083651737452524600_1074 len=45 repl=2 [192.168.8.230:50010, 192.168.8.153:50010]
After you confirm that blocks are stored on the new machines, remove the test file:

$HADOOP_HOME/bin/hadoop fs -rm /hps/fstab