HADOOP Procedure

MAPREDUCE Statement

Submits MapReduce programs to a Hadoop cluster.
Example: Submitting a MapReduce Program

Syntax

MapReduce Options

COMBINE=class-name
specifies the name of the combiner class in dot notation.
DELETERESULTS
deletes the MapReduce results.
GROUPCOMPARE=class-name
specifies the name of the grouping comparator (GroupComparator) class in dot notation.
INPUT=HDFS-path
specifies the HDFS path to the MapReduce input file.
INPUTFORMAT=class-name
specifies the name of the input format class in dot notation.
JAR='external-file(s)'
specifies the locations of the JAR files that contain the MapReduce program and named classes. Include the complete pathname and the filename. Enclose each location in single or double quotation marks.
MAP=class-name
specifies the name of the map class in dot notation. The map class processes input key-value pairs and emits intermediate key-value pairs.
OUTPUT=HDFS-path
specifies the HDFS path for the MapReduce output. The path must be new, that is, it must not already exist in HDFS.
OUTPUTFORMAT=class-name
specifies the name of the output format class in dot notation.
OUTPUTKEY=class-name
specifies the name of the output key class in dot notation.
OUTPUTVALUE=class-name
specifies the name of the output value class in dot notation.
PARTITIONER=class-name
specifies the name of the partitioner class in dot notation. A partitioner class controls how the keys of the intermediate map outputs are partitioned, which determines the reduce task that receives each key.
REDUCE=class-name
specifies the name of the reduce class in dot notation. The reduce class reduces a set of intermediate values that share a key to a smaller set of values.
REDUCETASKS=integer
specifies the number of reduce tasks.
SORTCOMPARE=class-name
specifies the name of the sort comparator class in dot notation.
WORKINGDIR=HDFS-path
specifies the HDFS path of the working directory.
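
The options above can be combined in a single MAPREDUCE statement. The following is a minimal sketch, not a definitive program: it assumes the word-count classes from the Apache Hadoop examples JAR, and every path, user name, and password is a hypothetical placeholder. The connection options in the PROC HADOOP statement (CFG=, USERNAME=, PASSWORD=, VERBOSE) are not described in this section and are assumptions that might differ by SAS release. Enclosing the class names in quotation marks is likewise an assumption.

```sas
/* Hypothetical fileref to a Hadoop configuration file; the path is a placeholder. */
filename cfg '/path/to/hadoop-config.xml';

/* The connection options here are assumptions, not part of the MAPREDUCE options above. */
proc hadoop cfg=cfg username='hadoopuser' password='XXXXXXXX' verbose;
   mapreduce
      /* INPUT= names an existing HDFS file; OUTPUT= names a new HDFS path. */
      input='/user/hadoopuser/input/sample.txt'
      output='/user/hadoopuser/wordcount_out'

      /* JAR= gives the complete local pathname of the JAR file
         that contains the classes named below. */
      jar='/local/jars/WordCount.jar'

      /* Class names are given in dot notation. */
      map='org.apache.hadoop.examples.WordCount$TokenizerMapper'
      combine='org.apache.hadoop.examples.WordCount$IntSumReducer'
      reduce='org.apache.hadoop.examples.WordCount$IntSumReducer'
      outputkey='org.apache.hadoop.io.Text'
      outputvalue='org.apache.hadoop.io.IntWritable'

      /* Optional: request two reduce tasks. */
      reducetasks=2;
run;
```

In this sketch the same class serves as both the combiner and the reducer, which works for word counting because summing partial counts gives the same total as summing all counts at once. Options that are omitted, such as PARTITIONER= or SORTCOMPARE=, are presumably left to the Hadoop defaults.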