The configuration for the Pig transformation is simple. Open the Hadoop Options tab and select the Delete outputs before executing hadoop statements check box.
The configuration needed for the Pig transformation varies from job to job. This sample job requires that you add three Pig Latin statements and four substitution parameters on the Pig Latin tab. The tab is shown in the following display:
Pig Latin Tab
Note that the Pig Latin statements are entered in the Pig Latin field. The sample job contains the following statements:
A = load '/user/test/PIG/$inputfilename' USING PigStorage(',')
AS (f1:int,f2:int,f3:int);
B1 = filter A by $filtercolumn == $filtervalue;
store B1 into '/user/test/PIG/$outputfilename' USING PigStorage(',');
Similarly, the substitution parameters are entered in the Substitution parameters field, as follows:
Name = inputfilename, Value = numbers_target.txt, Description = This is the name of the file loaded into hadoop
Name = outputfilename, Value = &output, Description = Output filename
Name = filtervalue, Value = 5
Name = filtercolumn, Value = f3
You also need to open the Hadoop Options tab and select the Delete outputs before executing hadoop statements check box. Then, enter appropriate code into the Hadoop pre-process code field, as shown in the following display:
Hadoop Options in the Pig Transformation
Finally, you need to create a new prompt on the Parameters tab for the outputfilename substitution parameter, whose value is &output. The general values for this job are Name=output and Displayed text=Pig target. The prompt type and values are Prompt type=Text and Default value=PIG_SubstitutionParamtarget.txt. The following display shows the completed Parameters tab:
Parameters Tab
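To make the effect of these settings concrete, the following Python sketch mimics how the four substitution parameters might resolve into the final Pig Latin statements. It is an illustration only, not how the Pig transformation performs substitution internally. It assumes that &output resolves to the prompt's default value, and the helper names (resolve_prompt, resolve_statements) are hypothetical.

```python
from string import Template

# Pig Latin statements from the sample job, with $name placeholders.
PIG_STATEMENTS = """\
A = load '/user/test/PIG/$inputfilename' USING PigStorage(',')
AS (f1:int,f2:int,f3:int);
B1 = filter A by $filtercolumn == $filtervalue;
store B1 into '/user/test/PIG/$outputfilename' USING PigStorage(',');
"""

# Assumption: the prompt named "output" keeps its default value at run time.
PROMPT_DEFAULTS = {"output": "PIG_SubstitutionParamtarget.txt"}

# Values as entered in the Substitution parameters field.
SUBSTITUTION_PARAMETERS = {
    "inputfilename": "numbers_target.txt",
    "outputfilename": "&output",
    "filtervalue": "5",
    "filtercolumn": "f3",
}


def resolve_prompt(value, prompts):
    """Replace an &name reference with the matching prompt value (hypothetical helper)."""
    if value.startswith("&"):
        return prompts[value[1:]]
    return value


def resolve_statements(statements, parameters, prompts):
    """Resolve prompt references first, then fill the $name placeholders."""
    resolved = {name: resolve_prompt(value, prompts) for name, value in parameters.items()}
    return Template(statements).substitute(resolved)


if __name__ == "__main__":
    print(resolve_statements(PIG_STATEMENTS, SUBSTITUTION_PARAMETERS, PROMPT_DEFAULTS))
```

Under these assumptions, the filter statement resolves to B1 = filter A by f3 == 5; and the store statement writes to /user/test/PIG/PIG_SubstitutionParamtarget.txt. Entering a different value at the prompt would change only the output file name.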