%INDHD_RUN_MODEL Syntax

%INDHD_RUN_MODEL
( INMETANAME=input-filename.SASHDMD
, SCOREPGM=model_score_program_ds2_file
<, OUTDATADIR=hdfs-directory-path>
<, OUTMETADIR=hdfs-directory-path>
<, INFILETYPE=type>
<, INPUTFILE=input-filename>
<, OUTFILEDELIMITER=file-delimiter>
<, OUTTEXTQUALIFIER=text-qualifier>
<, OUTFILETYPE=output-file-type>
<, OUTRECORDFORMAT=output-record-format>
<, FORMATFILE=user-defined-format-filename>
<, FORCEOVERWRITE=TRUE | FALSE>
<, KEEP=variable-keep-list>
<, KEEPFILENAME=keep-list-configuration-filename>
<, TRACE=YES | NO>
);

Arguments

INMETANAME=input-filename.SASHDMD

specifies the HDFS full path of the input metadata file (.sashdmd file).

Requirement The metadata file must already exist, or you must generate it with PROC HDMD before you run the %INDHD_RUN_MODEL macro. You do not have to create a metadata file for the input data file if the data file was created with a Hadoop LIBNAME statement that contains the HDFS_DATADIR= and HDFS_METADIR= options. In that case, the metadata files are generated automatically.
Interaction This file is read by the MapReduce job.
See Creating a Metadata File for the Input Data File
PROC HDMD in SAS/ACCESS for Relational Databases: Reference

SCOREPGM=model_score_program_ds2_file

specifies the name of the scoring model program file that is executed by the SAS Embedded Process.
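As a minimal sketch, a call that supplies only the two arguments not shown as optional in the syntax might look like the following. The HDFS paths and file names are hypothetical, and the sketch assumes that the connection to the Hadoop cluster has already been configured for the SAS session.

```sas
/* Hypothetical paths: replace them with the locations used at your site. */
/* All other settings come from the metadata file or from the defaults.   */
%indhd_run_model
( inmetaname=/user/demo/meta/customers.sashdmd
, scorepgm=/user/demo/models/churn/churn.ds2
);
```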

OUTDATADIR=hdfs-directory-path

specifies the name of the HDFS directory where the MapReduce job stores the output files.

Interaction The hdfs-directory-path overrides what is specified in the <outputDir> element in the input metadata file (.sashdmd file).

OUTMETADIR=hdfs-directory-path

specifies the name of the HDFS directory where the MapReduce job stores the output file metadata.

Interaction The hdfs-directory-path overrides what is specified in the <metaDir> element in the input file metadata (.sashdmd file).
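The following sketch shows how the two output directory arguments might be combined to redirect both the scored data and its metadata. All paths are hypothetical. The values given here take precedence over the <outputDir> and <metaDir> elements in the input metadata file.

```sas
/* Hypothetical paths. OUTDATADIR= and OUTMETADIR= override the        */
/* <outputDir> and <metaDir> elements in customers.sashdmd.            */
%indhd_run_model
( inmetaname=/user/demo/meta/customers.sashdmd
, scorepgm=/user/demo/models/churn/churn.ds2
, outdatadir=/user/demo/output/churn_scores
, outmetadir=/user/demo/output/churn_scores_meta
);
```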

INFILETYPE=type

specifies the type of input file. type can be one of the following:

DELIMITED

specifies a delimited file.

Note This type maps to the com.sas.access.hadoop.ep.delimited.DelimitedInputFormat input format in the <epInputFormat> element in the input file metadata (.sashdmd file).

CUSTOM

specifies a custom file.

Note This type maps to the com.sas.access.hadoop.ep.custom.CustomFileInputFormat input format in the <epInputFormat> element in the input file metadata (.sashdmd file).

CUSTOM_SEQUENCE

specifies a custom sequence file.

Note This type maps to the com.sas.access.hadoop.ep.custom.CustomSequenceFileInputFormat input format in the <epInputFormat> element in the input file metadata (.sashdmd file).

SEQUENCE

specifies a sequence file.

Note This type maps to the com.sas.access.hadoop.ep.sequence.EpSequenceFileInputFormat input format in the <epInputFormat> element in the input file metadata (.sashdmd file).

BINARY

specifies a fixed record length file.

Alias FIXED
Note This type maps to the com.sas.access.hadoop.ep.binary.FixedRecLenBinaryInputFormat input format in the <epInputFormat> element in the input file metadata (.sashdmd file).

XML

specifies an XML file.

Note This type maps to the com.sas.access.hadoop.ep.xml.XmlInputFormat input format in the <epInputFormat> element in the input file metadata (.sashdmd file).

JSON

specifies a JSON file.

Note This type maps to the com.sas.access.hadoop.ep.json.JsonInputFormat input format in the <epInputFormat> element in the input file metadata (.sashdmd file).

SPD

specifies an SPD file type.

Note This type maps to the com.sas.hadoop.ep.spd.EPSPDInputFormat input format in the <epInputFormat> element in the input file metadata (.sashdmd file).

Interaction The type overrides what is specified in the <epInputFormat> element in the input file metadata (.sashdmd file).
Note If this option is specified, the %INDHD_RUN_MODEL macro automatically matches the type with the correct input format Java class for the SAS Embedded Process. See each type for the mapping that is performed.

INPUTFILE=input-filename

specifies an HDFS fully qualified input filename. This file is read by the MapReduce job.

Interaction The input-filename overrides what is specified in the <inputDir> element in the input file metadata (.sashdmd file).
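As an illustration of the two input overrides, the sketch below reuses an existing metadata file but points the job at a different delimited data file. The paths and file names are hypothetical, and the sketch assumes that the new file has the same layout as the one described in the metadata file.

```sas
/* Hypothetical paths. INFILETYPE= overrides <epInputFormat> and       */
/* INPUTFILE= overrides <inputDir> in the input metadata file.         */
%indhd_run_model
( inmetaname=/user/demo/meta/customers.sashdmd
, scorepgm=/user/demo/models/churn/churn.ds2
, infiletype=DELIMITED
, inputfile=/user/demo/data/customers_new.csv
);
```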

OUTFILEDELIMITER=file-delimiter

specifies the delimiter for variables (fields) in the output file. You can specify the delimiter in any of the following ways:

  • ','
  • '\t'
  • ^A
  • ^Z
  • '09'x
  • 32
Default ^A
Range You can specify only a single character in the Unicode range U+0001 to U+007F.
Restriction The value of this option cannot be the same character as the OUTTEXTQUALIFIER value and cannot be a newline ('0a'x).
Requirement This option is valid only for DELIMITED. Other formats do not use it.
Note Valid values are 0–127, a comma (","), or "\t".

OUTTEXTQUALIFIER=text-qualifier

specifies the text qualifier to be used in the output data file.

Default none
Range You can specify only a single character in the Unicode range U+0001 to U+007F.
Restriction The value of this option cannot be the same character as the OUTFILEDELIMITER value and cannot be a newline ('0a'x).
Requirement This option is valid only for DELIMITED. Other formats do not use it.
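The delimiter and text qualifier apply only to delimited output, so a call that uses them might look like the following sketch. The paths are hypothetical, and the double quotation mark is only an assumed example of a text qualifier, written in the hexadecimal form that the delimiter list also uses. The two characters differ, as the restrictions require.

```sas
/* Hypothetical paths. '09'x is a tab. '22'x (a double quotation mark) */
/* is an assumed qualifier value, not one taken from this reference.   */
%indhd_run_model
( inmetaname=/user/demo/meta/customers.sashdmd
, scorepgm=/user/demo/models/churn/churn.ds2
, outfiletype=DELIMITED
, outfiledelimiter='09'x
, outtextqualifier='22'x
);
```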

OUTFILETYPE=output-file-type

specifies the output file type. output-file-type can be one of the following values:

DELIMITED

specifies a delimited file.

BINARY

specifies a fixed record length file.

Alias FIXED

SPD

specifies an SPD file.

Default If the input file type is fixed, the output file type is fixed. Otherwise, it is delimited.

OUTRECORDFORMAT=output-record-format

specifies the output record format. output-record-format can be one of the following values:

DELIMITED

specifies a delimited format.

FIXED

specifies a fixed record length format.

Default DELIMITED

FORMATFILE=user-defined-format-filename

specifies the name of the file that contains the user-defined formats that were created by the FORMAT procedure and that are referenced in the DATA step scoring model program.

Interaction This name is the same one that you specified in the %INDHD_PUBLISH_MODEL macro’s FMTCAT argument.
See FMTCAT=format-catalog-filename | libref.format-catalog-filename

FORCEOVERWRITE=TRUE | FALSE

specifies whether the output directory is deleted before the MapReduce job is executed.

Default FALSE
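For example, when a scoring job is rerun against an output directory that a previous run already populated, a call along the following lines might be used. The paths are hypothetical. Because the output directory is deleted before the job starts, any earlier results in it are lost.

```sas
/* Hypothetical paths. FORCEOVERWRITE=TRUE deletes the output          */
/* directory before the MapReduce job is executed.                     */
%indhd_run_model
( inmetaname=/user/demo/meta/customers.sashdmd
, scorepgm=/user/demo/models/churn/churn.ds2
, outdatadir=/user/demo/output/churn_scores
, forceoverwrite=TRUE
);
```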

KEEP=variable-keep-list

specifies a list of variables that the SAS score program retains.

Restriction KEEP and KEEPFILENAME are mutually exclusive.
Requirement The variables in the list must be separated by spaces, and the list must not be enclosed in single or double quotation marks.
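A keep list is written as a plain, space-separated list without quotation marks. In the following sketch, the paths and the variable names (cust_id, p_churn, and i_churn) are hypothetical.

```sas
/* Hypothetical paths and variable names. The list is separated by     */
/* spaces and is not quoted. Do not combine KEEP= with KEEPFILENAME=.  */
%indhd_run_model
( inmetaname=/user/demo/meta/customers.sashdmd
, scorepgm=/user/demo/models/churn/churn.ds2
, keep=cust_id p_churn i_churn
);
```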

KEEPFILENAME=keep-list-configuration-filename

specifies the name of an XML configuration file that contains the list of variables that are passed to the SAS score program.

The keep list configuration file should have the following format:
<configuration>
   <property>
      <name>sas.ep.ds2.keep.list</name>
      <value>var1 var2 var3 var4... varn</value>
   </property>
</configuration>
Restriction KEEP and KEEPFILENAME are mutually exclusive.
Requirement You must specify the full path.
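When the variable list is long, it can be kept in a configuration file in the format shown above and referenced by its full path. The following sketch uses hypothetical paths. This section does not state which file system the configuration file must reside on, so the location shown is an assumption.

```sas
/* Hypothetical paths. keeplist.xml is assumed to contain the          */
/* sas.ep.ds2.keep.list property. Do not combine with KEEP=.           */
%indhd_run_model
( inmetaname=/user/demo/meta/customers.sashdmd
, scorepgm=/user/demo/models/churn/churn.ds2
, keepfilename=/user/demo/config/keeplist.xml
);
```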

TRACE=YES | NO

specifies whether debug messages are displayed.

Default NO
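When a scoring job fails or produces unexpected output, the same call can be rerun with debug messages enabled. The paths in this sketch are hypothetical.

```sas
/* Hypothetical paths. TRACE=YES displays debug messages for this run. */
%indhd_run_model
( inmetaname=/user/demo/meta/customers.sashdmd
, scorepgm=/user/demo/models/churn/churn.ds2
, trace=YES
);
```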