Using SPDSCLEAN with Hadoop

SPD Server 5.2 uses a script called spdsclean_Hadoop to invoke the SPDSCLEAN functionality within a Hadoop domain. The SPDSCLEAN command for non-Hadoop domains is unchanged. The spdsclean_Hadoop script accepts the same arguments as the standard SPDSCLEAN command, plus Hadoop operational parameters that you must specify. The script also sets values for SPD Server environment variables such as PATH, LD_LIBRARY_PATH, TKPATH, and JREOPTIONS.

Hadoop SPDSCLEAN Configuration Information

To use the SPDSCLEAN (or SPDSCLEAN2) functionality within a Hadoop domain, you must provide the Hadoop configuration path and the Hadoop JAR path, just as you do when you configure SPD Server to access Hadoop environments. You can provide this information in several ways.

Configure Hadoop SPDSCLEAN via Environment Variables

You can use environment variables to specify the Hadoop CONFIG_PATH and JAR_PATH configuration parameters for SPDSCLEAN. Set the values at a UNIX command prompt before you submit the spdsclean_Hadoop script, or place the export statements in the spdsclean_Hadoop script itself.
export SAS_HADOOP_CONFIG_PATH=/u/fedadmin/hadoopcfg/cdh52p1
export SAS_HADOOP_JAR_PATH=/u/fedadmin/hadoopjars/cdh52
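Because a missing or mistyped path is easy to overlook, it can help to check both variables before you submit the script. The following is a minimal sketch, not part of SPD Server: the function name check_hadoop_env is hypothetical, and it assumes only that each variable should name an existing directory.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with SPD Server): confirm that both
# Hadoop environment variables are set and point to existing directories.
check_hadoop_env() {
    for var in SAS_HADOOP_CONFIG_PATH SAS_HADOOP_JAR_PATH; do
        # Indirectly read the value of the variable named in $var.
        eval "val=\${$var:-}"
        if [ -z "$val" ]; then
            echo "error: $var is not set" >&2
            return 1
        fi
        if [ ! -d "$val" ]; then
            echo "error: $var points to a missing directory: $val" >&2
            return 1
        fi
    done
    return 0
}
```

After the export statements, a call such as check_hadoop_env && ./spdsclean_Hadoop other-options runs the script only when both directories are present. The location of the script is an assumption here.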

Configure Hadoop SPDSCLEAN via SPDSCLEAN Options

You can modify your spdsclean_Hadoop script to use the -hadoopcfg and -hadoopjar option settings to specify the values for your Hadoop config path and JAR path.
spdsclean -hadoopcfg /u/fedadmin/hadoopcfg/cdh52p1 \
          -hadoopjar /u/fedadmin/hadoopjars/cdh52 \
          other-options;
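If you keep the two paths in shell variables, you can assemble the command line in one place and review it before running it. The following is a dry-run sketch only: the function name build_spdsclean_cmd is hypothetical, and the function prints the command rather than executing it. The -hadoopcfg and -hadoopjar options are the ones described above. Any further arguments are passed through unchanged.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the spdsclean command line that
# would be used, without executing it.
#   $1  Hadoop config path (value for -hadoopcfg)
#   $2  Hadoop JAR path    (value for -hadoopjar)
#   $3+ any other spdsclean options, passed through as-is
build_spdsclean_cmd() {
    cfg=$1
    jar=$2
    shift 2
    printf 'spdsclean -hadoopcfg %s -hadoopjar %s' "$cfg" "$jar"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}
```

For example, build_spdsclean_cmd /u/fedadmin/hadoopcfg/cdh52p1 /u/fedadmin/hadoopjars/cdh52 other-options prints the single-line form of the command shown above.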

Specify Hadoop Configuration Information for SPDSCLEAN via -libnamefile and -parmfile Options in spdsclean_Hadoop Script

You can modify your spdsclean_Hadoop script to reference Hadoop configuration information that is defined in your SPD Server libnames.parm file or in your spdsserv.parm file.
Typical content for the spdsclean_Hadoop script:
spdsclean -libnamefile libnames.parm -parmfile spdsserv.parm
Typical content for the libnames.parm file:
libname=Stuff1
 pathname=/user/userlname
 hadoopcfg=/u/fedadmin/hadoopcfg/cdh52p1
 hadoopjar=/u/fedadmin/hadoopjars/cdh52
 hadoop=yes;
Note: You do not need to specify Hadoop configuration information in both the libnames.parm file and in the spdsserv.parm file. Examples of both methods are provided for convenience.
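Before you run the script, you might want to confirm which Hadoop settings a parameter file actually contains. The following is a minimal sketch that assumes the one-setting-per-line layout of the libnames.parm example above. The function name show_hadoop_settings is hypothetical and is not an SPD Server utility.

```shell
#!/bin/sh
# Hypothetical helper: list the Hadoop-related settings found in a
# parameter file such as libnames.parm. Assumes one setting per line,
# as in the example above.
#   $1  path to the parameter file
show_hadoop_settings() {
    # Keep only hadoopcfg=, hadoopjar=, and hadoop= lines, then strip
    # leading whitespace and any trailing semicolon.
    grep -E '^[[:space:]]*(hadoopcfg|hadoopjar|hadoop)=' "$1" |
        sed -e 's/^[[:space:]]*//' -e 's/;[[:space:]]*$//'
}
```

Running show_hadoop_settings libnames.parm against the example file lists the hadoopcfg, hadoopjar, and hadoop settings, one per line. If a setting is absent from the output, the file does not define it in this layout.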