Overview: Enabling SPD Server with Hadoop

This chapter describes how the SPD Server administrator enables SPD Server with Hadoop. Only the administrator needs to make changes. Client programs do not need to change.
To enable SPD Server with Hadoop:
  1. Verify your Hadoop environment. For more information, see Checklist to Verify the Hadoop Environment.
  2. Obtain the Hadoop distribution JAR and cluster configuration files from the Hadoop cluster. For more information, see Obtaining Hadoop Distribution JAR and Configuration Files from Hadoop Cluster.
  3. Specify the Hadoop parameter file options in the libnames.parm file and the spdsserv.parm file to make the Hadoop cluster configuration files and the Hadoop distribution JAR files available to SPD Server. For more information, see Specifying the Hadoop Parameter File Options.
  4. For a MapR Hadoop distribution, you must add information to the spdsserv.parm file. On Microsoft Windows, you must also add properties to the configuration files. For more information, see Additional Configuration for a MapR Hadoop Distribution.
  5. On Microsoft Windows, to enable optimized WHERE processing, you must add a property to the mapred-site.xml configuration file. For more information, see Additional Configuration for Optimized WHERE Processing on Microsoft Windows.
  6. Update the rc.spds script. For more information, see Updating the rc.spds Script.
  7. Run basic tests to confirm that your Hadoop connections are working. For more information, see Validating the SPD Server to Hadoop Connection.
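
Before you specify the parameter file options in step 3, it can help to confirm that the files gathered in step 2 are actually in place. The following Python sketch is not part of SPD Server. It is a hypothetical pre-flight check that assumes the cluster configuration files use the common Hadoop client names (core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml) and that the distribution JAR files were copied into a single directory. The function name and the file list are assumptions for illustration. Adjust them to match what your Hadoop distribution requires.

```python
from pathlib import Path

# Assumption: typical Hadoop client configuration file names.
# Your distribution might require a different set.
EXPECTED_CONFIG_FILES = (
    "core-site.xml",
    "hdfs-site.xml",
    "mapred-site.xml",
    "yarn-site.xml",
)


def check_hadoop_files(config_dir, jar_dir):
    """Return a list of problems found; an empty list means nothing obvious is missing."""
    problems = []
    config_path = Path(config_dir)
    jar_path = Path(jar_dir)

    # Check the directory that holds the cluster configuration files.
    if not config_path.is_dir():
        problems.append(f"configuration directory not found: {config_path}")
    else:
        for name in EXPECTED_CONFIG_FILES:
            if not (config_path / name).is_file():
                problems.append(f"missing configuration file: {name}")

    # Check the directory that holds the Hadoop distribution JAR files.
    if not jar_path.is_dir():
        problems.append(f"JAR directory not found: {jar_path}")
    elif not any(jar_path.glob("*.jar")):
        problems.append(f"no JAR files found in: {jar_path}")

    return problems
```

Call check_hadoop_files with the two directories that you plan to reference from the parameter files, and resolve any reported problems before you continue. This check confirms only that the files exist. It does not validate their contents or confirm connectivity, so the tests in step 7 are still required.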