Overview of the In-Database Deployment Package for Hadoop

The in-database deployment package for Hadoop must be installed and configured on your Hadoop cluster before you can perform the following tasks:
  • Run a scoring model in the Hadoop Distributed File System (HDFS) using the SAS Scoring Accelerator for Hadoop.
    For more information about using the scoring publishing macros, see the SAS In-Database Products: User's Guide.
  • Run DATA step scoring programs in Hadoop.
    For more information, see the SAS In-Database Products: User's Guide.
  • Run DS2 threaded programs in Hadoop using the SAS In-Database Code Accelerator for Hadoop.
    For more information, see the SAS In-Database Products: User's Guide.
  • Perform data quality operations in Hadoop, transform data in Hadoop, and extract the transformed data from Hadoop for analysis in SAS, all using the SAS Data Loader for Hadoop.
    For more information, see the SAS Data Loader for Hadoop: User’s Guide.
    Note: If you are installing the SAS Data Loader for Hadoop, you must perform additional steps after you install the in-database deployment package for Hadoop. For more information, see Part 3, “Administrator’s Guide for SAS Data Loader for Hadoop”.
  • Read data from and write data to HDFS in parallel for SAS High-Performance Analytics.
    Note: For deployments that use SAS High-Performance Deployment of Hadoop for the co-located data provider and that access SASHDAT tables exclusively, SAS/ACCESS and the SAS Embedded Process are not needed.
    Note: If you are installing the SAS High-Performance Analytics environment, you must perform additional steps after you install the SAS Embedded Process. For more information, see the SAS High-Performance Analytics Infrastructure: Installation and Configuration Guide.