Support for Hadoop
Details about the minimum supported versions for Hadoop distributions and Kerberos are provided in the following table.
Unless noted otherwise:
- SAS software listed below is for SAS 9.4 (9.4 M8) with the latest hot fixes applied.
- SAS 9.4M8 and later provide Limited support for Cloudera CDH and Hortonworks HDP. Per the Cloudera Support lifecycle policy, these offerings are no longer supported.
- SAS/ACCESS Interface to Hadoop on SAS 9.4M8 supports Cloudera Data Platform (CDP) 7.2 for Public Cloud and CDP 7.1 with CDP Private Cloud Data Services 1.5.0 for Private Cloud. Bulk loading (BULKLOAD=YES) is not supported in Private Cloud.
- The HADOOPPLATFORM=SPARK option requires Spark 2.4. This option is available for use with SAS In-Database Code Accelerator for Hadoop, SAS Scoring Accelerator for Hadoop, DATA Step Processing in Hadoop, and SAS High-Performance Analytics.
- Read SAS Support for Alternative Releases of Hadoop Distributions below to understand SAS support for later versions of Hadoop distributions.
- Information about Hadoop JAR files and SAS environment variables for Hadoop is provided in the SAS 9.4 Hadoop Configuration Guide for Base SAS and SAS/ACCESS. Multiple editions of this guide are available.
SAS Products, Offerings, and Technologies | CDP 7.1 (Private Cloud) [7] [9] [14] | CDP 7.2 (Public Cloud) | MapR Distribution 6.2 | Amazon EMR 5.3 and later within 5.x [13] | Amazon EMR 6.x [13] | Microsoft Azure HDInsight 4.0 |
---|---|---|---|---|---|---|
Base SAS: FILENAME Statement for Hadoop Access Method | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Base SAS: HADOOP Procedure | ✓[4] | ✓ | ✓ | ✓ | ✓ | ✓ |
Base SAS: SQOOP Procedure | ✓[4] | Not Supported | ✓ | Not Supported | Not Supported | Not Supported |
SAS Scalable Performance Data Engine | ✓[4] [9] | ✓ | ✓ | Not Supported | Not Supported | Not Supported |
SAS Scalable Performance Data Engine SerDe | ✓[4] [9] | ✓ | ✓ [12] | Not Supported | Not Supported | Not Supported |
SAS/ACCESS Interface to Hadoop | ✓ [14] | ✓ | ✓ | ✓ | ✓ | ✓ |
SAS/ACCESS Interface to Spark | Not Supported | ✓ | Not Supported | Not Supported | Not Supported | Not Supported |
SAS In-Database Code Accelerator for Hadoop | ✓[4][5][6] | ✓ | ✓ | ✓ [13] | ✓ [13] | Not Supported |
SAS Scoring Accelerator for Hadoop | ✓[4][5][6] | ✓ | ✓ | ✓ [13] | ✓ [13] | Not Supported |
DATA Step Processing in Hadoop | ✓[4] [10] | ✓ | ✓ | ✓ | ✓ | Not Supported |
SAS Grid Manager for Hadoop | ✓[2] [4] | ✓ [2] | ✓ [3] | Not Supported | Not Supported | Not Supported |
SAS High-Performance Analytics Environment | ✓[4][5][6] | ✓ | ✓ | Not Supported | Not Supported | Not Supported |
SAS LASR Analytic Server [1] | ✓[4] | ✓ | ✓ | Not Supported | Not Supported | Not Supported |
SAS Data Loader for Hadoop | ✓[7] [8] [11] | Not Supported | Not Supported | Not Supported | Not Supported | Not Supported |
[1] In addition, Apache Hadoop 0.23, 2.4.0, and 2.7.1 and later versions are supported as the Hadoop cluster that is co-located with SAS LASR Analytic Server for access to SASHDAT on HDFS.
[2] Includes a REST API for job submission that results in better performance.
[3] Includes support for cron syntax in coordinator frequency with Oozie. This functionality is needed to schedule flows with recurring time events.
[4] On a public cloud, HDFS operations through the REST API are supported.
[5] Supported only with the HADOOPPLATFORM=SPARK option.
[6] Version 18 or later of the SAS Embedded Process for Hadoop is required.
[7] The latest SAS hot fixes are required.
[8] Version 19 or later of the SAS Embedded Process for Hadoop is required.
[9] Not supported for use in a public cloud environment because of the limitations of org.apache.hadoop.fs.FSDataOutputStream.
[10] Not supported for use in a public cloud environment.
[11] “And later” versions of CDP are not supported.
[12] Support not available at the initial release of SAS 9.4M8.
[13] For in-database processing in Amazon EMR, SAS must be deployed in Amazon Web Services (AWS).
[14] Only SAS/ACCESS Interface to Hadoop supports CDP 7.1 with CDP Private Cloud Data Services 1.5.0.
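As an illustration only, a few rows of the table above can be encoded for a scripted lookup. This is a minimal sketch, not a SAS tool: the names `PLATFORMS`, `SUPPORT_9_4_M8`, and `supported_platforms` are hypothetical, only three rows are copied, and the footnote qualifiers are deliberately left out, so the table and its footnotes remain the authority.

```python
# Column headings of the SAS 9.4M8 table, in the order shown above.
PLATFORMS = [
    "CDP 7.1 (Private Cloud)",
    "CDP 7.2 (Public Cloud)",
    "MapR Distribution 6.2",
    "Amazon EMR 5.3 and later within 5.x",
    "Amazon EMR 6.x",
    "Microsoft Azure HDInsight 4.0",
]

# Three rows copied from the table: True is a check mark, False is "Not Supported".
# Footnote qualifiers (for example [4] or [14]) are not modelled in this sketch.
SUPPORT_9_4_M8 = {
    "Base SAS: SQOOP Procedure": [True, False, True, False, False, False],
    "SAS/ACCESS Interface to Hadoop": [True, True, True, True, True, True],
    "SAS/ACCESS Interface to Spark": [False, True, False, False, False, False],
}


def supported_platforms(product):
    """Return the platforms marked as supported for a product (hypothetical helper)."""
    flags = SUPPORT_9_4_M8[product]
    return [name for name, ok in zip(PLATFORMS, flags) if ok]


if __name__ == "__main__":
    for product in SUPPORT_9_4_M8:
        print(product, "->", supported_platforms(product))
```

A lookup of this kind is a convenience for inventory scripts; any footnote attached to a cell still has to be read from the table itself.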
Hadoop Security
Details about the supported Hadoop security configurations are provided in the following table.
SAS Products, Offerings, and Technologies | Kerberos [2] | Kerberos via REST API [2] | Sentry | Knox | Ranger | HDFS Encryption |
---|---|---|---|---|---|---|
Base SAS: FILENAME Statement for Hadoop Access Method | ✓ | ✓ | n/a | ✓ | n/a | ✓ |
Base SAS: HADOOP Procedure | ✓ | Not Supported | n/a | ✓ | n/a | ✓ |
Base SAS: SQOOP Procedure | ✓ | ✓ | ✓ | Not Supported | ✓ | ✓ |
SAS Scalable Performance Data Engine | ✓ | Not Supported | ✓ | Not Supported | ✓ | ✓ |
SAS Scalable Performance Data Engine SerDe | ✓ | Not Supported | ✓ | Not Supported | ✓ | ✓ |
SAS/ACCESS Interface to Hadoop | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
SAS/ACCESS Interface to Spark | ✓ | ✓ | Not Supported | Not Supported | Not Supported | Not Supported |
SAS/ACCESS Interface to Impala | ✓ | ✓ | ✓ | n/a | n/a | ✓ |
SAS In-Database Code Accelerator for Hadoop | ✓ | ✓ | ✓ | Not Supported | ✓ | ✓ |
SAS Scoring Accelerator for Hadoop | ✓ | ✓ | ✓ | Not Supported | ✓ | ✓ |
DATA Step Processing in Hadoop | ✓ | ✓ | Not Supported | Not Supported | Not Supported | ✓ |
SAS Grid Manager for Hadoop | ✓ | ✓ | Not Supported | Not Supported | Not Supported | Not Supported |
SAS High-Performance Analytics Environment | ✓ | ✓ | ✓ | Not Supported | ✓ | ✓ |
SAS LASR Analytic Server | ✓ | Not Supported | Not Supported | Not Supported | Not Supported | Not Supported |
SAS Data Loader for Hadoop | ✓[1] | ✓[1] | ✓ | Not Supported | Not Supported | ✓ |
Footnotes:
[1] MapR security and Kerberos are not supported.
[2] Kerberos is not supported when connecting to MapR from a Windows host.
Hadoop High Availability
SAS currently supports HDFS HA and Hive HA with Cloudera distributions.
SAS Products, Offerings, and Technologies | HDFS HA | HIVE HA |
---|---|---|
Base SAS: FILENAME Statement for Hadoop Access Method | ✓ | n/a |
Base SAS: HADOOP Procedure | ✓ | n/a |
Base SAS: SQOOP Procedure | ✓ | ✓ |
SAS Scalable Performance Data Engine | ✓ | n/a |
SAS Scalable Performance Data Engine SerDe | ✓ | ✓ |
SAS/ACCESS Interface to Hadoop | ✓ | ✓ |
SAS/ACCESS Interface to Spark | ✓ | n/a |
SAS/ACCESS Interface to Impala | ✓ | n/a |
SAS In-Database Code Accelerator for Hadoop | ✓ | ✓ |
SAS Scoring Accelerator for Hadoop | ✓ | ✓ |
DATA Step Processing in Hadoop | ✓ | ✓ |
SAS Grid Manager for Hadoop | ✓ | n/a |
SAS High-Performance Analytics Environment | ✓ | ✓ |
SAS LASR Analytic Server | ✓ | n/a |
SAS Data Loader for Hadoop | ✓ | ✓ |
Details about the minimum supported versions for Hadoop distributions and Kerberos are provided in the following table.
Unless noted otherwise:
- SAS software listed below is for the seventh maintenance release of 9.4 (9.4 M7) with the latest hot fixes applied.
- The HADOOPPLATFORM=SPARK option requires Spark 2.4. This option is available for use with SAS In-Database Code Accelerator for Hadoop, SAS Scoring Accelerator for Hadoop, DATA Step Processing in Hadoop, and SAS High-Performance Analytics.
- Read SAS Support for Alternative Releases of Hadoop Distributions below to understand SAS support for later versions of Hadoop distributions.
- Information about Hadoop JAR files and SAS environment variables for Hadoop is provided in the SAS 9.4 Hadoop Configuration Guide for Base SAS and SAS/ACCESS. Multiple editions of this guide are available.
- Cloudera Data Warehouse - Data Services is supported (for Hive). SAS 9.4M7F or later is required. Bulk loading (BULKLOAD=YES) is not supported.
- SAS/ACCESS Interface to Hadoop on SAS 9.4M7 supports Cloudera Data Platform (CDP) 7.1 for Public Cloud and CDP 7.1 with CDP Private Cloud Data Services 1.5.0 for Private Cloud. Bulk loading (BULKLOAD=YES) is not supported in Private Cloud.
SAS Products, Offerings, and Technologies | CDP 7.x (Private Cloud) [10] [12] [14] | Cloudera CDH 6.2 | Hortonworks HDP 3.1 | MapR Distribution 6.1 | Amazon Web Services EMR 5.30 | Microsoft Azure HDInsight 3.6 and 4.0 |
---|---|---|---|---|---|---|
Base SAS: FILENAME Statement for Hadoop Access Method | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Base SAS: HADOOP Procedure | ✓[7] | ✓ | ✓ | ✓ | ✓ | ✓ |
Base SAS: SQOOP Procedure | ✓[7] | Not Supported | ✓ [4] | ✓ | Not Supported | Not Supported |
SAS Scalable Performance Data Engine | ✓[7] [12] | ✓ | ✓ | ✓ | Not Supported | Not Supported |
SAS Scalable Performance Data Engine SerDe | ✓[7] [12] | ✓ | ✓ | ✓ | Not Supported | Not Supported |
SAS/ACCESS Interface to Hadoop | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
SAS/ACCESS Interface to Spark | Not Supported | ✓ | ✓ | Not Supported | Not Supported | Not Supported |
SAS In-Database Code Accelerator for Hadoop | ✓[7][8][9] | ✓ | ✓ | ✓ | ✓ | Not Supported |
SAS Scoring Accelerator for Hadoop | ✓[7][8][9] | ✓ | ✓ | ✓ | ✓ | Not Supported |
DATA Step Processing in Hadoop | ✓[7] [13] | ✓ | ✓ | ✓ | ✓ | Not Supported |
SAS Grid Manager for Hadoop | ✓[2] [7] | ✓ [2] | ✓ [2] | ✓ [3] | Not Supported | Not Supported |
SAS High-Performance Analytics Environment | ✓[7][8][9] | ✓ | ✓ | ✓ | Not Supported | Not Supported |
SAS LASR Analytic Server [1] | ✓[7] | ✓ | ✓ | ✓ | Not Supported | Not Supported |
SAS Data Loader for Hadoop | ✓[10] [11] | ✓ | ✓ [6] | ✓ [5] | Not Supported | Not Supported |
[1] In addition, Apache Hadoop 0.23, 2.4.0, and 2.7.1 and later versions are supported as the Hadoop cluster that is co-located with SAS LASR Analytic Server for access to SASHDAT on HDFS.
[2] Includes a REST API for job submission that results in better performance.
[3] Includes support for cron syntax in coordinator frequency with Oozie. This functionality is needed to schedule flows with recurring time events.
[4] PROC SQOOP requires patches from Hortonworks: HDP 3.1 patch 3.1.0.29-2. See the Hortonworks site for instructions on applying the patch.
[5] Spark is not supported.
[6] Spark on SAS Servers running on Windows and the Cluster/Survive directive are not supported.
[7] On a public cloud, HDFS operations through the REST API are supported.
[8] Supported only with the HADOOPPLATFORM=SPARK option.
[9] Version 18 or later of the SAS Embedded Process for Hadoop is required.
[10] The latest SAS hot fixes are required.
[11] Version 19 or later of the SAS Embedded Process for Hadoop is required.
[12] Not supported for use in a public cloud environment because of the limitations of org.apache.hadoop.fs.FSDataOutputStream.
[13] Not supported for use in a public cloud environment.
[14] Only SAS/ACCESS Interface to Hadoop supports CDP 7.1 with CDP Private Cloud Data Services 1.5.0.
Hadoop Security
Details about the supported Hadoop security configurations are provided in the following table.
SAS Products, Offerings, and Technologies | Kerberos [2] | Kerberos via REST API [2] | Sentry | Knox | Ranger | HDFS Encryption |
---|---|---|---|---|---|---|
Base SAS: FILENAME Statement for Hadoop Access Method | ✓ | ✓ | n/a | ✓ | n/a | ✓ |
Base SAS: HADOOP Procedure | ✓ | Not Supported | n/a | ✓ | n/a | ✓ |
Base SAS: SQOOP Procedure | ✓ | ✓ | ✓ | Not Supported | ✓ | ✓ |
SAS Scalable Performance Data Engine | ✓ | Not Supported | ✓ | Not Supported | ✓ | ✓ |
SAS Scalable Performance Data Engine SerDe | ✓ | Not Supported | ✓ | Not Supported | ✓ | ✓ |
SAS/ACCESS Interface to Hadoop | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
SAS/ACCESS Interface to Spark | ✓ | ✓ | Not Supported | Not Supported | Not Supported | Not Supported |
SAS/ACCESS Interface to HAWQ | Not Supported | n/a | n/a | Not Supported | Not Supported | ✓ |
SAS/ACCESS Interface to Impala | ✓ | ✓ | ✓ | n/a | n/a | ✓ |
SAS In-Database Code Accelerator for Hadoop | ✓ | ✓ | ✓ | Not Supported | ✓ | ✓ |
SAS Scoring Accelerator for Hadoop | ✓ | ✓ | ✓ | Not Supported | ✓ | ✓ |
DATA Step Processing in Hadoop | ✓ | ✓ | Not Supported | Not Supported | Not Supported | ✓ |
SAS Grid Manager for Hadoop | ✓ | ✓ | Not Supported | Not Supported | Not Supported | Not Supported |
SAS High-Performance Analytics Environment | ✓ | ✓ | ✓ | Not Supported | ✓ | ✓ |
SAS LASR Analytic Server | ✓ | Not Supported | Not Supported | Not Supported | Not Supported | Not Supported |
SAS Data Loader for Hadoop | ✓[1] | ✓[1] | ✓ | Not Supported | Not Supported | ✓ |
Footnotes:
[1] MapR security and Kerberos are not supported.
[2] Kerberos is not supported when connecting to MapR from a Windows host.
Hadoop High Availability
SAS currently supports HDFS HA and Hive HA with Cloudera and Hortonworks distributions.
SAS Products, Offerings, and Technologies | HDFS HA | HIVE HA |
---|---|---|
Base SAS: FILENAME Statement for Hadoop Access Method | ✓ | n/a |
Base SAS: HADOOP Procedure | ✓ | n/a |
Base SAS: SQOOP Procedure | ✓ | ✓ |
SAS Scalable Performance Data Engine | ✓ | n/a |
SAS Scalable Performance Data Engine SerDe | ✓ | ✓ |
SAS/ACCESS Interface to Hadoop | ✓ | ✓ |
SAS/ACCESS Interface to Spark | ✓ | n/a |
SAS/ACCESS Interface to Impala | ✓ | n/a |
SAS In-Database Code Accelerator for Hadoop | ✓ | ✓ |
SAS Scoring Accelerator for Hadoop | ✓ | ✓ |
DATA Step Processing in Hadoop | ✓ | ✓ |
SAS Grid Manager for Hadoop | ✓ | n/a |
SAS High-Performance Analytics Environment | ✓ | ✓ |
SAS LASR Analytic Server | ✓ | n/a |
SAS Data Loader for Hadoop | ✓ | ✓ |
Details about the minimum supported versions for Hadoop distributions and Kerberos are provided in the following table.
Unless noted otherwise:
- SAS software listed below is for the sixth maintenance release of 9.4 (9.4 M6) with the latest hot fixes applied.
- The HADOOPPLATFORM=SPARK option requires Spark 2. This option is available for use with SAS In-Database Code Accelerator for Hadoop, SAS Scoring Accelerator for Hadoop, DATA Step Processing in Hadoop, and SAS High-Performance Analytics.
- Read SAS Support for Alternative Releases of Hadoop Distributions below to understand SAS support for later versions of Hadoop distributions.
- Information about Hadoop JAR files and SAS environment variables for Hadoop is provided in the SAS 9.4 Hadoop Configuration Guide for Base SAS and SAS/ACCESS. Multiple editions of this guide are available.
SAS Products, Offerings, and Technologies | Cloudera Data Platform (CDP) 7.1 [11] | Cloudera CDH 5.5 | Cloudera CDH 6.0 | Hortonworks HDP 2.4 | Hortonworks HDP 3.0 | MapR Distribution 5.2 | MapR Distribution 6.0 | Amazon Web Services EMR 5.13 | Microsoft Azure HDInsight 3.6 and 4.0 |
---|---|---|---|---|---|---|---|---|---|
Base SAS: FILENAME Statement for Hadoop Access Method | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Base SAS: HADOOP Procedure | ✓[12] | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Base SAS: SQOOP Procedure | Not Supported | ✓ | Not Supported | ✓ | ✓ [7] | ✓ | ✓ | Not Supported | Not Supported |
SAS Scalable Performance Data Engine | ✓[12][15] | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Not Supported | Not Supported |
SAS Scalable Performance Data Engine SerDe | Not Supported | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Not Supported | Not Supported |
SAS Scalable Performance Data Server | Not Supported | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Not Supported | Not Supported |
SAS/ACCESS Interface to Hadoop | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
SAS In-Database Code Accelerator for Hadoop | Not Supported | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Not Supported |
SAS Scoring Accelerator for Hadoop | Not Supported | ✓ | ✓ | ✓ | Not Supported | ✓ | ✓ | ✓ | Not Supported |
DATA Step Processing in Hadoop | ✓[12] | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Not Supported |
SAS Grid Manager for Hadoop | Not Supported | ✓ [2] | ✓ [2] | ✓ [2] | ✓ [2] | ✓ [3] | ✓ [3] | Not Supported | Not Supported |
SAS High-Performance Analytics Environment | ✓[12][13][14] | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Not Supported | Not Supported |
SAS LASR Analytic Server [1] | ✓[12] | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Not Supported | Not Supported |
SAS Data Loader for Hadoop | Not Supported | ✓ [5] | ✓ [8] | ✓ | ✓ [10] | ✓ [4] | ✓ [9] | Not Supported | Not Supported |
[1] In addition, Apache Hadoop 0.23, 2.4.0, and 2.7.1 versions are supported as the Hadoop cluster that is co-located with SAS LASR Analytic Server for access to SASHDAT on HDFS.
[2] Includes a REST API for job submission that results in better performance.
[3] Includes support for cron syntax in coordinator frequency with Oozie. This functionality is needed to schedule flows with recurring time events.
[4] Supports SAS servers running on Linux for x64 only.
[5] Spark is supported with CDH 5.7 and later releases.
[6] Supports Spark 1.6 only.
[7] PROC SQOOP requires patches from Hortonworks: HDP 3.0 patch 3.0.0.14-4 or HDP 3.1 patch 3.1.0.29-2. See the Hortonworks site for instructions on applying the patch.
[8] Supports Cloudera 6.2 and later releases. Spark 2.4 supported on Linux for x64 SAS servers only.
[9] Spark is not supported.
[10] Spark on SAS Servers running on Windows and the Cluster/Survive directive are not supported.
[11] The latest SAS hot fixes are required.
[12] Supports CDP on-premises (private cloud) deployments only.
[13] Supported only with the HADOOPPLATFORM=SPARK option.
[14] Version 18 of the SAS Embedded Process for Hadoop is required.
[15] With the exception of the use of ACCELWHERE option on Windows.
Details about the minimum supported versions for Hadoop distributions and Kerberos are provided in the following table.
Notes:
- Unless noted otherwise, SAS software listed below is for the fifth maintenance release of 9.4 (9.4 M5).
- SAS 9.4 M5 and earlier provide Limited support for Hadoop. These releases incorporate Java 7, which is not supported by the vendors of Hadoop distributions.
- Read SAS Support for Alternative Releases of Hadoop Distributions below to understand SAS support for later versions of Hadoop distributions.
- Information about Hadoop JAR files and SAS environment variables for Hadoop is provided in the SAS 9.4 Hadoop Configuration Guide for Base SAS and SAS/ACCESS. Multiple editions of this guide are available.
SAS Products, Offerings, and Technologies | Cloudera CDH | Hortonworks HDP [7] | MapR Distribution | Pivotal HD | IBM BigInsights |
---|---|---|---|---|---|
Base SAS: FILENAME Statement for Hadoop Access Method | 5.5 | 2.4 | 5.2 | 3.0 | 4.2 |
Base SAS: HADOOP Procedure | 5.5 | 2.4 | 5.2 | 3.0 | 4.2 |
Base SAS: SQOOP Procedure | 5.5 | 2.4 [9] | 5.2 | Not supported | Not supported |
SAS Scalable Performance Data Engine | 5.5 | 2.4 | 5.2 | 3.0 | 4.2 |
SAS Scalable Performance Data Engine SerDe | 5.5 | 2.4 | 5.2 | 3.0 | 4.2 |
SAS Scalable Performance Data Server | 5.5 | 2.4 | 5.2 | Not supported | Not supported |
SAS/ACCESS Interface to Hadoop | 5.5 | 2.4 | 5.2 | 3.0 | 4.2 |
SAS In-Database Code Accelerator for Hadoop | 5.5 | 2.4 | 5.2 | 3.0 [2] | 4.2 |
SAS Scoring Accelerator for Hadoop | 5.5 | 2.4 | 5.2 | 3.0 [2] | 4.2 |
DATA Step Processing in Hadoop | 5.5 | 2.4 | 5.2 | 3.0 [2] | 4.2 |
SAS Grid Manager for Hadoop | 5.2 [3] | 2.4 [3] | 4.0 [4] | Not supported | Not supported |
SAS High-Performance Analytics Environment | 5.5 | 2.4 | 5.2 | 3.0 [2] | 4.2 |
SAS LASR Analytic Server [1] | 4.7.0 | 1.3.2 | 4.0 | 2.1 [2] | 3.0 |
SAS Data Loader for Hadoop [8] | 5.5 [6] | 2.4 | 5.2 [5] | 3.0 [2] | 4.2 |
Footnotes:
[1] In addition, Apache Hadoop 0.23, 2.4.0, and 2.7.1 versions are supported for the Hadoop cluster that is co-located with SAS LASR Analytic Server for access to SASHDAT on HDFS.
[2] Only delimited text files, such as CSV files, are supported for data exchange between the Hadoop version and SAS software shown.
[3] Includes a REST API for job submission that results in better performance.
[4] Recommended. Includes support for cron syntax in coordinator frequency with Oozie. This functionality is needed in order to schedule flows with recurring time events.
[5] Only Linux for x64 SAS servers are supported.
[6] Spark is supported with CDH 5.7 and later releases.
[7] HDP 2.6.3 and later are not supported.
[8] Only Spark 1.6.x is supported.
[9] A patch must be applied to HDP 3 distributions to support PROC SQOOP with Kerberos. Contact your Hadoop vendor for more information about the patch (HDP-3.0.0.14-4).
Details about the minimum supported versions for Hadoop distributions and Kerberos are provided in the following table.
Notes:
- Unless noted otherwise, SAS software listed below is for the fourth maintenance release of 9.4 (9.4 M4).
- Kerberos supported: MIT Kerberos 5 version 1.9 or later.
- Read SAS Support for Alternative Releases of Hadoop Distributions below to understand SAS support for later versions of Hadoop distributions.
- Information about Hadoop JAR files and SAS environment variables for Hadoop is provided in the SAS 9.4 Hadoop Configuration Guide for Base SAS and SAS/ACCESS. Multiple editions of this guide are available.
SAS Products, Offerings, and Technologies | Cloudera CDH | Hortonworks HDP [8] | MapR Distribution [7] | Pivotal HD | IBM BigInsights | Notes |
---|---|---|---|---|---|---|
Base SAS: FILENAME Statement for Hadoop Access Method | 5.8 | 2.5 | 5.2 | 3.0 | 4.2 | |
Base SAS: HADOOP Procedure | 5.8 or later | 2.5 | 5.2 | 3.0 | 4.2 | |
Base SAS: SQOOP Procedure [1] | 5.8 | 2.5 | 5.2 | Not supported | Not supported | |
SAS Scalable Performance Data Engine [2] | 5.8 | 2.5 | 5.2 | 3.0 | 4.2 | |
SAS Scalable Performance Data Engine SerDe | 5.8 | 2.5 | 5.2 | 3.0 | 4.2 | Distribution must include Hive 0.13 |
SAS Scalable Performance Data Server | 5.8 | 2.5 | 5.2 | Not supported | Not supported | |
SAS/ACCESS Interface to Hadoop | 5.8 | 2.5 | 5.2 | 3.0 | 4.2 | Includes support for In-Database Procedures for Hive. |
SAS In-Database Code Accelerator for Hadoop [2] | 5.8 | 2.5 | 5.2 | 3.0 [3] | 4.2 | |
SAS Scoring Accelerator for Hadoop [2] | 5.8 | 2.5 | 5.2 | 3.0 [3] | 4.2 | |
DATA Step Processing in Hadoop | 5.8 | 2.5 | 5.2 | 3.0 [3] | 4.2 | |
SAS Grid Manager for Hadoop [2] | 5.8 [4] | 2.5 [4] [9] | 5.2 [5] | Not supported | Not supported | |
SAS High-Performance Analytics Environment [2] | 5.8 | 2.5 | 5.2 | 3.0 [3] | 4.2 | |
SAS LASR Analytic Server | 4.7.0 | 1.3.2 | 4.0 | 2.1 [3] | 3.0 [1] | Additionally, Apache Hadoop 0.23, 2.4.0, and 2.7.1 and later versions are supported as the Hadoop cluster that is co-located with SAS LASR for access to SASHDAT on HDFS. |
SAS Data Loader for Hadoop [2] | 5.8 | 2.5 | 5.2 [1] [6] | 3.0 [3] | 4.2 | Oozie must be enabled in your Hadoop cluster. |
Footnotes:
[1] Kerberos not supported.
[2] Connecting to a Kerberos-secured Hadoop cluster is not supported if the environment variable SAS_HADOOP_RESTFUL is set to 1.
[3] Only delimited text files, such as CSV files, are supported for data exchange between the Hadoop version and SAS software shown.
[4] Includes a REST API for job submission that results in better performance.
[5] Recommended. Includes support for cron syntax in coordinator frequency with Oozie. This functionality is needed in order to schedule flows with recurring time events.
[6] Only SAS servers running on Linux for x64 are supported.
[7] Kerberos not supported if connecting from a Windows host.
[8] HDP 2.6.3 and later are not supported.
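Footnote [2] above ties Kerberos support to the SAS_HADOOP_RESTFUL environment variable. The sketch below shows one way a pre-flight script might flag that combination before a job is submitted. It is not part of SAS software: the function name `kerberos_rest_conflict` and the `kerberos_secured` flag are assumptions made for illustration, and the check applies only to the products that carry footnote [2].

```python
import os


def kerberos_rest_conflict(env, kerberos_secured):
    """Return True when SAS_HADOOP_RESTFUL is set to 1 and the cluster uses Kerberos.

    `env` is any mapping of environment variables (for example os.environ).
    `kerberos_secured` is supplied by the caller; this sketch does not detect it.
    """
    restful = env.get("SAS_HADOOP_RESTFUL", "").strip() == "1"
    return bool(kerberos_secured) and restful


if __name__ == "__main__":
    if kerberos_rest_conflict(os.environ, kerberos_secured=True):
        print("SAS_HADOOP_RESTFUL=1 with a Kerberos-secured cluster: see footnote [2].")
    else:
        print("No conflict with footnote [2] detected.")
```

Passing the environment as a plain mapping keeps the check easy to exercise with test values instead of the real process environment.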
Details about the minimum supported versions for Hadoop distributions and Kerberos are provided in the following table.
Notes:
- Unless noted otherwise, SAS software listed below is for the third maintenance release of 9.4 (9.4 M3).
- Read SAS Support for Alternative Releases of Hadoop Distributions below to understand SAS support for later versions of Hadoop distributions.
- Information about Hadoop JAR files and SAS environment variables for Hadoop is provided in the SAS 9.4 Hadoop Configuration Guide for Base SAS and SAS/ACCESS. Multiple editions of this guide are available.
SAS Products, Offerings, and Technologies | Cloudera CDH [1] | Hortonworks HDP [1] [11] | MapR Distribution [1] | Pivotal HD [1] | IBM BigInsights [2] | Notes |
---|---|---|---|---|---|---|
Base SAS: FILENAME Statement for Hadoop Access Method | 4.7.0 | 1.3.2 | 3.1 | 1.1.1 | 2.1 [3] | |
Base SAS: HADOOP Procedure | 4.7.0 | 1.3.2 | 3.1 | 1.1.1 | 2.1 [3] | |
Base SAS: SQOOP Procedure [2] | 5.2.0 | 2.2 | 4.0.2 | Not supported | Not supported | Requires Oozie 4.1.0 for best results. |
SAS Scalable Performance Data Engine | 4.7.0 | 2.0 | 4.0 | 2.1 | 3.0 | |
SAS Scalable Performance Data Engine SerDe [8] | 5.2.0 | 2.2 | 4.0 | Not supported | Not supported | |
SAS Scalable Performance Data Server | 5.7.0 | 2.4 | 5.1 | Not supported | Not supported | |
SAS/ACCESS Interface to Hadoop | 4.7.0 | 1.3.2 | 3.1 | 1.1.1 | 2.1 [3] | Includes support for In-Database Procedures for Hive. |
SAS In-Database Code Accelerator for Hadoop | 5.0.0 [8] | 2.0 [8] | 4.0 [8] | 2.1 [4] | 3.0 [3] | |
SAS Scoring Accelerator for Hadoop | 4.7.0 [8] | 1.3.2 [8] | 4.0 [8] | 2.1 [4] | 3.0 [3] | |
DATA Step Processing in Hadoop | 4.7.0 | 1.3.2 | 4.0.2 | 2.1 [4] | 3.0 [3] | |
SAS Grid Manager for Hadoop | 5.4 [9] | 2.2 [9] | 4.1 [10] | Not supported | Not supported | |
SAS High-Performance Analytics Environment | 4.7.0 | 1.3.2 | 4.0 | 2.1 [4] | 3.0 [3] | |
SAS LASR Analytic Server | 4.7.0 | 1.3.2 | 4.0 | 2.1 [4] | 3.0 [3] | |
SAS Data Loader for Hadoop | View supported Hadoop distributions and complete system requirements from the SAS Data Loader for Hadoop documentation page. |
Footnotes:
[1] Kerberos supported: MIT Kerberos 5 version 1.9 or later. Connecting to a Kerberos-secured Hadoop cluster is not supported if the environment variable SAS_HADOOP_RESTFUL is set to 1.
[2] Kerberos not supported.
[3] Prior to IBM BigInsights 4.2, only delimited text files, such as CSV files, are supported for data exchange between the Hadoop version and SAS software shown.
[4] Only delimited text files, such as CSV files, are supported for data exchange between the Hadoop version and SAS software shown.
[5] VARCHAR data type is not supported.
[6] PROC DS2 and PROC FEDSQL are not supported.
[7] Supports Cloudera CDH 5.2 only, and requires 2.0 Impala server with a 2.5.22 Impala ODBC driver.
[8] Requires Hive 0.13 to support extended file types such as ORC.
[9] Includes a REST API for job submission that results in better performance.
[10] Recommended. Includes support for cron syntax in coordinator frequency with Oozie. This functionality is needed in order to schedule flows with recurring time events.
[11] HDP 2.6.3 and later are not supported.
SAS Support for Alternative Releases of Hadoop Distributions
SAS documents the specific set of Hadoop distributions supported with each SAS product release. SAS also documents the minimum release version of each distribution supported with that SAS product release. These represent the specific distribution versions that SAS used to test and validate the associated SAS product release. Each supported Hadoop distribution release is documented using the "dot" version used to test and validate the associated SAS product.
Unless otherwise noted, SAS only supports minor releases within the documented major release.
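The rule above can be read as: a release qualifies when it shares the major version of the documented minimum and is not older than that minimum. The following is a minimal sketch of that reading, assuming purely numeric dot-separated version strings. The helper names are hypothetical, and documented exceptions (for example, the footnotes stating that HDP 2.6.3 and later are not supported) are not modelled.

```python
def parse_version(text):
    """Split a dotted version string such as "5.13" into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def within_documented_major(minimum, candidate):
    """Return True if `candidate` is in the same major release as `minimum`
    and is at least as new as `minimum` (hypothetical helper)."""
    min_v = parse_version(minimum)
    cand_v = parse_version(candidate)
    same_major = cand_v[0] == min_v[0]
    return same_major and cand_v >= min_v


if __name__ == "__main__":
    print(within_documented_major("5.5", "5.13"))  # later minor release, same major
    print(within_documented_major("5.5", "6.0"))   # different major release
```

Comparing integer tuples rather than raw strings matters here: as strings, "5.13" would sort before "5.5".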
Note that if a customer elects to use a version that is later than a documented Hadoop version, configuration steps that differ from those described in the associated SAS product’s installation documentation are likely to be required. These can include, but are not necessarily limited to, the following:
- Hadoop client runtime file sets and locations.
- Hadoop configuration files settings and locations.
- Hadoop and SAS environment variable and system option settings.
Some SAS products that integrate with Hadoop may be delivered to customers with a fixed product architecture and deployment configuration design. Such a design cannot be modified to allow use with alternative Hadoop release versions.
SAS expects customers to have the appropriate skills to resolve differences between the supported release and the alternative release being used. By electing to use an alternative release, the customer acknowledges that they have made the appropriate investment into resolving the differences inherent in that alternative release.
All attempts to re-create any problematic customer scenario at SAS will be done using an officially supported Hadoop version release.
If SAS is unable to reproduce the problem, the customer will be required to perform further diagnostics on their own to isolate the problem up to and including reproducing the problem using a supported Hadoop version release.
Support for Apache Hadoop Software Distributed with SAS® Software
Certain SAS software requires a Hadoop distribution as a pre-requisite. Prior to SAS 9.4 TS1M4, as a convenience for customers, Apache Hadoop software was distributed with that SAS software, as documented in License Information for Third-Party Software Distributed with SAS® Software.
SAS provides support for the integration of Apache Hadoop delivered with SAS software. SAS does not provide support for the installation, or other aspects of the administration and operation, of Apache Hadoop. For production environments, customers should seek out a well-supported third-party distribution of Hadoop. This ensures that they can turn to a dedicated Hadoop vendor for assistance with their production Hadoop needs.