Support for Hadoop

SAS 9.4 M6

Details about the minimum supported versions for Hadoop distributions and Kerberos are provided in the following table.
Unless noted otherwise:

  • SAS software listed below is for the sixth maintenance release of 9.4 (9.4 M6) with the latest hotfixes applied.
  • The HADOOPPLATFORM=SPARK option requires Spark 2. This option is available for use with SAS In-Database Code Accelerator for Hadoop, SAS Scoring Accelerator for Hadoop, DATA Step Processing in Hadoop, and SAS High-Performance Analytics.
SAS Products, Offerings, and Technologies | Cloudera CDH 5.5 | Cloudera CDH 6.0 | Hortonworks HDP 2.4 | Hortonworks HDP 3.0 | MapR Distribution 5.2 | MapR Distribution 6.0 | Pivotal HD 2.1 | Pivotal HD 3.0 | IBM BigInsights 4.2 | Amazon Web Services EMR 5.13 | Microsoft Azure HDInsight 3.6

Base SAS: FILENAME Statement for Hadoop Access Method

Base SAS: HADOOP Procedure

Base SAS: SQOOP Procedure

Not Supported

✓ [8]

Not Supported

Not Supported

Not Supported

Not Supported

Not Supported

SAS Scalable Performance Data Engine

Not Supported

Not Supported

SAS Scalable Performance Data Engine SerDe

Not Supported

Not Supported

SAS Scalable Performance Data Server

Not Supported

Not Supported

Not Supported

Not Supported

Not Supported

SAS/ACCESS Interface to Hadoop

SAS In-Database Code Accelerator for Hadoop


Not Supported

SAS Scoring Accelerator for Hadoop

Not Supported

✓ [7]

Not Supported

DATA Step Processing in Hadoop


Not Supported

SAS Grid Manager for Hadoop

✓ [2]

✓ [2]

✓ [2]

✓ [2]

✓ [3]

✓ [3]

Not Supported

Not Supported

Not Supported

Not Supported

Not Supported

SAS High-Performance Analytics Environment


Not Supported

Not Supported

SAS LASR Analytic Server [1]

Not Supported

Not Supported

SAS Data Loader for Hadoop [6]

✓ [5]

Not Supported


✓ [4]

Not Supported

Not Supported

Not Supported

Other Hadoop Products
SAS/ACCESS Interface to Impala

DBMS Products Required:

  • Impala server version 2.6 or later
  • ODBC Driver for Impala release 2.5.34 or later
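
The two minimums above can be checked mechanically before a connection is configured. The following is a minimal Python sketch, not a SAS-provided tool: the names `parse_version` and `meets_minimum` are hypothetical, and the sketch assumes that version strings are purely numeric and dot-separated.

```python
# Hypothetical helper, not part of any SAS or Impala tooling.
# Assumes version strings are dot-separated integers, e.g. "2.5.34".

# Minimums taken from the requirements listed above.
MIN_IMPALA_SERVER = "2.6"
MIN_IMPALA_ODBC_DRIVER = "2.5.34"


def parse_version(text):
    """Convert '2.5.34' into the tuple (2, 5, 34)."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is the minimum or later."""
    a = parse_version(installed)
    b = parse_version(minimum)
    # Pad the shorter tuple with zeros so "2.6" compares equal to "2.6.0".
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b
```

Comparing integer tuples rather than raw strings matters here: as strings, "2.5.9" sorts after "2.5.34", which would wrongly accept an older driver.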

Footnotes:

[1] Additionally, Apache Hadoop 0.23, 2.4.0, and 2.7.1 and later versions are supported as the Hadoop cluster that is co-located with SAS LASR for access to SASHDAT on HDFS.
[2] Includes a REST API for job submission that results in better performance.
[3] Includes support for cron syntax in coordinator frequency with Oozie. This functionality is needed to schedule flows with recurring time events.
[4] Supports Linux for x64 SAS servers only.
[5] Spark is supported with CDH 5.7 and later releases.
[6] Supports Spark 1.6 only.
[7] The SAS Scoring Accelerator does not currently support the use of Hive tables stored in S3 when issued from a Microsoft Windows client.
[8] PROC SQOOP requires patches from Hortonworks: HDP 3.0 patch 3.0.0.14-4 or HDP 3.1 patch 3.1.0.29-2. Please see the Hortonworks site for instructions on applying the patch.

Hadoop Security

Details about the supported Hadoop security configurations are provided in the following table.

SAS Products, Offerings, and Technologies | Kerberos [2] | Kerberos via REST API [2] | Sentry | Knox | Ranger | HDFS Encryption

Base SAS: FILENAME Statement for Hadoop Access Method

n/a

n/a

Base SAS: HADOOP Procedure

Not Supported

n/a

n/a

Base SAS: SQOOP Procedure

Not Supported

SAS Scalable Performance Data Engine

Not Supported

Not Supported

SAS Scalable Performance Data Engine SerDe

Not Supported

Not Supported

SAS Scalable Performance Data Server

Not Supported

Not Supported

SAS/ACCESS Interface to Hadoop

SAS/ACCESS Interface to HAWQ

Not Supported

n/a

n/a

Not Supported

Not Supported

SAS/ACCESS Interface to Impala

n/a

n/a

SAS In-Database Code Accelerator for Hadoop

Not Supported

SAS Scoring Accelerator for Hadoop

Not Supported

DATA Step Processing in Hadoop

Not Supported

Not Supported

Not Supported

SAS Grid Manager for Hadoop

Not Supported

Not Supported

Not Supported

Not Supported

SAS High-Performance Analytics Environment

✓ 

Not Supported

SAS LASR Analytic Server

Not Supported

Not Supported

Not Supported

Not Supported

Not Supported

SAS Data Loader for Hadoop

✓[1]

✓[1]

✓[3]

Not Supported

Not Supported

 

Footnotes:

[1] Kerberos not supported for MapR

[2] Kerberos not supported if connecting to MapR from Windows host

[3] Sentry supported with CDH 5.11 and later releases

Hadoop High Availability

SAS currently supports HDFS HA and Hive HA with Cloudera and Hortonworks distributions.

SAS Products, Offerings, and Technologies | HDFS HA | Hive HA

Base SAS: FILENAME Statement for Hadoop Access Method

n/a

Base SAS: HADOOP Procedure

n/a

Base SAS: SQOOP Procedure

SAS Scalable Performance Data Engine

n/a

SAS Scalable Performance Data Engine SerDe

SAS Scalable Performance Data Server

n/a

SAS/ACCESS Interface to Hadoop

SAS/ACCESS Interface to Impala

n/a

SAS In-Database Code Accelerator for Hadoop

SAS Scoring Accelerator for Hadoop

DATA Step Processing in Hadoop

SAS Grid Manager for Hadoop

n/a

SAS High-Performance Analytics Environment

SAS LASR Analytic Server

n/a

SAS Data Loader for Hadoop

 

SAS 9.4 M5

Details about the minimum supported versions for Hadoop distributions and Kerberos are provided in the following table.

SAS Products, Offerings, and Technologies | Cloudera CDH | Hortonworks HDP | MapR Distribution | Pivotal HD | IBM BigInsights
Base SAS: FILENAME Statement for Hadoop Access Method | 5.5, 6.0 [10] | 2.4, 3.0 [7,10] | 5.2 | 3.0 | 4.2
Base SAS: HADOOP Procedure | 5.5, 6.0 [10] | 2.4, 3.0 [7,10] | 5.2 | 3.0 | 4.2
Base SAS: SQOOP Procedure | 5.5 | 2.4, 3.0 [7,9,10] | 5.2 | Not supported | Not supported
SAS Scalable Performance Data Engine | 5.5, 6.0 [10] | 2.4, 3.0 [7,10] | 5.2 | 3.0 | 4.2
SAS Scalable Performance Data Engine SerDe | 5.5, 6.0 [10] | 2.4, 3.0 [7,10] | 5.2 | 3.0 | 4.2
SAS Scalable Performance Data Server | 5.5, 6.0 [10] | 2.4, 3.0 [7,10] | 5.2 | Not supported | Not supported
SAS/ACCESS Interface to Hadoop | 5.5, 6.0 [10] | 2.4, 3.0 [7,10] | 5.2 | 3.0 | 4.2
SAS In-Database Code Accelerator for Hadoop | 5.5, 6.0 [10] | 2.4 [7] | 5.2 | 3.0 [2] | 4.2
SAS Scoring Accelerator for Hadoop | 5.5, 6.0 [10] | 2.4 [7] | 5.2 | 3.0 [2] | 4.2
DATA Step Processing in Hadoop | 5.5, 6.0 [10] | 2.4 [7] | 5.2 | 3.0 [2] | 4.2
SAS Grid Manager for Hadoop | 5.2, 6.0 [3,10] | 2.4, 3.0 [3,10] | 4.0 [4] | Not supported | Not supported
SAS High-Performance Analytics Environment | 5.5, 6.0 [10] | 2.4 [7] | 5.2 | 3.0 [2] | 4.2
SAS LASR Analytic Server [1] | 4.7.0, 6.0 [10] | 1.3.2, 3.0 [7,10] | 4.0 | 2.1 [2] | 3.0
SAS Data Loader for Hadoop [8] | 5.5 [6] | 2.4 [7] | 5.2 [5] | 3.0 [2] | 4.2

Other Hadoop Products

SAS/ACCESS Interface to HAWQ

  • DBMS Product Required: HAWQ Database version 2.0 or later

SAS/ACCESS Interface to Impala

DBMS Products Required:

  • Impala server version 2.6 or later
  • ODBC Driver for Impala release 2.5.34 or later

Footnotes:

[1] Additionally, Apache Hadoop 0.23, 2.4.0, and 2.7.1 and later versions are supported as the Hadoop cluster that is co-located with SAS LASR for access to SASHDAT on HDFS.
[2] Only delimited text files, such as CSV files, are supported for data exchange between the Hadoop version and SAS software shown.
[3] Includes a REST API for job submission that results in better performance.
[4] Recommended. Includes support for cron syntax in coordinator frequency with Oozie. This functionality is needed in order to schedule flows with recurring time events.
[5] Only Linux for x64 SAS servers are supported.
[6] Spark supported with CDH 5.7 and later releases.
[7] The installation steps outlined in SAS Note 61703 must be applied when connecting to HDP 2.6.3 and later releases.
[8] Only Spark 1.6.x is supported.
[9] A patch must be applied to HDP 3 distributions to support PROC SQOOP with Kerberos. Please contact your Hadoop vendor for more information regarding the patch (HDP-3.0.0.14-4).
[10] A hotfix is required to support CDH 6 and HDP 3. Please contact technical support for more information.
 

Hadoop Security

Details about the supported Hadoop security configurations are provided in the following table.

SAS Products, Offerings, and Technologies | Kerberos [3] | Kerberos via REST API [3] | Sentry | Knox | Ranger | HDFS Encryption

Base SAS: FILENAME Statement for Hadoop Access Method

n/a

n/a

Base SAS: HADOOP Procedure

Not Supported

n/a

n/a

Base SAS: SQOOP Procedure

 ✓

Not Supported

SAS Scalable Performance Data Engine

Not Supported

Not Supported

SAS Scalable Performance Data Engine SerDe

Not Supported

Not Supported

SAS Scalable Performance Data Server

Not Supported

Not Supported

SAS/ACCESS Interface to Hadoop

SAS/ACCESS Interface to HAWQ

Not Supported

n/a

n/a

Not Supported

Not Supported

SAS/ACCESS Interface to Impala

n/a

n/a

SAS In-Database Code Accelerator for Hadoop

Not Supported

SAS Scoring Accelerator for Hadoop

Not Supported

DATA Step Processing in Hadoop

 Not Supported

Not Supported

Not Supported

SAS Grid Manager for Hadoop

Not Supported

Not Supported

Not Supported

Not Supported

SAS High-Performance Analytics Environment

✓ 

Not Supported

SAS LASR Analytic Server

✓[1]

Not Supported

Not Supported

Not Supported

Not Supported

Not Supported

SAS Data Loader for Hadoop

✓[2]

✓[2]

✓[4]

Not Supported

Not Supported

 

Footnotes:

[1] Kerberos not supported for IBM BigInsights

[2] Kerberos not supported for MapR

[3] Kerberos not supported if connecting to MapR from Windows host

[4] Sentry supported with CDH 5.11 and later releases

Hadoop High Availability

SAS currently supports HDFS HA and Hive HA with Cloudera and Hortonworks distributions.

SAS Products, Offerings, and Technologies | HDFS HA | Hive HA

Base SAS: FILENAME Statement for Hadoop Access Method

n/a

Base SAS: HADOOP Procedure

n/a

Base SAS: SQOOP Procedure

SAS Scalable Performance Data Engine

n/a

SAS Scalable Performance Data Engine SerDe

SAS Scalable Performance Data Server

n/a

SAS/ACCESS Interface to Hadoop

SAS/ACCESS Interface to Impala

n/a

SAS In-Database Code Accelerator for Hadoop

SAS Scoring Accelerator for Hadoop

DATA Step Processing in Hadoop

SAS Grid Manager for Hadoop

n/a

SAS High-Performance Analytics Environment

SAS LASR Analytic Server

n/a

SAS Data Loader for Hadoop

 

SAS 9.4 M4

Details about the minimum supported versions for Hadoop distributions and Kerberos are provided in the following table.

SAS Products, Offerings, and Technologies | Cloudera CDH | Hortonworks HDP | MapR Distribution [7] | Pivotal HD | IBM BigInsights | Notes
Base SAS: FILENAME Statement for Hadoop Access Method | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later | 4.2 or later |
Base SAS: HADOOP Procedure | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later | 4.2 or later |
Base SAS: SQOOP Procedure [1] | 5.8 or later | 2.5 or later [8] | 5.2 or later | Not supported | Not supported |
SAS Scalable Performance Data Engine [2] | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later | 4.2 or later |
SAS Scalable Performance Data Engine SerDe | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later | 4.2 or later | Distribution must include Hive 0.13 or later.
SAS Scalable Performance Data Server | 5.8 or later | 2.5 or later [8] | 5.2 or later | Not supported | Not supported |
SAS/ACCESS Interface to Hadoop | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later | 4.2 or later | Includes support for In-Database Procedures for Hive.
SAS/ACCESS Interface to HAWQ | N/A | Not supported [8] | N/A | 3.0 or later [1] | N/A | Minimum supported version for HAWQ is 1.3.
SAS/ACCESS Interface to Impala | 5.8 or later | N/A [8] | 5.2 or later | N/A | N/A |
SAS In-Database Code Accelerator for Hadoop [2] | 5.8 or later | 2.5 or later | 5.2 or later | 3.0 or later [3] | 4.2 or later |
SAS Scoring Accelerator for Hadoop [2] | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later [3] | 4.2 or later |
DATA Step Processing in Hadoop | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later [3] | 4.2 or later |
SAS Grid Manager for Hadoop [2] | 5.8 or later [4] | 2.5 or later [4] [9] | 5.2 or later [5] | Not supported | Not supported |
SAS High-Performance Analytics Environment [2] | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later [3] | 4.2 or later |
SAS LASR Analytic Server | 4.7.0 or later | 1.3.2 or later [8] | 4.0 or later | 2.1 or later [3] | 3.0 or later [1] | Additionally, Apache Hadoop 0.23, 2.4.0, and 2.7.1 and later versions are supported as the Hadoop cluster that is co-located with SAS LASR for access to SASHDAT on HDFS.
SAS Data Loader for Hadoop [2] | 5.8 or later | 2.5 or later [8] | 5.2 or later [1] [6] | 3.0 or later [3] | 4.2 or later | Oozie must be enabled in your Hadoop cluster.

Footnotes:

[1] Kerberos not supported.
[2] Connecting to a Kerberos-secured Hadoop cluster is not supported if the environment variable SAS_HADOOP_RESTFUL is set to 1.
[3] Only delimited text files, such as CSV files, are supported for data exchange between the Hadoop version and SAS software shown.
[4] Includes a REST API for job submission that results in better performance.
[5] Recommended. Includes support for cron syntax in coordinator frequency with Oozie. This functionality is needed in order to schedule flows with recurring time events.
[6] Only Linux for x64 SAS servers are supported.
[7] Kerberos not supported if connecting from a Windows host.
[8] The installation steps outlined in SAS Note 61703 must be applied when connecting to HDP 2.6.3 and later releases.
[9] HDP 2.6.3 and later releases are not supported.
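
Footnote [2] above describes a combination that can be detected before a job is submitted. The sketch below is a hedged illustration in Python, not part of any SAS tooling: the function name `restful_kerberos_conflict` is hypothetical, and whether the cluster is Kerberos-secured is assumed to be known to the caller. Only the environment variable name `SAS_HADOOP_RESTFUL` and the value 1 come from the footnote.

```python
import os


def restful_kerberos_conflict(cluster_uses_kerberos, environ=None):
    """Return True when the unsupported combination from footnote [2] is present.

    cluster_uses_kerberos: supplied by the caller (an assumption of this sketch).
    environ: optional mapping used in place of os.environ, mainly for testing.
    """
    env = os.environ if environ is None else environ
    # The footnote names the value 1; anything else is treated as "not set to 1".
    restful_enabled = env.get("SAS_HADOOP_RESTFUL", "").strip() == "1"
    return restful_enabled and cluster_uses_kerberos
```

Passing the environment as a parameter keeps the check easy to exercise without changing the real process environment.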

SAS 9.4 M3

Details about the minimum supported versions for Hadoop distributions and Kerberos are provided in the following table.

SAS Products, Offerings, and Technologies | Cloudera CDH [1] | Hortonworks HDP [1] | MapR Distribution [1] | Pivotal HD [1] | IBM BigInsights [2] | Notes
Base SAS: FILENAME Statement for Hadoop Access Method | 4.7.0 or later | 1.3.2 or later [11] | 3.1 or later | 1.1.1 or later | 2.1 or later [3] |
Base SAS: HADOOP Procedure | 4.7.0 or later | 1.3.2 or later [11] | 3.1 or later | 1.1.1 or later | 2.1 or later [3] |
Base SAS: SQOOP Procedure [2] | 5.2.0 or later | 2.2 or later [11] | 4.0.2 or later | Not supported | Not supported | Requires Oozie 4.1.0 or later for best results.
SAS Scalable Performance Data Engine | 4.7.0 or later | 2.0 or later [11] | 4.0 or later | 2.1 or later | 3.0 or later |
SAS Scalable Performance Data Engine SerDe [8] | 5.2.0 or later | 2.2 or later [11] | 4.0 or later | Not supported | Not supported |
SAS Scalable Performance Data Server | 5.7.0 or later | 2.4 or later [11] | 5.1 or later | Not supported | Not supported |
SAS/ACCESS Interface to Hadoop | 4.7.0 or later | 1.3.2 or later [11] | 3.1 or later | 1.1.1 or later | 2.1 or later [3] | Includes support for In-Database Procedures for Hive.
SAS/ACCESS Interface to HAWQ | N/A | N/A [8] [11] | N/A | 2.1 or later [2] | N/A | Minimum supported version for HAWQ is 1.2.1.
SAS/ACCESS Interface to Impala | 5.0.0 or later | N/A [8] [11] | 4.0 or later [5] [6] | N/A | N/A | Requires a 2.0 Impala server with a 2.5.22 Impala ODBC driver to support the VARCHAR data type. Support for In-Database Procedures for Impala is limited. [7]
SAS In-Database Code Accelerator for Hadoop | 5.0.0 or later [8] | 2.0 or later [8] | 4.0 or later [8] | 2.1 or later [4] | 3.0 or later [3] |
SAS Scoring Accelerator for Hadoop | 4.7.0 or later [8] | 1.3.2 or later [8] [11] | 4.0 or later [8] | 2.1 or later [4] | 3.0 or later [3] |
DATA Step Processing in Hadoop | 4.7.0 or later | 1.3.2 or later [11] | 4.0.2 or later | 2.1 or later [4] | 3.0 or later [3] |
SAS Grid Manager for Hadoop | 5.2 or later; 5.4 or later [9] | 2.1 or later [12]; 2.2 or later [9] [12] | 4.0 or later; 4.1 or later [10] | Not supported | Not supported |
SAS High-Performance Analytics Environment | 4.7.0 or later | 1.3.2 or later [11] | 4.0 or later | 2.1 or later [4] | 3.0 or later [3] |
SAS LASR Analytic Server | 4.7.0 or later | 1.3.2 or later [11] | 4.0 or later | 2.1 or later [4] | 3.0 or later [3] |
SAS Data Loader for Hadoop | View supported Hadoop distributions and complete system requirements from the SAS Data Loader for Hadoop documentation page. [11]

 

Footnotes:

[1] Kerberos supported: MIT Kerberos 5 version 1.9 or later. Connecting to a Kerberos-secured Hadoop cluster is not supported if the environment variable SAS_HADOOP_RESTFUL is set to 1.
[2] Kerberos not supported.
[3] Prior to IBM BigInsights 4.2 only delimited text files, such as CSV files, are supported for data exchange between the Hadoop version and SAS software shown.
[4] Only delimited text files, such as CSV files, are supported for data exchange between the Hadoop version and SAS software shown.
[5] VARCHAR data type is not supported.
[6] PROC DS2 and PROC FEDSQL are not supported.
[7] Supports Cloudera CDH 5.2 only, and requires 2.0 Impala server with a 2.5.22 Impala ODBC driver.
[8] Requires Hive 0.13 to support extended file types such as ORC.
[9] Includes a REST API for job submission that results in better performance.
[10] Recommended. Includes support for cron syntax in coordinator frequency with Oozie. This functionality is needed in order to schedule flows with recurring time events.
[11] The installation steps outlined in SAS Note 61703 must be applied when connecting to HDP 2.6.3 and later releases.
[12] HDP 2.6.3 and later releases are not supported.

SAS Support for Alternative Releases of Hadoop Distributions

SAS documents the specific set of Hadoop distributions supported with each SAS product release. SAS also documents the minimum release version of each distribution supported with that SAS product release. These represent the specific distribution versions that SAS used to test and validate the associated SAS product release. Each supported Hadoop distribution release is documented using the "dot" version used to test and validate the associated SAS product.

Unless otherwise noted, SAS supports only minor releases within the documented major release.
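
One way to read this rule is sketched below in Python: a candidate release qualifies when it shares the major version of the documented release and is not earlier than it. This is an illustrative interpretation only. The function name `within_supported_major` is hypothetical, version strings are assumed to be dot-separated integers, and the "unless otherwise noted" exceptions in the footnotes are not modeled.

```python
def within_supported_major(documented, candidate):
    """Return True if candidate has the documented major version and is not earlier.

    Illustrative reading of the support policy; footnoted exceptions are ignored.
    """
    doc = tuple(int(part) for part in documented.split("."))
    cand = tuple(int(part) for part in candidate.split("."))
    # Pad with zeros so "5.5" and "5.5.0" are treated as the same release.
    width = max(len(doc), len(cand))
    doc += (0,) * (width - len(doc))
    cand += (0,) * (width - len(cand))
    # Same major release, and at or after the documented minimum.
    return cand[0] == doc[0] and cand >= doc
```

For example, with a documented minimum of 5.5, release 5.12 passes this check while 6.0 and 5.4 do not.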

Note that if a customer elects to use a version that is later than a documented Hadoop version, the configuration steps are likely to differ from those described in the associated SAS product's installation documentation. The differences can include, but are not necessarily limited to, the following:

  • Hadoop client runtime file sets and locations.
  • Hadoop configuration files settings and locations.
  • Hadoop and SAS environment variable and system option settings.

Some SAS products that integrate with Hadoop may be delivered to customers with a fixed product architecture and deployment configuration design. Such a design cannot be modified to allow use with alternative Hadoop release versions.

SAS expects customers to have the appropriate skills to resolve differences between the supported release and the alternative release being used. By electing to use an alternative release, the customer acknowledges that they have made the appropriate investment into resolving the differences inherent in that alternative release.

All attempts to re-create any problematic customer scenario at SAS will be done using an officially supported Hadoop version release.

If SAS is unable to reproduce the problem, the customer will be required to perform further diagnostics on their own to isolate the problem up to and including reproducing the problem using a supported Hadoop version release.

Support for Apache Hadoop Software Distributed with SAS® Software

Certain SAS software requires a Hadoop distribution as a prerequisite. Prior to SAS 9.4 TS1M4, as a convenience for customers, Apache Hadoop software was distributed with that SAS software, as documented in License Information for Third-Party Software Distributed with SAS® Software.

SAS provides support for the integration of Apache Hadoop delivered with SAS software. SAS does not provide support for the installation, or other aspects of the administration and operation, of Apache Hadoop. For production environments, customers should seek out a well-supported third-party distribution of Hadoop. This ensures that they can turn to a dedicated Hadoop vendor for assistance with their production Hadoop needs.