Support for Hadoop

SAS 9.4 M6

Details about the minimum supported versions for Hadoop distributions and Kerberos are provided in the following table.

SAS Products, Offerings, and Technologies | Cloudera CDH | Hortonworks HDP | MapR Distribution | Amazon EMR [7][8] | Microsoft Azure HDInsight
Base SAS: FILENAME Statement for Hadoop Access Method | 5.5, 6.0 | 2.4, 3.0 | 5.2, 6.0 | 5.13 | 3.6
Base SAS: HADOOP Procedure | 5.5, 6.0 | 2.4, 3.0 | 5.2, 6.0 | 5.13 | 3.6
Base SAS: SQOOP Procedure | 5.5 | 2.4, 3.0 | 5.2, 6.0 | Not supported | Not supported
SAS Scalable Performance Data Engine | 5.5, 6.0 | 2.4, 3.0 | 5.2, 6.0 | Not supported | Not supported
SAS Scalable Performance Data Engine SerDe | 5.5, 6.0 | 2.4, 3.0 | 5.2, 6.0 | Not supported | Not supported
SAS Scalable Performance Data Server | 5.5, 6.0 | 2.4, 3.0 | 5.2, 6.0 | Not supported | Not supported
SAS/ACCESS Interface to Hadoop | 5.5, 6.0 | 2.4, 3.0 | 5.2, 6.0 | 5.13 | 3.6
SAS In-Database Code Accelerator for Hadoop | 5.5, 6.0 | 2.4 | 5.2, 6.0 | 5.13 | Not supported
SAS Scoring Accelerator for Hadoop | 5.5, 6.0 | 2.4 | 5.2, 6.0 | 5.13 | Not supported
DATA Step Processing in Hadoop | 5.5, 6.0 | 2.4 | 5.2, 6.0 | 5.13 | Not supported
SAS Grid Manager for Hadoop | 5.2 [2] | 2.4 [2] | 4.0 [3] | Not supported | Not supported
SAS High-Performance Analytics Environment | 5.5, 6.0 | 2.4 | 5.2, 6.0 | Not supported | Not supported
SAS LASR Analytic Server [1] | 5.5 | 2.4 | 5.2 | Not supported | Not supported
SAS Data Loader for Hadoop [6] | 5.5 [5] | 2.4 | 5.2 [4] | Not supported | Not supported

Other Hadoop Products

SAS/ACCESS Interface to HAWQ

  • DBMS Product Required: HAWQ Database version 2.0 or later

SAS/ACCESS Interface to Impala

DBMS Products Required:

  • Impala server version 2.6 or later
  • ODBC Driver for Impala release 2.5.41 or later

Footnotes:

[1] Additionally, Apache Hadoop 0.23, 2.4.0, and 2.7.1 and later versions are supported as the Hadoop cluster that is co-located with SAS LASR for access to SASHDAT on HDFS.
[2] Includes a REST API for job submission that results in better performance.
[3] Recommended. Includes support for cron syntax in coordinator frequency with Oozie. This functionality is needed in order to schedule flows with recurring time events.
[4] Only Linux for x64 SAS servers are supported.
[5] Spark supported with CDH 5.7 and later releases.
[6] Only Spark 1.6.x is supported.
[7] Amazon S3 is not supported.
[8] Spark is not supported.
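
For scripted pre-installation checks, the SAS 9.4 M6 matrix above can be held as a small lookup structure. The following is only a sketch: the names `DOCUMENTED_VERSIONS_M6` and `documented_versions` are hypothetical, only two rows are transcribed, and an empty list is an assumed way to encode "Not supported".

```python
# Two rows transcribed from the SAS 9.4 M6 matrix; the remaining rows
# would follow the same shape. An empty list stands for "Not supported"
# (an encoding chosen for this sketch, not a SAS convention).
DOCUMENTED_VERSIONS_M6 = {
    "SAS/ACCESS Interface to Hadoop": {
        "Cloudera CDH": ["5.5", "6.0"],
        "Hortonworks HDP": ["2.4", "3.0"],
        "MapR Distribution": ["5.2", "6.0"],
        "Amazon EMR": ["5.13"],
        "Microsoft Azure HDInsight": ["3.6"],
    },
    "Base SAS: SQOOP Procedure": {
        "Cloudera CDH": ["5.5"],
        "Hortonworks HDP": ["2.4", "3.0"],
        "MapR Distribution": ["5.2", "6.0"],
        "Amazon EMR": [],
        "Microsoft Azure HDInsight": [],
    },
}


def documented_versions(product, distribution):
    """Return the documented versions for a product/distribution pair.

    Unknown products or distributions yield an empty list, the same
    value used here for "Not supported".
    """
    return DOCUMENTED_VERSIONS_M6.get(product, {}).get(distribution, [])
```

Calling `documented_versions("SAS/ACCESS Interface to Hadoop", "Amazon EMR")` returns `["5.13"]`, while the SQOOP Procedure on Amazon EMR returns an empty list.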

Hadoop Security

Details about the supported Hadoop security configurations are provided in the following table.

SAS Products, Offerings, and Technologies | Kerberos [2] | Kerberos via REST API [2] | Sentry | Knox | Ranger | HDFS Encryption

Base SAS: FILENAME Statement for Hadoop Access Method

n/a

n/a

Base SAS: HADOOP Procedure

Not Supported

n/a

n/a

Base SAS: SQOOP Procedure

Not Supported

SAS Scalable Performance Data Engine

Not Supported

Not Supported

SAS Scalable Performance Data Engine SerDe

Not Supported

Not Supported

SAS Scalable Performance Data Server

Not Supported

Not Supported

SAS/ACCESS Interface to Hadoop

SAS/ACCESS Interface to HAWQ

Not Supported

n/a

n/a

Not Supported

Not Supported

SAS/ACCESS Interface to Impala

n/a

n/a

SAS In-Database Code Accelerator for Hadoop

Not Supported

SAS Scoring Accelerator for Hadoop

Not Supported

DATA Step Processing in Hadoop

Not Supported

Not Supported

Not Supported

SAS Grid Manager for Hadoop

Not Supported

Not Supported

Not Supported

Not Supported

SAS High-Performance Analytics Environment

✓ 

Not Supported

SAS LASR Analytic Server

Not Supported

Not Supported

Not Supported

Not Supported

Not Supported

SAS Data Loader for Hadoop

✓[1]

✓[1]

✓[3]

Not Supported

Not Supported

 

Footnotes:

[1] Kerberos not supported for MapR

[2] Kerberos not supported if connecting to MapR from Windows host

[3] Sentry supported with CDH 5.11 and later releases

Hadoop High Availability

SAS currently supports HDFS HA and Hive HA with Cloudera and Hortonworks distributions.

SAS Products, Offerings, and Technologies | HDFS HA | Hive HA

Base SAS: FILENAME Statement for Hadoop Access Method

n/a

Base SAS: HADOOP Procedure

n/a

Base SAS: SQOOP Procedure

SAS Scalable Performance Data Engine

n/a

SAS Scalable Performance Data Engine SerDe

SAS Scalable Performance Data Server

n/a

SAS/ACCESS Interface to Hadoop

SAS/ACCESS Interface to Impala

n/a

SAS In-Database Code Accelerator for Hadoop

SAS Scoring Accelerator for Hadoop

DATA Step Processing in Hadoop

SAS Grid Manager for Hadoop

n/a

SAS High-Performance Analytics Environment

SAS LASR Analytic Server

n/a

SAS Data Loader for Hadoop

 

SAS 9.4 M5

Details about the minimum supported versions for Hadoop distributions and Kerberos are provided in the following table.

SAS Products, Offerings, and Technologies | Cloudera CDH | Hortonworks HDP | MapR Distribution | Pivotal HD | IBM BigInsights
Base SAS: FILENAME Statement for Hadoop Access Method | 5.5 | 2.4 [7] | 5.2 | 3.0 | 4.2
Base SAS: HADOOP Procedure | 5.5 | 2.4 [7] | 5.2 | 3.0 | 4.2
Base SAS: SQOOP Procedure | 5.5 | 2.4 [7] | 5.2 | Not supported | Not supported
SAS Scalable Performance Data Engine | 5.5 | 2.4 [7] | 5.2 | 3.0 | 4.2
SAS Scalable Performance Data Engine SerDe | 5.5 | 2.4 [7] | 5.2 | 3.0 | 4.2
SAS Scalable Performance Data Server | 5.5 | 2.4 [7] | 5.2 | Not supported | Not supported
SAS/ACCESS Interface to Hadoop | 5.5 | 2.4 [7] | 5.2 | 3.0 | 4.2
SAS In-Database Code Accelerator for Hadoop | 5.5 | 2.4 [7] | 5.2 | 3.0 [2] | 4.2
SAS Scoring Accelerator for Hadoop | 5.5 | 2.4 [7] | 5.2 | 3.0 [2] | 4.2
DATA Step Processing in Hadoop | 5.5 | 2.4 [7] | 5.2 | 3.0 [2] | 4.2
SAS Grid Manager for Hadoop | 5.2 [3] | 2.4 [3] | 4.0 [4] | Not supported | Not supported
SAS High-Performance Analytics Environment | 5.5 | 2.4 [7] | 5.2 | 3.0 [2] | 4.2
SAS LASR Analytic Server [1] | 4.7.0 | 1.3.2 [7] | 4.0 | 2.1 [2] | 3.0
SAS Data Loader for Hadoop [8] | 5.5 [6] | 2.4 [7] | 5.2 [5] | 3.0 [2] | 4.2

Other Hadoop Products

SAS/ACCESS Interface to HAWQ

  • DBMS Product Required: HAWQ Database version 2.0 or later

SAS/ACCESS Interface to Impala

DBMS Products Required:

  • Impala server version 2.6 or later
  • ODBC Driver for Impala release 2.5.34 or later

Footnotes:

[1] Additionally, Apache Hadoop 0.23, 2.4.0, and 2.7.1 and later versions are supported as the Hadoop cluster that is co-located with SAS LASR for access to SASHDAT on HDFS.
[2] Only delimited text files, such as CSV files, are supported for data exchange between the Hadoop version and SAS software shown.
[3] Includes a REST API for job submission that results in better performance.
[4] Recommended. Includes support for cron syntax in coordinator frequency with Oozie. This functionality is needed in order to schedule flows with recurring time events.
[5] Only Linux for x64 SAS servers are supported.
[6] Spark supported with CDH 5.7 and later releases.
[7] The installation steps outlined in SAS Note 61703 must be applied when connecting to HDP 2.6.3 and later releases.
[8] Only Spark 1.6.x is supported.

Hadoop Security

Details about the supported Hadoop security configurations are provided in the following table.

SAS Products, Offerings, and Technologies | Kerberos [3] | Kerberos via REST API [3] | Sentry | Knox | Ranger | HDFS Encryption

Base SAS: FILENAME Statement for Hadoop Access Method

n/a

n/a

Base SAS: HADOOP Procedure

Not Supported

n/a

n/a

Base SAS: SQOOP Procedure

 ✓

Not Supported

SAS Scalable Performance Data Engine

Not Supported

Not Supported

SAS Scalable Performance Data Engine SerDe

Not Supported

Not Supported

SAS Scalable Performance Data Server

Not Supported

Not Supported

SAS/ACCESS Interface to Hadoop

SAS/ACCESS Interface to HAWQ

Not Supported

n/a

n/a

Not Supported

Not Supported

SAS/ACCESS Interface to Impala

n/a

n/a

SAS In-Database Code Accelerator for Hadoop

Not Supported

SAS Scoring Accelerator for Hadoop

Not Supported

DATA Step Processing in Hadoop

 Not Supported

Not Supported

Not Supported

SAS Grid Manager for Hadoop

Not Supported

Not Supported

Not Supported

Not Supported

SAS High-Performance Analytics Environment

✓ 

Not Supported

SAS LASR Analytic Server

✓[1]

Not Supported

Not Supported

Not Supported

Not Supported

Not Supported

SAS Data Loader for Hadoop

✓[2]

✓[2]

✓[4]

Not Supported

Not Supported

 

Footnotes:

[1] Kerberos not supported for IBM BigInsights

[2] Kerberos not supported for MapR

[3] Kerberos not supported if connecting to MapR from Windows host

[4] Sentry supported with CDH 5.11 and later releases

Hadoop High Availability

SAS currently supports HDFS HA and Hive HA with Cloudera and Hortonworks distributions.

SAS Products, Offerings, and Technologies | HDFS HA | Hive HA

Base SAS: FILENAME Statement for Hadoop Access Method

n/a

Base SAS: HADOOP Procedure

n/a

Base SAS: SQOOP Procedure

SAS Scalable Performance Data Engine

n/a

SAS Scalable Performance Data Engine SerDe

SAS Scalable Performance Data Server

n/a

SAS/ACCESS Interface to Hadoop

SAS/ACCESS Interface to Impala

n/a

SAS In-Database Code Accelerator for Hadoop

SAS Scoring Accelerator for Hadoop

DATA Step Processing in Hadoop

SAS Grid Manager for Hadoop

n/a

SAS High-Performance Analytics Environment

SAS LASR Analytic Server

n/a

SAS Data Loader for Hadoop

 

SAS 9.4 M4

Details about the minimum supported versions for Hadoop distributions and Kerberos are provided in the following table.

SAS Products, Offerings, and Technologies | Cloudera CDH | Hortonworks HDP | MapR Distribution [7] | Pivotal HD | IBM BigInsights | Notes
Base SAS: FILENAME Statement for Hadoop Access Method | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later | 4.2 or later |
Base SAS: HADOOP Procedure | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later | 4.2 or later |
Base SAS: SQOOP Procedure [1] | 5.8 or later | 2.5 or later [8] | 5.2 or later | Not supported | Not supported |
SAS Scalable Performance Data Engine [2] | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later | 4.2 or later |
SAS Scalable Performance Data Engine SerDe | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later | 4.2 or later | Distribution must include Hive 0.13 or later.
SAS Scalable Performance Data Server | 5.8 or later | 2.5 or later [8] | 5.2 or later | Not supported | Not supported |
SAS/ACCESS Interface to Hadoop | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later | 4.2 or later | Includes support for In-Database Procedures for Hive.
SAS/ACCESS Interface to HAWQ | N/A | Not supported [8] | N/A | 3.0 or later [1] | N/A | Minimum supported version for HAWQ is 1.3.
SAS/ACCESS Interface to Impala | 5.8 or later | N/A [8] | 5.2 or later | N/A | N/A |
SAS In-Database Code Accelerator for Hadoop [2] | 5.8 or later | 2.5 or later | 5.2 or later | 3.0 or later [3] | 4.2 or later |
SAS Scoring Accelerator for Hadoop [2] | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later [3] | 4.2 or later |
DATA Step Processing in Hadoop | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later [3] | 4.2 or later |
SAS Grid Manager for Hadoop [2] | 5.8 or later [4] | 2.5 or later [4] [9] | 5.2 or later [5] | Not supported | Not supported |
SAS High-Performance Analytics Environment [2] | 5.8 or later | 2.5 or later [8] | 5.2 or later | 3.0 or later [3] | 4.2 or later |
SAS LASR Analytic Server | 4.7.0 or later | 1.3.2 or later [8] | 4.0 or later | 2.1 or later [3] | 3.0 or later [1] | Additionally, Apache Hadoop 0.23, 2.4.0, and 2.7.1 and later versions are supported as the Hadoop cluster that is co-located with SAS LASR for access to SASHDAT on HDFS.
SAS Data Loader for Hadoop [2] | 5.8 or later | 2.5 or later [8] | 5.2 or later [1] [6] | 3.0 or later [3] | 4.2 or later | Oozie must be enabled in your Hadoop cluster.

Footnotes:

[1] Kerberos not supported.
[2] Connecting to a Kerberos-secured Hadoop cluster is not supported if the environment variable SAS_HADOOP_RESTFUL is set to 1.
[3] Only delimited text files, such as CSV files, are supported for data exchange between the Hadoop version and SAS software shown.
[4] Includes a REST API for job submission that results in better performance.
[5] Recommended. Includes support for cron syntax in coordinator frequency with Oozie. This functionality is needed in order to schedule flows with recurring time events.
[6] Only Linux for x64 SAS servers are supported.
[7] Kerberos not supported if connecting from a Windows host.
[8] The installation steps outlined in SAS Note 61703 must be applied when connecting to HDP 2.6.3 and later releases.
[9] HDP 2.6.3 and later releases are not supported.
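
Footnote [2] above ties Kerberos support to the SAS_HADOOP_RESTFUL environment variable. A pre-flight script could flag that combination before a connection is attempted. The sketch below assumes the variable is visible in the environment of the process doing the check; the function name `kerberos_blocked_by_rest` is hypothetical and is not part of any SAS tooling.

```python
import os


def kerberos_blocked_by_rest(env=None):
    """Return True when SAS_HADOOP_RESTFUL is set to 1.

    Per footnote [2], connecting to a Kerberos-secured Hadoop cluster
    is not supported in that case. `env` defaults to the current
    process environment; pass a dict to check other settings.
    """
    env = os.environ if env is None else env
    return env.get("SAS_HADOOP_RESTFUL", "").strip() == "1"
```

A caller would typically run this check only for the products that carry footnote [2] and only when the target cluster is Kerberos-secured.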

SAS 9.4 M3

Details about the minimum supported versions for Hadoop distributions and Kerberos are provided in the following table.

SAS Products, Offerings, and Technologies | Cloudera CDH [1] | Hortonworks HDP [1] | MapR Distribution [1] | Pivotal HD [1] | IBM BigInsights [2] | Notes
Base SAS: FILENAME Statement for Hadoop Access Method | 4.7.0 or later | 1.3.2 or later [11] | 3.1 or later | 1.1.1 or later | 2.1 or later [3] |
Base SAS: HADOOP Procedure | 4.7.0 or later | 1.3.2 or later [11] | 3.1 or later | 1.1.1 or later | 2.1 or later [3] |
Base SAS: SQOOP Procedure [2] | 5.2.0 or later | 2.2 or later [11] | 4.0.2 or later | Not supported | Not supported | Requires Oozie 4.1.0 or later for best results.
SAS Scalable Performance Data Engine | 4.7.0 or later | 2.0 or later [11] | 4.0 or later | 2.1 or later | 3.0 or later |
SAS Scalable Performance Data Engine SerDe [8] | 5.2.0 or later | 2.2 or later [11] | 4.0 or later | Not supported | Not supported |
SAS Scalable Performance Data Server | 5.7.0 or later | 2.4 or later [11] | 5.1 or later | Not supported | Not supported |
SAS/ACCESS Interface to Hadoop | 4.7.0 or later | 1.3.2 or later [11] | 3.1 or later | 1.1.1 or later | 2.1 or later [3] | Includes support for In-Database Procedures for Hive.
SAS/ACCESS Interface to HAWQ | N/A | N/A [8] [11] | N/A | 2.1 or later [2] | N/A | Minimum supported version for HAWQ is 1.2.1.
SAS/ACCESS Interface to Impala | 5.0.0 or later | N/A [8] [11] | 4.0 or later [5] [6] | N/A | N/A | Requires a 2.0 Impala server with a 2.5.22 Impala ODBC driver to support the VARCHAR data type. Support for In-Database Procedures for Impala is limited. [7]
SAS In-Database Code Accelerator for Hadoop | 5.0.0 or later [8] | 2.0 or later [8] | 4.0 or later [8] | 2.1 or later [4] | 3.0 or later [3] |
SAS Scoring Accelerator for Hadoop | 4.7.0 or later [8] | 1.3.2 or later [8] [11] | 4.0 or later [8] | 2.1 or later [4] | 3.0 or later [3] |
DATA Step Processing in Hadoop | 4.7.0 or later | 1.3.2 or later [11] | 4.0.2 or later | 2.1 or later [4] | 3.0 or later [3] |
SAS Grid Manager for Hadoop | 5.2 or later; 5.4 or later [9] | 2.1 or later [12]; 2.2 or later [9] [12] | 4.0 or later; 4.1 or later [10] | Not supported | Not supported |
SAS High-Performance Analytics Environment | 4.7.0 or later | 1.3.2 or later [11] | 4.0 or later | 2.1 or later [4] | 3.0 or later [3] |
SAS LASR Analytic Server | 4.7.0 or later | 1.3.2 or later [11] | 4.0 or later | 2.1 or later [4] | 3.0 or later [3] |
SAS Data Loader for Hadoop | View supported Hadoop distributions and complete system requirements from the SAS Data Loader for Hadoop documentation page. [11]

 

Footnotes:

[1] Kerberos supported: MIT Kerberos 5 version 1.9 or later. Connecting to a Kerberos-secured Hadoop cluster is not supported if the environment variable SAS_HADOOP_RESTFUL is set to 1.
[2] Kerberos not supported.
[3] Prior to IBM BigInsights 4.2 only delimited text files, such as CSV files, are supported for data exchange between the Hadoop version and SAS software shown.
[4] Only delimited text files, such as CSV files, are supported for data exchange between the Hadoop version and SAS software shown.
[5] VARCHAR data type is not supported.
[6] PROC DS2 and PROC FEDSQL are not supported.
[7] Supports Cloudera CDH 5.2 only, and requires a 2.0 Impala server with a 2.5.22 Impala ODBC driver.
[8] Requires Hive 0.13 to support extended file types such as ORC.
[9] Includes a REST API for job submission that results in better performance.
[10] Recommended. Includes support for cron syntax in coordinator frequency with Oozie. This functionality is needed in order to schedule flows with recurring time events.
[11] The installation steps outlined in SAS Note 61703 must be applied when connecting to HDP 2.6.3 and later releases.
[12] HDP 2.6.3 and later releases are not supported.

SAS Support for Alternative Releases of Hadoop Distributions

SAS documents the specific set of Hadoop distributions supported with each SAS product release. SAS also documents the minimum release version of each distribution supported with that SAS product release. These represent the specific distribution versions that SAS used to test and validate the associated SAS product release. Each supported Hadoop distribution release is documented using the "dot" version used to test and validate the associated SAS product.

Unless otherwise noted, SAS only supports minor releases within the documented major release.
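
One way to read this rule is that a candidate release qualifies when it shares the documented release's major version and is not older than it. The sketch below encodes that reading only; it is an interpretation, not SAS's official check, and the function names are hypothetical.

```python
def parse_version(text):
    """Split a dotted version such as '5.13' into a tuple of ints: (5, 13)."""
    return tuple(int(part) for part in text.strip().split("."))


def within_documented_major(documented, candidate):
    """Return True if `candidate` has the same major version as
    `documented` and is the same release or a later one.

    Versions are zero-padded to equal length first, so '5.2' and
    '5.2.0' compare as equal.
    """
    doc = parse_version(documented)
    cand = parse_version(candidate)
    width = max(len(doc), len(cand))
    doc += (0,) * (width - len(doc))
    cand += (0,) * (width - len(cand))
    return cand[0] == doc[0] and cand >= doc
```

With a documented release of 5.5, `within_documented_major("5.5", "5.13")` returns `True`, while `"6.0"` and `"5.4"` both return `False`. Components are compared numerically, so 5.13 correctly ranks above 5.5.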

Please note that if a customer elects to use a version that is later than a documented Hadoop version, configuration steps different from those described in the associated SAS product’s installation documentation are likely to be required. These can include, but are not necessarily limited to:

  • Hadoop client runtime file sets and locations.
  • Hadoop configuration files settings and locations.
  • Hadoop and SAS environment variable and system option settings.

Some SAS products that integrate with Hadoop may be delivered to customers with a fixed product architecture and deployment configuration design. Such a design cannot be modified to allow use with alternative Hadoop release versions.

SAS expects customers to have the appropriate skills to resolve differences between the supported release and the alternative release being used. By electing to use an alternative release, the customer acknowledges that they have made the appropriate investment into resolving the differences inherent in that alternative release.

All attempts to re-create any problematic customer scenario at SAS will be done using an officially supported Hadoop version release.

If SAS is unable to reproduce the problem, the customer will be required to perform further diagnostics on their own to isolate the problem up to and including reproducing the problem using a supported Hadoop version release.

Support for Apache Hadoop Software Distributed with SAS® Software

Certain SAS software requires a Hadoop distribution as a prerequisite. Prior to SAS 9.4 TS1M4, as a convenience for customers, Apache Hadoop software was distributed with that SAS software, as documented in License Information for Third-Party Software Distributed with SAS® Software.

SAS provides support for the integration of Apache Hadoop delivered with SAS software. SAS does not provide support for the installation, or other aspects of the administration and operation, of Apache Hadoop. For production environments, customers should seek out a well-supported third-party distribution of Hadoop. This ensures that they can turn to a dedicated Hadoop vendor for assistance with their production Hadoop needs.
