Performance Considerations for Hadoop Transformations

SAS Technical Papers about Hadoop

For a list of papers highlighting the most current tips, tricks, and ways to improve your performance with Hadoop, see Hadoop and SAS. For additional issues specific to SAS Data Integration Studio, see the following topics.

Use INSERT INTO SQL for Hive

When appending rows from a Hive source table to a Hive target table, use the SQL INSERT INTO statement rather than the APPEND procedure for optimal performance. Accordingly, when you append rows from a Hive source table to a Hive target table in a SAS Data Integration Studio job, use the SQL INSERT transformation.
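
As a rough illustration of the recommendation above, the following sketch submits an INSERT INTO statement to Hive through explicit SQL pass-through, so the append is executed by Hive itself. The server name, port, schema, user, and the table names source_table and target_table are hypothetical placeholders, and the code that a SAS Data Integration Studio transformation actually generates might differ from this sketch.

```sas
/* Minimal sketch: append rows from one Hive table to another with
   INSERT INTO, submitted through explicit SQL pass-through.
   All connection values and table names below are hypothetical. */
proc sql;
   /* Connect to Hive through SAS/ACCESS Interface to Hadoop */
   connect to hadoop (server="hive-host.example.com" port=10000
                      schema=default user=myuser);

   /* The statement inside EXECUTE is passed to Hive as written */
   execute (
      insert into table target_table
      select * from source_table
   ) by hadoop;

   disconnect from hadoop;
quit;
```

This sketch assumes that both tables already exist in the same Hive schema and have compatible column definitions.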
Last updated: January 16, 2018