This course teaches you how to use SAS programming methods to read, write, and manipulate Hadoop data. The Base SAS methods covered include reading and writing raw data with the DATA step, and managing the Hadoop file system and executing Pig code from SAS with the HADOOP procedure. The course also discusses the SAS/ACCESS Interface to Hadoop methods that provide LIBNAME access and SQL pass-through techniques for reading and writing Hadoop Hive table structures. A brief overview of additional SAS and Hadoop technologies is included but not covered in detail: executing DS2 in Hadoop with the SAS Code Accelerator for Hadoop, and using Hadoop data sources with the distributed in-memory analytics platform of SAS Viya. This course is included in Expert Exchange on Hadoop: Using SAS/ACCESS, a service offering for configuring SAS/ACCESS Interface to Hadoop or SAS/ACCESS Interface to Impala to work with your Hadoop environment.
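As a rough illustration of the Base SAS methods mentioned above, the following sketch creates an HDFS directory, writes and reads a delimited file through the Hadoop access method of the FILENAME statement, and submits a Pig script with the HADOOP procedure. All paths, the user name, and the Pig script are hypothetical. The sketch also assumes that the Hadoop client JAR and configuration files are already visible to SAS (for example, through the SAS_HADOOP_JAR_PATH and SAS_HADOOP_CONFIG_PATH environment variables). Some releases may need a CFG= option or other site-specific settings instead.

```sas
/* Hypothetical paths and user name throughout; adjust for your site. */

/* Create a working directory in HDFS. */
proc hadoop username='student' verbose;
  hdfs mkdir='/user/student/demo';
run;

/* Write a comma-delimited file to HDFS with the DATA step. */
filename hdpout hadoop '/user/student/demo/class.txt' user='student';

data _null_;
  set sashelp.class;
  file hdpout dlm=',';
  put name age height;
run;

/* Read the same HDFS file back into a SAS data set. */
filename hdpin hadoop '/user/student/demo/class.txt' user='student';

data work.class_copy;
  infile hdpin dlm=',';
  input name :$8. age height;
run;

/* Submit a Pig script stored on the SAS machine (hypothetical file). */
filename pigcode '/home/student/scripts/filter.pig';

proc hadoop username='student' verbose;
  pig code=pigcode;
run;
```

The exact connection options depend on the SAS release and on how the Hadoop cluster is secured, so treat this as a starting point rather than a working configuration.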
Learn how to
- Read and write Hadoop files with the FILENAME statement.
- Execute and use Hadoop commands with the HADOOP procedure.
- Invoke the execution of Pig programs in Hadoop within a SAS program.
- Access Hadoop distributions using the LIBNAME statement and the SQL pass-through facility.
- Create and use SQL procedure pass-through queries.
- Use options and efficiency techniques for optimizing data access performance.
- Join data using the SQL procedure and the DATA step.
- Use Base SAS procedures with Hadoop.
- Modify DS2 programs to execute in-database in Hadoop.
- Use data in Hadoop as disk storage for SAS Viya in-memory tables.
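The LIBNAME and SQL pass-through objectives in the list above might look something like the following sketch. The server name, port, schema, user, and the `orders` and `orders_stage` tables are assumptions for illustration only. SAS/ACCESS Interface to Hadoop must be licensed and configured for this code to run.

```sas
/* Hypothetical Hive server, schema, user, and table names. */
libname hdp hadoop server='hive.example.com' port=10000
        schema=default user=student;

/* LIBNAME access: the Hive table is referenced like a SAS data set. */
proc means data=hdp.orders n mean;
  var amount;
run;

/* Explicit SQL pass-through: the inner query is sent to Hive as written. */
proc sql;
  connect to hadoop (server='hive.example.com' port=10000
                     schema=default user=student);

  /* Bring an aggregated result back into a SAS table. */
  create table work.customer_totals as
    select * from connection to hadoop
      ( select customer_id, sum(amount) as total
        from orders
        group by customer_id );

  /* Run a HiveQL statement that returns no rows. */
  execute ( drop table if exists orders_stage ) by hadoop;

  disconnect from hadoop;
quit;
```

With the LIBNAME approach, SAS decides how much of the work to pass to Hive. With explicit pass-through, the query text inside the parentheses is written in HiveQL and runs in Hadoop.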
Who should attend
SAS programmers who need to access data in Hadoop from within SAS
Before attending this course, you should be comfortable programming in SAS and Structured Query Language (SQL). You can gain the required SAS programming knowledge from the SAS Programming 1: Essentials course. You can gain the required knowledge of SQL from the SAS SQL 1: Essentials course. A working knowledge of Hadoop is helpful.
This course addresses Base SAS, SAS Data Connect Accelerator for Hadoop, and SAS Data Connector to Hadoop software.
This course addresses Base SAS methods for Hadoop and SAS/ACCESS Interface to Hadoop.