Overview: HADOOP Procedure

PROC HADOOP enables SAS to run Apache Hadoop code against Hadoop data. Apache Hadoop is an open-source technology, written in Java, that provides data storage and distributed processing of large amounts of data.
PROC HADOOP interfaces with the Hadoop JobTracker, the service within Hadoop that assigns tasks to specific nodes in the cluster. PROC HADOOP enables you to submit the following:
  • Hadoop Distributed File System (HDFS) commands
  • MapReduce programs
  • Pig language code
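As an illustration of the first item in the list above, the following minimal sketch submits two HDFS commands through PROC HADOOP. The configuration file path, the user name, the password, and the HDFS directory names are all hypothetical placeholders, not values taken from this document; substitute the values for your own cluster.

```sas
/* Hypothetical path: fileref pointing to a Hadoop configuration file */
filename cfg "/home/sasabc/hadoop/sample_config.xml";

/* Hypothetical credentials; VERBOSE writes additional messages to the SAS log */
proc hadoop options=cfg username="sasabc" password="sasabc" verbose;
   /* Create a directory in HDFS (hypothetical directory name) */
   hdfs mkdir="/user/sasabc/new_directory";
   /* Delete a directory in HDFS (hypothetical directory name) */
   hdfs delete="/user/sasabc/temp_directory";
run;
```

MapReduce programs and Pig language code are submitted from the same procedure step by using the MAPREDUCE and PIG statements in place of, or alongside, the HDFS statement.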