Understanding SPD Server with Hadoop

SPD Server provides a multi-user, high-performance data delivery environment that enables you to interact with Hadoop through the Hadoop Distributed File System (HDFS). Using SPD Server with SAS applications, you can read, write, and update tables in HDFS.
SPD Server provides the following benefits when used with Hadoop:
  • SPD Server organizes data into a streamlined file format that suits a distributed file system such as HDFS. Data is stored separately from the metadata, and the file format partitions the data.
  • SPD Server supports parallel processing. The server reads data stored in HDFS by running multiple threads in parallel.
  • SPD Server provides a multi-user environment.
  • SPD Server is a full 64-bit server that supports up to two billion columns and (for all practical purposes) unlimited rows of data. SPD Server tables are stored on disk in a format that enhances access and supports the large-table requirements of SAS 9.4. SPD Server cluster tables, a design feature unique to SPD Server, further simplify the management of large tables by enabling you to create a virtual table that consists of several SPD Server tables.
  • SPD Server uses access control lists (ACLs) and SPD Server user IDs to secure domain resources.
  • If the Hadoop cluster supports Kerberos, SPD Server honors Kerberos ticket-cache-based logon authentication and authorization, provided that the Hadoop cluster configuration files are accessible.
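
The partitioning and parallel-read benefits listed above can be illustrated with a small conceptual sketch. The Python code below is not SPD Server code and does not use HDFS. It assumes local temporary files stand in for data partitions, and every function name (write_partitions, read_partition, parallel_read, demo) is hypothetical. It only shows the general idea of splitting a table into partition files and reading them with multiple threads.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_partitions(directory, rows, rows_per_partition):
    """Split rows into fixed-size partition files and return their paths in order."""
    paths = []
    for start in range(0, len(rows), rows_per_partition):
        index = start // rows_per_partition
        path = os.path.join(directory, f"part-{index:04d}.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write("\n".join(rows[start:start + rows_per_partition]))
        paths.append(path)
    return paths


def read_partition(path):
    """Read one partition file and return its rows."""
    with open(path, encoding="utf-8") as f:
        return f.read().split("\n")


def parallel_read(paths, max_workers=4):
    """Read all partitions with a thread pool, keeping partition order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the order of the input paths,
        # even though the reads themselves may run concurrently.
        chunks = list(pool.map(read_partition, paths))
    return [row for chunk in chunks for row in chunk]


def demo():
    """Write ten rows as partitions of three rows each, then read them back."""
    rows = [f"row-{i}" for i in range(10)]
    with tempfile.TemporaryDirectory() as tmp:
        paths = write_partitions(tmp, rows, rows_per_partition=3)
        return len(paths), parallel_read(paths)
```

Calling demo() writes the rows as several partition files and reads them back in their original order. Because each partition is an independent file, the threads never contend for the same data, which is the property that makes a partitioned format a natural fit for parallel reads.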