SPD Server provides a multi-user, high-performance data delivery environment that enables you to interact with Hadoop through the Hadoop Distributed File System (HDFS). Using SPD Server with SAS applications, you can read, write, and update tables in HDFS.
SPD Server provides the following benefits when used with Hadoop:
- SPD Server organizes data into a streamlined file format that has advantages for a distributed file system like HDFS. Data is separate from the metadata, and the file format partitions the data.
- SPD Server supports parallel processing. The server reads data stored in HDFS by running multiple threads in parallel.
- SPD Server provides a multi-user environment.
- SPD Server is a full 64-bit server that supports up to two billion columns and (for all practical purposes) unlimited rows of data. SPD Server tables are stored on disk in a format that enhances access and supports the large-table requirements of SAS 9.4. Cluster tables, a design feature unique to SPD Server, make large tables easier to manage by enabling you to create a virtual table that consists of several SPD Server tables.
- SPD Server uses access control lists (ACLs) and SPD Server user IDs to secure domain resources.
- If the Hadoop cluster supports Kerberos, SPD Server honors Kerberos ticket cache-based logon authentication and authorization, as long as the Hadoop cluster configuration files are accessible.