The SAS Data in HDFS engine is used only with the SAS High-Performance Deployment of Hadoop.
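As a sketch, a library reference for the engine might be assigned as follows. The host name, installation path, and HDFS path shown here are placeholder values, not defaults:

```sas
/* Assign a libref with the SAS Data in HDFS (SASHDAT) engine.        */
/* HOST=, INSTALL=, and PATH= values are example placeholders.        */
libname gridlib sashdat
   host="grid001.example.com"   /* head node of the Hadoop cluster        */
   install="/opt/TKGrid"        /* SAS High-Performance software location */
   path="/user/sasdemo";        /* HDFS directory for the SASHDAT files   */

/* Write-only use of the engine: transfer a SAS data set to HDFS */
data gridlib.cars;
   set sashelp.cars;
run;
```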
The engine is designed as a write-only engine for transferring data to HDFS. However, SAS High-Performance Analytics procedures are designed to read data in parallel from a co-located data provider, so the LASR procedure and other procedures such as HPREG and HPLOGISTIC can use the engine to read data from HDFS. The HPDS2 procedure is designed to read and write data in parallel, so it can be used with the engine both to read data from HDFS and to create new tables in HDFS.
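A hedged sketch of the HPDS2 case follows. It assumes a libref (here called gridlib) already assigned with the engine, and uses placeholder host and table names:

```sas
/* Read gridlib.cars from HDFS in parallel and create a new HDFS     */
/* table, gridlib.cars_copy. Names and host values are assumptions.  */
proc hpds2 data=gridlib.cars out=gridlib.cars_copy;
   performance host="grid001.example.com" install="/opt/TKGrid";
   data DS2GTF.out;
      method run();
         set DS2GTF.in;
         /* row-level DS2 logic could be added here */
      end;
   enddata;
run;
```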
Whenever a SAS High-Performance Analytics procedure is used to create data in HDFS, the procedure creates the data with a default block size of 8 megabytes. This default can be overridden with the BLOCKSIZE= data set option.
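For illustration, the option can be applied to the output table when it is created; the 32-megabyte value and the libref below are illustrative assumptions:

```sas
/* Override the 8-megabyte default block size for a new HDFS table. */
/* gridlib is assumed to be a libref assigned with the engine.      */
data gridlib.cars_big(blocksize=32m);
   set sashelp.cars;
run;
```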