By default, the SAS Data in HDFS engine distributes data in 2-megabyte
blocks or the length of a record, whichever is greater.
You can override this value by specifying the block size to use. Suffix
values are B (bytes), K (kilobytes), M (megabytes), and G (gigabytes).
The actual block size is slightly larger than the value that you specify.
This occurs for any of the following reasons:
- to reach the record length. This occurs if the specified size is less than the record length.
- to align on a 512-byte boundary.
- to include a metadata header in HDFS for the SASHDAT file.
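The first two adjustments can be illustrated with a little arithmetic. The following DATA step is a rough sketch only, not the engine's actual calculation. The variable names, the 1000-byte request, and the 1500-byte record length are hypothetical values chosen for illustration. The order of the adjustments is an assumption, and the metadata header is left out because its size is not stated here.

```sas
/* Sketch only: estimates a block size after the record-length and  */
/* 512-byte alignment adjustments. The metadata header is omitted   */
/* because its size is not documented in this section.              */
data _null_;
   requested  = 1000;   /* hypothetical request, as in blocksize=1000B */
   record_len = 1500;   /* hypothetical record length, in bytes        */

   /* Assumed step 1: grow to the record length if the request is smaller. */
   size = max(requested, record_len);

   /* Assumed step 2: round up to the next multiple of 512 bytes. */
   aligned = ceil(size / 512) * 512;

   put requested= record_len= aligned=;
run;
```

With these values, the request is first raised to 1500 bytes and then rounded up to 1536 bytes, the next 512-byte boundary.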
The following code shows an example of specifying the BLOCKSIZE= option.
Using the BLOCKSIZE= Data Set Option
data hdfs.sales (blocksize=48M);
   set yr2012.sales;
run;