Partitioning is available only when you add tables to HDFS. If you
partition the table when you add it to HDFS, it becomes a partitioned
in-memory table when you load it into SAS LASR Analytic Server. If you
also specify the ORDERBY= option, then the ordering is also preserved
when the table is loaded into memory.

Partition keys are derived from the formatted values of the variables,
taken in the order in which the variable names appear in the
variable-list. All of the rows with the same partition key are stored
in a single block. This ensures that all the data for a partition is
loaded into memory on a single machine in the cluster. The blocks are
replicated according to the default replication factor or the value
that you specify for the COPIES= option.
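
The grouping described above can be modeled with a short Python sketch. It is a conceptual illustration only, not SAS code and not the server's implementation. The function names, the sample rows, and the `age_band` format are hypothetical. The point it shows is that rows are grouped by their formatted values, so two rows with different raw values can share a partition.

```python
from collections import defaultdict

def partition_key(row, variables, formats):
    """Build a key from the formatted values, in variable-list order.

    Variables without an entry in `formats` fall back to str(), which
    stands in for a default format (an assumption of this sketch).
    """
    return tuple(formats.get(v, str)(row[v]) for v in variables)

def group_into_partitions(rows, variables, formats):
    """Collect all rows that share a partition key into one group."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_key(row, variables, formats)].append(row)
    return dict(partitions)

# Hypothetical user-defined format: maps a raw age to an age band.
def age_band(age):
    return "minor" if age < 18 else "adult"

rows = [
    {"name": "Ann", "age": 12},
    {"name": "Bob", "age": 15},
    {"name": "Cy", "age": 40},
]

# Ann and Bob have different raw ages but the same formatted value,
# so they land in the same partition.
partitions = group_into_partitions(rows, ["age"], {"age": age_band})
```

In this model each dictionary entry corresponds to one partition, which the text above says is stored in a single block and loaded on a single machine.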

If user-defined formats are used, then the format name is stored with
the table, but the format definition is not. The format for the
variable must be available to the SAS LASR Analytic Server when the
table is loaded into memory. You can do this by making the format
available in the format catalog search path for the SAS session.

Be aware that the key construction is not hierarchical. That is,
PARTITION=(A B) specifies that each unique combination of formatted
values for variables A and B defines a partition.
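
As a rough illustration of the flat key construction, the following Python sketch lists the distinct keys for a small hypothetical table. It is not SAS code. The function name and sample data are assumptions, and `str()` stands in for whatever format applies to each variable.

```python
def distinct_partition_keys(rows, variables):
    """Return the distinct key tuples in first-seen order."""
    seen = []
    for row in rows:
        # str() is a stand-in for the variable's format in this sketch.
        key = tuple(str(row[v]) for v in variables)
        if key not in seen:
            seen.append(key)
    return seen

rows = [
    {"A": "x", "B": 1},
    {"A": "x", "B": 2},
    {"A": "y", "B": 1},
    {"A": "x", "B": 1},  # repeats the first combination
]

# Models PARTITION=(A B): one partition per distinct (A, B) pair.
keys_ab = distinct_partition_keys(rows, ["A", "B"])
```

Here the three distinct pairs give three partitions. There is no parent partition for A = "x" that contains the B values beneath it.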

Partitioning by a variable that does not exist in the output table is
an error. Partitioning by a variable listed in the ORDERBY= option is
also an error.
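
The two error conditions can be expressed as a small pre-flight check. The Python sketch below is a hypothetical helper, not part of any SAS interface, and the error messages are invented for illustration. It compares names without regard to case, because SAS variable names are case-insensitive.

```python
def validate_partition(partition_vars, orderby_vars, table_vars):
    """Raise ValueError if the partition variable list breaks either rule."""
    # SAS variable names are case-insensitive, so normalize before comparing.
    table = {v.upper() for v in table_vars}
    orderby = {v.upper() for v in orderby_vars}

    missing = [v for v in partition_vars if v.upper() not in table]
    if missing:
        raise ValueError(f"not in output table: {missing}")

    overlap = [v for v in partition_vars if v.upper() in orderby]
    if overlap:
        raise ValueError(f"also listed in ORDERBY=: {overlap}")

# Passes: both partition variables exist and neither is an ordering variable.
validate_partition(["region", "year"], ["month"],
                   ["Region", "Year", "Month", "Sales"])
```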