SAS Data Loader provides
a script for deploying your QKB on the Hadoop cluster. Before you
can run this script, you must copy your QKB to the Hadoop cluster,
either by transferring the QKB directory structure to the Hadoop
master node (via FTP, for example) or by mounting the file system
where the QKB is located on the Hadoop master node.
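The transfer step can be sketched with standard tools. The paths, archive name, and host name below are placeholders for illustration, not values from the qkbpush.sh documentation; archiving the directory and recording a checksum is simply one way to confirm that a large QKB arrives intact:

```shell
# Stand-in QKB directory so the example runs end to end; replace with
# the real location of your QKB directory structure.
QKB_SRC=/tmp/qkb_demo/CI31
mkdir -p "$QKB_SRC"
echo "sample definition data" > "$QKB_SRC/sample.qkb"

# Package the directory and record a checksum to verify after transfer.
ARCHIVE=/tmp/qkb.tar.gz
tar -czf "$ARCHIVE" -C "$(dirname "$QKB_SRC")" "$(basename "$QKB_SRC")"
md5sum "$ARCHIVE"

# On a real cluster, copy the archive to the master node and unpack it
# there (hadoop-master is a hypothetical host name):
#   scp "$ARCHIVE" hadoop-master:/tmp/
#   ssh hadoop-master 'md5sum /tmp/qkb.tar.gz && tar -xzf /tmp/qkb.tar.gz -C /tmp/'
```

Mounting the file system on the master node, as described above, avoids the copy entirely.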
It is recommended that
you run the script, qkbpush.sh, on the Hadoop master node (NameNode).
By default, the script automatically discovers all nodes in the cluster
and deploys the QKB on each of them. Flags are available to enable
you to deploy the QKB on an individual node or on a subset of nodes
instead.
The qkbpush.sh script
performs two tasks:
-
It copies the specified QKB directory
to a fixed location (/opt/qkb/default)
on the specified nodes and sets the QKB's permissions so that
the QKB is owned by the user account under which the SAS Embedded
Process runs.
-
It generates an index file from
the contents of the QKB and pushes this index file to HDFS. The index
file, named default.idx, is created in the /sas/qkb directory
in HDFS. The default.idx file provides a list of QKB definition and
token names to SAS Data Loader, which surfaces those names
in its graphical user interface.
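After the script has run, the result can be spot-checked from any node with a Hadoop client. This is an illustrative sketch: the `hdfs dfs` subcommands are part of the stock Hadoop CLI, and the /sas/qkb path and default.idx name come from the description above:

```shell
# Check for the deployed QKB index file in HDFS. On a machine without
# a Hadoop client, the check is skipped and reported instead.
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -ls /sas/qkb                  # expect a single default.idx entry
  if hdfs dfs -test -e /sas/qkb/default.idx; then
    QKB_INDEX_STATE="present"
  else
    QKB_INDEX_STATE="missing"
  fi
else
  QKB_INDEX_STATE="no-hdfs-client"       # not on a cluster node
fi
echo "$QKB_INDEX_STATE"
```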
Only one QKB and one
index file are supported in the Hadoop framework at a time. Subsequent
QKB and index pushes replace prior ones.
After the QKB deployment
is complete, you must restart the SAS Embedded Process on each Hadoop
node so that each instance of the SAS Embedded Process loads the newly
deployed QKB. Use the sasep-servers.sh script to restart the SAS Embedded
Process. For information about the sasep-servers.sh script, see the
information for Hadoop in the
SAS In-Database Products: Administrator's Guide.