If the grid host is a cluster that houses data that have been distributed by using the SASHDAT engine, then high-performance analytical procedures can analyze those data in the alongside-HDFS mode. The procedures use the distributed computing environment in which an analytic process is collocated with the nodes of the cluster. Data then pass from HDFS to the analytic process on each node of the cluster.
Before you can run a procedure alongside HDFS, you must distribute the data to the cluster. The following statements use the SASHDAT engine to distribute to HDFS the simData data set that was used in the previous two sections:

    option set=GRIDHOST="hpa.sas.com";
    libname hdatLib sashdat path="/hps";
    data hdatLib.simData (replace = yes);
       set simData;
    run;

In this example, the GRIDHOST is a cluster where the SAS Data in HDFS Engine is installed. If a data set that is named simData already exists in the hps directory in HDFS, it is overwritten because the REPLACE=YES data set option is specified. For more information about using this LIBNAME statement, see the section "LIBNAME Statement for the SAS Data in HDFS Engine" in the SAS LASR Analytic Server: Reference Guide.
The following HPLOGISTIC procedure statements perform the analysis in alongside-HDFS mode. These statements are almost identical to the PROC HPLOGISTIC example in the previous two sections, which executed in single-machine mode and alongside-the-database distributed mode, respectively.
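A minimal sketch of such statements, assuming that the response variable is named y (a placeholder; the response name is not given in this section), that a, b, and c are classification variables, and that x1, x2, and x3 are continuous effects, as the parameter estimates in Figure 3.11 suggest:

```sas
/* Sketch only: the response name y is an assumption.               */
/* hdatLib is the SASHDAT libref that the LIBNAME statement defines. */
proc hplogistic data=hdatLib.simData;
   class a b c;                   /* classification effects           */
   model y = a b c x1 x2 x3;      /* y is a hypothetical response name */
run;
```

In this sketch, the only change from a single-machine run is the DATA= option, which points to the hdatLib libref instead of a local data set.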
Figure 3.10 shows the "Performance Information" and "Data Access Information" tables. You see that the procedure ran in distributed mode and that the input data were read in parallel symmetric mode. The numeric results shown in Figure 3.11 agree with the previous analyses shown in Figure 3.1, Figure 3.2, and Figure 3.5.
Figure 3.10: Alongside-HDFS Execution Performance Information
Figure 3.11: Alongside-HDFS Execution Model Information
Parameter Estimates

Parameter | Estimate | Standard Error | DF | t Value | Pr > \|t\| |
---|---|---|---|---|---|
Intercept | 5.7011 | 0.2539 | Infty | 22.45 | <.0001 |
a 0 | -0.01020 | 0.06627 | Infty | -0.15 | 0.8777 |
a 1 | 0 | . | . | . | . |
b 0 | 0.7124 | 0.06558 | Infty | 10.86 | <.0001 |
b 1 | 0 | . | . | . | . |
c 0 | 0.8036 | 0.06456 | Infty | 12.45 | <.0001 |
c 1 | 0 | . | . | . | . |
x1 | 0.01975 | 0.000614 | Infty | 32.15 | <.0001 |
x2 | -0.04728 | 0.003098 | Infty | -15.26 | <.0001 |
x3 | -0.1017 | 0.009470 | Infty | -10.74 | <.0001 |