Distributed Server: Parallel Load

Introduction

SAS Visual Analytics loads data in parallel whenever possible. This topic outlines the parallel load methods that SAS Visual Analytics can support.
Note: Not all methods and providers are configured and available in all deployments. See the SAS Visual Analytics: Installation and Configuration Guide (Distributed SAS LASR).

Method: Co-located Storage

Topology:
The storage and analytics nodes must be on the same machines.
Provider:
Pattern:
Symmetric. There must be a one-to-one mapping between storage and analytics nodes.
SASHDAT:
In co-located HDFS, data is staged in SASHDAT format.
Server Tag:
The HDFS source path in dot-delimited format or the legacy libref.
Usage:
See Administrator Load or use the data builder.
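
As an illustration of the dot-delimited format mentioned above, the following Python sketch converts a slash-separated HDFS directory path into a dot-delimited string. The helper name to_dot_delimited and the example path /hps/sales are hypothetical. The sketch assumes that the dot-delimited form is simply the path components joined by periods; this topic does not spell out the exact rules, so treat it as an approximation rather than the product's behavior.

```python
def to_dot_delimited(hdfs_path: str) -> str:
    """Convert a slash-separated directory path to dot-delimited form.

    Hypothetical helper for illustration only. Assumes the dot-delimited
    form is the path components joined by periods.
    """
    # Drop empty components produced by leading, trailing, or doubled slashes.
    parts = [part for part in hdfs_path.split("/") if part]
    if not parts:
        raise ValueError("path must contain at least one directory name")
    return ".".join(parts)


# Example with a hypothetical HDFS directory.
print(to_dot_delimited("/hps/sales"))
```

For the hypothetical path /hps/sales, the sketch prints hps.sales.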

Method: NFS-Mounted Storage

Topology:
The storage cluster can be separate from the analytics cluster.¹
Provider:
MapR. See MapR Distribution for Apache Hadoop in the SAS LASR Analytic Server: Reference Guide.
Pattern:
Asymmetric. One-to-one mapping between storage and analytics nodes is not required.
SASHDAT:
Data is staged in SASHDAT format.
Server Tag:
The NFS source path in dot-delimited format.
Usage:
See Administrator Load or use the data builder.
¹ Regardless of topology, the SAS LASR Analytic Server accesses the data as if it were co-located.
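
The difference between the symmetric pattern of co-located storage and the asymmetric pattern described for this method can be sketched as a comparison of host lists. The function name classify_pattern and all host names are hypothetical. The check is a simplification that treats a one-to-one mapping as "storage and analytics run on the same set of machines", in line with the co-located topology requirement.

```python
def classify_pattern(storage_hosts, analytics_hosts):
    """Classify a deployment as 'symmetric' or 'asymmetric'.

    Simplifying assumption: the mapping is one-to-one exactly when the
    storage nodes and analytics nodes are the same set of machines.
    """
    storage = set(storage_hosts)
    analytics = set(analytics_hosts)
    return "symmetric" if storage == analytics else "asymmetric"


# Co-located storage: every machine is both a storage and an analytics node.
print(classify_pattern(["node1", "node2"], ["node1", "node2"]))

# NFS-mounted storage: a separate storage cluster of a different size.
print(classify_pattern(["stor1", "stor2", "stor3"], ["node1", "node2"]))
```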

Method: SAS Embedded Process

Topology:
The storage cluster can be separate from the analytics cluster.
Provider:
Various.²
Pattern:
Asymmetric. One-to-one mapping between storage and analytics nodes is not required.
SASHDAT:
Data is not staged in SASHDAT format.
Server Tag:
Any valid libref.
Usage:
See Administrator Load, use the data builder, or use an import action.¹
¹ The load is parallel if embedded processing is available, the LASR table name matches the source table name, and the server tag is valid as a SAS libref.
² See the SAS High-Performance Analytics Infrastructure: Installation and Configuration Guide.
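
The three conditions for a parallel load with SAS Embedded Process can be expressed as a small predicate. The following Python sketch is illustrative only: the function names and arguments are hypothetical, and the case-insensitive comparison of table names is an assumption. The libref check follows the standard SAS naming rule (1 to 8 characters, beginning with a letter or underscore, followed by letters, digits, or underscores).

```python
import re

# SAS libref rule: 1 to 8 characters, starting with a letter or underscore,
# followed by letters, digits, or underscores.
_LIBREF_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,7}")


def is_valid_libref(name: str) -> bool:
    """Return True if name satisfies the SAS libref naming rule."""
    return _LIBREF_PATTERN.fullmatch(name) is not None


def is_parallel_load(ep_available: bool, lasr_table: str,
                     source_table: str, server_tag: str) -> bool:
    """Hypothetical check of the three parallel-load conditions."""
    # Assumption: table names are compared without regard to case.
    names_match = lasr_table.upper() == source_table.upper()
    return bool(ep_available) and names_match and is_valid_libref(server_tag)


# A tag such as "hdplib" is a valid libref; a dot-delimited tag is not.
print(is_parallel_load(True, "SALES", "sales", "hdplib"))
print(is_parallel_load(True, "SALES", "sales", "hps.sales"))
```

Under these assumptions, the first call reports a parallel load and the second does not, because a server tag that contains a period is not valid as a libref.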

Example Depictions

The following figures depict staging to and loading from co-located HDFS:
Figure: Stage to Co-located Storage (add tableA to co-located HDFS)
Figure: Load from Co-located Storage (load tableA from co-located HDFS)
For NFS-mounted MapR, the stage and load processes are similar to the preceding example, except as follows:
  • The storage and analytics clusters can be separate.
  • The metadata objects would have different names.
The following figure depicts an import action that uses SAS Embedded Process:
Figure: Import Using SAS Embedded Process (parallel import using SAS Embedded Process)
Last updated: December 18, 2018