High-Performance Features of the OPTGRAPH Procedure


Alongside-the-Database Distributed Mode

Distributed mode is a computing mode in which several nodes in a distributed computing environment participate in the computations. In distributed mode, the OPTGRAPH procedure performs analytics on the database management system (DBMS) appliance. The OPTGRAPH procedure in SAS High-Performance Network Algorithms supports the alongside-the-database model of distributed execution, in which the data are stored in the distributed database and read in parallel from the DBMS.

When the input data are stored in the DBMS and the grid host is the appliance that houses the data, the OPTGRAPH procedure creates a distributed computing environment in which the analytic process is co-located with the nodes of the DBMS. PROC OPTGRAPH then passes data from the DBMS to the analytic process on each node. Instead of moving the data across the network and possibly back to the client machine, PROC OPTGRAPH passes the data locally between the processes on each node of the appliance.

Because the analytic processes on the appliance are separate from the database processes, the technique is referred to as alongside-the-database execution, in contrast to in-database execution, where the analytic code executes within the database process.

Before you can run PROC OPTGRAPH alongside the database, you must distribute the data to the appliance. This step is described in the section "Distributing Input Data to the Appliance."

In the alongside-the-database model, the number of compute nodes is determined by the layout of the database and cannot be modified. Some SAS high-performance procedures support a NODES= option in the PERFORMANCE statement to control the number of compute nodes used, but this option is valid only when the procedure passes data from the client to the appliance. Therefore, if you specify the NODES= option in the PERFORMANCE statement in distributed mode, PROC OPTGRAPH ignores it.
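
The following statements are a minimal sketch of how an alongside-the-database run might be set up. They are an illustration under stated assumptions and not a definitive configuration. The sketch assumes a Greenplum appliance accessed through the SAS/ACCESS GREENPLM engine. The host name, installation path, credentials, libref (GPLib), and table names (LinkSetIn, NodeSetOut) are hypothetical placeholders. The choice of eigenvector centrality is only an example, and you should confirm in the algorithm documentation that the algorithm you need is supported in distributed mode.

```sas
/* Hypothetical connection details: replace the server, credentials, */
/* and database with the values for your own appliance.              */
libname GPLib greenplm server=gpdca user=XXX password=YYY database=ZZZ;

/* Identify the appliance as the grid host (assumed host and path).  */
option set=GRIDHOST="gpdca.example.com";
option set=GRIDINSTALLLOC="/opt/TKGrid";

/* The links data set is read from the DBMS through the GPLib libref. */
/* No NODES= option is specified, because the node count follows the  */
/* layout of the database and a NODES= value would be ignored.        */
proc optgraph
   data_links = GPLib.LinkSetIn
   out_nodes  = GPLib.NodeSetOut;
   performance details;
   centrality
      eigen = unweight;
run;
```

In this sketch, the DETAILS option in the PERFORMANCE statement requests a timing breakdown, which can help you confirm that the procedure ran in distributed mode rather than on the client.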