The HPDS2 Procedure

Overview: HPDS2 Procedure

The HPDS2 procedure enables you to submit DS2 language statements from a Base SAS session to one or more machines in a grid for parallel execution. PROC HPDS2 verifies the syntactic correctness of the DS2 source on the client machine before submitting it for execution. The output data created by the DS2 DATA statement can be handled in either of two ways: it can be written in parallel back to the grid data store, or it can be returned to the client machine and directed to any data store that is supported by SAS.
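
The workflow described above can be sketched as follows. This is a minimal illustration, not a definitive program: the table names work.temps and work.temps_f and the columns celsius and fahrenheit are hypothetical, and DS2GTF.in and DS2GTF.out are assumed here to be the placeholder names through which the DS2 code refers to the tables named in the DATA= and OUT= options.

```sas
/* Hypothetical input table: one row per reading, in Celsius. */
data work.temps;
   do id = 1 to 5;
      celsius = id * 10;
      output;
   end;
run;

/* Row-independent conversion submitted through PROC HPDS2.     */
/* DS2GTF.in and DS2GTF.out are assumed to stand for the tables */
/* given in DATA= and OUT=.                                     */
proc hpds2 data=work.temps out=work.temps_f;
   data DS2GTF.out;
      dcl double fahrenheit;
      method run();
         set DS2GTF.in;
         fahrenheit = (celsius * 9 / 5) + 32;
      end;
   enddata;
run;
```

Because each output row depends only on the corresponding input row, the same DS2 code gives the same per-row result regardless of how the rows are partitioned.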

Because the DS2 code is executed in parallel on separate grid nodes, each of which holds a single data partition, each node produces separate output that reflects only its local data partition. As a result, a second-stage program might be needed to aggregate the results from each node. The second stage can be executed on the SAS client by using the DS2 procedure, where the SET statement reads all rows created by the preceding parallel stage.
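
A second-stage aggregation of this kind might look like the following sketch. It assumes that the parallel stage has already produced one row per node that contains a partial sum and a row count; the table work.partial_sums and the columns partial_sum and partial_n are hypothetical, and a small DATA step stands in for the parallel stage so that the example is self-contained.

```sas
/* Stand-in for the output of the parallel stage:   */
/* one row per node (hypothetical values).          */
data work.partial_sums;
   input partial_sum partial_n;
   datalines;
120 4
75 3
210 6
;
run;

/* Second stage on the client: combine the per-node rows. */
proc ds2;
   data work.overall_stats (overwrite=yes);
      dcl double total_sum;
      dcl double total_n;
      dcl double overall_mean;
      retain total_sum total_n;   /* carry totals across rows */
      keep overall_mean;
      method init();
         total_sum = 0;
         total_n = 0;
      end;
      method run();
         set work.partial_sums;   /* reads every per-node row */
         total_sum = total_sum + partial_sum;
         total_n = total_n + partial_n;
      end;
      method term();
         if total_n > 0 then overall_mean = total_sum / total_n;
         output;                  /* write one aggregated row */
      end;
   enddata;
run;
quit;
```

Carrying sums and counts rather than per-node means is a design choice: averaging the per-node means directly would give a wrong result when the partitions differ in size.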

The syntax of DS2 is similar to that of the DATA step, but it does not include several key statements such as INPUT and MERGE. In addition, using DS2 along with SAS high-performance analytical procedures limits the PROC DS2 SET statement to a single input stream. The use of BY processing within the SET statement is also not supported. Therefore, many of the traditional DATA step data preparation features are not available in the HPDS2 procedure. PROC HPDS2 is most useful when significant amounts of computationally intensive, row-independent logic must be applied to the data.

For more information about the DS2 language, see SAS DS2 Language Reference, which is available at http://support.sas.com/documentation/solutions/ds2/DS2Ref.pdf.

PROC HPDS2 runs in either single-machine mode or distributed mode.

Note: Distributed mode requires SAS High-Performance Server Distributed Mode.
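
As a rough illustration of how the two modes might be selected, the following sketch uses the PERFORMANCE statement of SAS high-performance procedures. The host name, installation path, libref gridlib, and table names are all hypothetical, and the first step assumes that no grid host is configured through environment variables or system options.

```sas
/* Hypothetical client-side table. */
data work.temps;
   do id = 1 to 5;
      celsius = id * 10;
      output;
   end;
run;

/* Single-machine mode (assumed when no grid host is configured); */
/* NTHREADS= sets the number of threads used on the client.       */
proc hpds2 data=work.temps out=work.temps_f;
   performance nthreads=4;
   data DS2GTF.out;
      dcl double fahrenheit;
      method run();
         set DS2GTF.in;
         fahrenheit = (celsius * 9 / 5) + 32;
      end;
   enddata;
run;

/* Distributed mode: host and install location are placeholders,  */
/* and gridlib is a hypothetical libref that must already point   */
/* at the grid data store.                                        */
proc hpds2 data=gridlib.temps out=gridlib.temps_f;
   performance host="grid001.example.com" install="/opt/TKGrid" nodes=all;
   data DS2GTF.out;
      dcl double fahrenheit;
      method run();
         set DS2GTF.in;
         fahrenheit = (celsius * 9 / 5) + 32;
      end;
   enddata;
run;
```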