The HPIMPUTE Procedure

Single-Machine and Distributed Execution Modes

The HPIMPUTE procedure can exploit computer grids by imputing independently on different grid nodes in parallel, and it supports multithreading on each node. For more information about single-machine and distributed execution modes, see the section Processing Modes in Chapter 2: Shared Concepts and Topics.

You can control both the number of parallel threads per execution node and the number of computing nodes to engage.
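
As a minimal sketch, both settings can be given in the PERFORMANCE statement through its NTHREADS= and NODES= options. The example assumes that the SAMPSIO.HMEQ sample data set is available; the output data set name work.imputed is illustrative. NODES=0 requests single-machine mode, so the step does not need a grid.

```sas
/* Sketch: impute two variables with four threads on the local machine.  */
/* work.imputed is an illustrative output name chosen for this example.  */
proc hpimpute data=sampsio.hmeq out=work.imputed;
   input mortdue value;                     /* variables to impute             */
   impute mortdue / method=mean;            /* fill missing values with mean   */
   impute value   / method=pmedian;         /* fill with the pseudomedian      */
   performance nodes=0 nthreads=4 details;  /* single machine, four threads    */
run;
```

Setting NODES= to a positive number instead requests distributed execution, provided that a grid host is configured for the SAS session.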

In distributed mode, PROC HPIMPUTE runs on a grid of distributed computers, and one or more copies of the imputation code are executed in parallel on each grid node.

The distributed mode of execution has two variations:

  • In the client-data (local-data) model of distributed execution, the input data are not stored on the appliance but are sent to the nodes of the distributed computing environment during execution of the HPIMPUTE procedure.

  • In the alongside-the-database model of distributed execution, the data source is the database on the appliance. The data are stored in the distributed database, and the imputation code that runs on each node can read and write the data in parallel during execution of the HPIMPUTE procedure. Instead of being moved across the network and possibly back to the client machine, data are passed locally between the processes on each node of the appliance. In general and especially with large data sets, the best PROC HPIMPUTE performance can be achieved if execution is alongside the database.
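
The client-data model might be requested as in the following sketch. The host name grid.example.com, the data set names, and the variable names x1 and x2 are hypothetical. The example also assumes that the grid installation location is already configured for the session, for example through the GRIDINSTALLLOC environment variable.

```sas
/* Sketch: client-data model. The input is a local SAS data set, so its  */
/* rows are sent to the grid nodes when the procedure runs.              */
/* Host, data set, and variable names are hypothetical.                  */
proc hpimpute data=work.mydata out=work.imputed;
   input x1 x2;                      /* variables to impute              */
   impute x1 / method=mean;
   impute x2 / method=mean;
   performance host="grid.example.com" nodes=4 details;
run;
```

For the alongside-the-database model, the DATA= option would instead refer to a table through a libref that is assigned to the database on the appliance, so that each node reads its own portion of the data locally.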