The HPCANDISC Procedure

Example 5.2 Performing Canonical Discriminant Analysis in Single-Machine and Distributed Modes

PROC HPCANDISC is most useful when the computation runs on multiple threads or in a distributed environment.

This example shows how you can run PROC HPCANDISC in single-machine and distributed modes, and in particular how you switch between the two. For more information about the execution modes of SAS high-performance analytics procedures, see the section Processing Modes. The following DATA step generates the data:

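data ex2Data;
   drop i j n n1 n2 n3 n4;

   /* Cumulative cutoffs that split the n observations into five classes */
   n  = 5000000;
   n1 = n*0.1;
   n2 = n*0.25;
   n3 = n*0.45;
   n4 = n*0.7;

   array x{100};

   /* Without an explicit length, z takes length 5 from the first  */
   /* assignment ('small') and longer values would be truncated.   */
   length z $ 7;

   do i=1 to n;
      /* 100 uniform random analysis variables, x1-x100 */
      do j=1 to dim(x);
         x{j} = ranuni(1);
      end;

      if      i <= n1 then z='small';
      else if i <= n2 then z='medium';
      else if i <= n3 then z='big';
      else if i <= n4 then z='verybig';
      else                 z='huge';

      output;
   end;
run;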
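
As a quick sanity check on the generated data, the following sketch tabulates the classification variable. It is not part of the original example. Based on the cutoffs in the DATA step, the five levels of z should account for 10%, 15%, 20%, 25%, and 30% of the 5,000,000 observations, in the order in which the DATA step assigns them.

```sas
/* Illustrative check only: frequency of each level of the class variable z */
proc freq data=ex2Data;
   tables z;
run;
```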

The following statements use PROC HPCANDISC to perform a canonical discriminant analysis and to write various statistics to the stats data set (OUTSTAT=stats):

proc hpcandisc data=ex2Data outstat=stats;
   var x:;
   class z;
   performance details;
run;

Output 5.2.1 shows the "Performance Information" table. This table shows that the HPCANDISC procedure executes in single-machine mode on four threads, because the client machine has four CPUs. To control the number of threads that are used on a machine, specify the NTHREADS= option in the PERFORMANCE statement.

Output 5.2.1: Performance Information in Single-Machine Mode

The HPCANDISC Procedure

Performance Information
Execution Mode	Single-Machine
Number of Threads	4
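
The NTHREADS= option mentioned above could be used as in the following sketch. The value 2 is an arbitrary choice for illustration; with this setting, the "Performance Information" table would be expected to report two threads instead of four.

```sas
/* Sketch: same analysis, but request two threads (arbitrary example value) */
proc hpcandisc data=ex2Data outstat=stats;
   var x:;
   class z;
   performance details nthreads=2;
run;
```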

Output 5.2.2 shows timing information for the PROC HPCANDISC run. This table is produced when you specify the DETAILS option in the PERFORMANCE statement. You can see that, in this case, the majority of time is spent reading, levelizing, and processing the data.

Output 5.2.2: Timing in Single-Machine Mode

Procedure Task Timing
Task	Seconds	Percent
Reading, Levelizing, and Processing Data	73.15	99.26%
Computing SSCP and Covariance Matrices	0.00	0.00%
Performing Canonical Analysis	0.48	0.65%
Producing Output Statistics Data Set	0.07	0.09%

To run PROC HPCANDISC in distributed mode, specify valid values for the NODES=, INSTALL=, and HOST= options in the PERFORMANCE statement. Instead of specifying the INSTALL= and HOST= options in the PERFORMANCE statement, you can use OPTION SET= statements to set appropriate values for the GRIDINSTALLLOC and GRIDHOST environment variables. For information about setting these options or environment variables, see the section Processing Modes.
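
The environment-variable alternative might look like the following sketch. The host name and installation path are placeholders, not real values; replace them with the values for your own grid.

```sas
/* Placeholder values: substitute your own grid host and install location */
option set=GRIDHOST="gridhost.example.com";
option set=GRIDINSTALLLOC="/opt/TKGrid";

proc hpcandisc data=ex2Data outstat=stats;
   var x:;
   class z;
   /* HOST= and INSTALL= are omitted here; the environment variables supply them */
   performance details nodes=4;
run;
```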

The following statements provide an example. To run these statements successfully, you need to set the macro variables GRIDHOST and GRIDINSTALLLOC to resolve to appropriate values, or you can replace the references to macro variables with appropriate values.

proc hpcandisc data=ex2Data outstat=stats;
   var x:;
   class z;
   performance details nodes = 4
               host="&GRIDHOST" install="&GRIDINSTALLLOC";
run;
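
One way to define the macro variables that the preceding statements reference is with %LET statements submitted before the procedure runs. The values in this sketch are placeholders only; no quotation marks are needed, because the PERFORMANCE statement already encloses the references in double quotation marks.

```sas
/* Placeholder values: substitute your own grid host and install location */
%let GRIDHOST       = gridhost.example.com;
%let GRIDINSTALLLOC = /opt/TKGrid;
```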

The execution mode in the "Performance Information" table shown in Output 5.2.3 indicates that the calculations were performed in a distributed environment that uses four nodes, each of which uses 32 threads.

Output 5.2.3: Performance Information in Distributed Mode

Performance Information
Host Node	<< your grid host >>
Install Location	<< your grid install location >>
Execution Mode	Distributed
Number of Compute Nodes	4
Number of Threads per Node	32

Another indication of distributed execution is the following message, which all high-performance analytics procedures write to the SAS log:

NOTE: The HPCANDISC procedure is executing in the distributed
      computing environment with 4 worker nodes.

Output 5.2.4 shows timing information for this distributed run of the HPCANDISC procedure. In single-machine mode, reading, levelizing, and processing the data dominated the run at 73.15 seconds. In the distributed run that task takes only 3.12 seconds, and the majority of the time (61.53 seconds, or 94.60%) is spent distributing the data from the client to the nodes.

Output 5.2.4: Timing in Distributed Mode

Procedure Task Timing
Task	Seconds	Percent
Obtaining Settings	0.00	0.00%
Distributing Data	61.53	94.60%
Reading, Levelizing, and Processing Data	3.12	4.79%
Computing SSCP and Covariance Matrices	0.00	0.00%
Performing Canonical Analysis	0.00	0.01%
Producing Output Statistics Data Set	0.15	0.23%
Waiting on Client	0.24	0.37%
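
To compare the two runs, you can total the task times reported in Output 5.2.2 and Output 5.2.4. The following sketch only repeats that arithmetic; the variable names are illustrative. The totals come to about 73.70 seconds in single-machine mode and 65.04 seconds in distributed mode, and the read-and-process task itself is roughly 23 times faster in distributed mode. In this example, the cost of distributing the data offsets most of that gain.

```sas
/* Sum the task times copied from Output 5.2.2 and Output 5.2.4 */
data _null_;
   single_total = sum(73.15, 0.00, 0.48, 0.07);
   dist_total   = sum(0.00, 61.53, 3.12, 0.00, 0.00, 0.15, 0.24);
   /* Speedup of the read, levelize, and process task alone */
   read_ratio   = 73.15 / 3.12;
   put single_total= 8.2 dist_total= 8.2 read_ratio= 8.1;
run;
```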