PROC HPCANDISC shows its real power when the computation is conducted using multiple threads or in a distributed environment.
This example shows how you can run PROC HPCANDISC in single-machine and distributed modes and how to switch between them. For more information about the execution modes of SAS high-performance analytics procedures, see the section Processing Modes in Chapter 3: Shared Concepts and Topics. The following DATA step generates the data:
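    data ex2Data;
       /* Declare z wide enough for 'verybig'; otherwise its length is  */
       /* fixed at 5 by the first assignment ('small') and longer       */
       /* labels are silently truncated.                                */
       length z $ 7;
       drop i j n n1 n2 n3 n4;
       n  = 5000000;
       /* Cumulative cutoffs that define the five class levels */
       n1 = n*0.1;
       n2 = n*0.25;
       n3 = n*0.45;
       n4 = n*0.7;
       array x{20};
       do i=1 to n;
          /* Fill x1-x20 with uniform random numbers */
          do j=1 to dim(x);
             x{j} = ranuni(1);
          end;
          if i <= n1 then z='small';
          else if i <= n2 then z='medium';
          else if i <= n3 then z='big';
          else if i <= n4 then z='verybig';
          else z='huge';
          output;
       end;
    run;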
The following statements use PROC HPCANDISC to perform a canonical discriminant analysis and to output various statistics to the stats data set (OUTSTAT=stats):
    proc hpcandisc data=ex2Data outstat=stats;
       var x:;
       class z;
       performance details;
    run;
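To confirm what was written to the stats data set, you could inspect it with standard Base SAS procedures. The following is a minimal sketch, not part of the original example; the OBS=10 limit is an arbitrary choice, and the exact variables in the data set depend on the analysis.

```sas
/* List the variables in the OUTSTAT= data set */
proc contents data=stats;
run;

/* Print the first 10 observations; OBS=10 is an arbitrary choice */
proc print data=stats(obs=10);
run;
```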
Output 5.2.1 shows the “Performance Information” table. This table shows that the HPCANDISC procedure executes in single-machine mode on four threads, because the client machine has four CPUs. You can force a certain number of threads on any machine to be involved in the computations by specifying the NTHREADS option in the PERFORMANCE statement.
Output 5.2.1: Performance Information in Single-Machine Mode
Performance Information | |
---|---|
Execution Mode | Single-Machine |
Number of Threads | 4 |
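As a sketch of the NTHREADS option mentioned above, the following statements request two threads instead of the default. The value 2 is only an illustration; choose a value that suits your machine.

```sas
proc hpcandisc data=ex2Data outstat=stats;
   var x:;
   class z;
   /* NTHREADS=2 is an arbitrary illustrative value */
   performance details nthreads=2;
run;
```

With this setting, the "Performance Information" table would be expected to report the requested thread count rather than the number of CPUs on the client machine.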
Output 5.2.2 shows timing information for the PROC HPCANDISC run. This table is produced when you specify the DETAILS option in the PERFORMANCE statement. You can see that, in this case, the majority of time is spent reading, levelizing, and processing the data.
Output 5.2.2: Timing in Single-Machine Mode
Procedure Task Timing

Task | Seconds | Percent |
---|---|---|
Reading, Levelizing, and Processing Data | 62.78 | 99.85% |
Performing Canonical Analysis | 0.06 | 0.10% |
Producing Output Statistics Data Set | 0.03 | 0.05% |
To switch to running PROC HPCANDISC in distributed mode, specify valid values for the NODES=, INSTALL=, and HOST= options in the PERFORMANCE statement. An alternative to specifying the INSTALL= and HOST= options in the PERFORMANCE statement is to use the OPTIONS SET commands to set appropriate values for the GRIDHOST and GRIDINSTALLLOC environment variables. For information about setting these options or environment variables, see the section Processing Modes in Chapter 3: Shared Concepts and Topics.
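The environment-variable alternative might look like the following sketch. The host name and install path shown here are hypothetical placeholders, not values from this example; substitute the values for your own grid.

```sas
/* Hypothetical values; replace with your site's grid host and install path */
option set=GRIDHOST="hpa.example.com";
option set=GRIDINSTALLLOC="/opt/TKGrid";

proc hpcandisc data=ex2Data outstat=stats;
   var x:;
   class z;
   /* HOST= and INSTALL= are omitted because the environment variables supply them */
   performance details nodes=4;
run;
```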
The following statements provide an example. To run these statements successfully, you need to set the macro variables GRIDHOST and GRIDINSTALLLOC to resolve to appropriate values, or you can replace the references to macro variables with appropriate values.
    proc hpcandisc data=ex2Data outstat=stats;
       var x:;
       class z;
       performance details nodes=4
                   host="&GRIDHOST" install="&GRIDINSTALLLOC";
    run;
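One way to define the two macro variables before submitting the preceding statements is with %LET. This is a sketch only; the host name and install path are hypothetical and must be replaced with values that are valid at your site.

```sas
/* Hypothetical values; replace with your site's grid host and install path */
%let GRIDHOST       = hpa.example.com;
%let GRIDINSTALLLOC = /opt/TKGrid;
```

Because the PERFORMANCE statement encloses the references in double quotation marks, the macro processor resolves &GRIDHOST and &GRIDINSTALLLOC before the procedure runs.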
The execution mode in the “Performance Information” table shown in Output 5.2.3 indicates that the calculations were performed in a distributed environment that uses four nodes, each of which uses 32 threads.
Output 5.2.3: Performance Information in Distributed Mode
Performance Information | |
---|---|
Host Node | << your grid host >> |
Install Location | << your grid install location >> |
Execution Mode | Distributed |
Grid Mode | Symmetric |
Number of Compute Nodes | 4 |
Number of Threads per Node | 32 |
Another indication of distributed execution is the following message issued by all high-performance analytics procedures in the SAS log:
NOTE: The HPCANDISC procedure is executing in the distributed computing environment with 4 worker nodes.
Output 5.2.4 shows timing information for this distributed run of the HPCANDISC procedure. In contrast to the single-machine mode (where reading, levelizing, and processing the data dominated the time spent), the majority of time in the distributed mode run is spent distributing the data.
Output 5.2.4: Timing in Distributed Mode
Procedure Task Timing

Task | Seconds | Percent |
---|---|---|
Obtaining Settings | 0.00 | 0.00% |
Distributing Data | 7.56 | 88.91% |
Reading, Levelizing, and Processing Data | 0.84 | 9.86% |
Computing SSCP and Covariance Matrices | 0.00 | 0.00% |
Performing Canonical Analysis | 0.00 | 0.01% |
Producing Output Statistics Data Set | 0.01 | 0.13% |
Waiting on Client | 0.09 | 1.08% |
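As a rough check on the comparison between the two runs, the task times reported in Output 5.2.2 and Output 5.2.4 can be totaled in a DATA step. This is only a sketch: the inputs are the rounded values printed in the tables, the variable names are illustrative, and timings on other hardware will differ.

```sas
data _null_;
   /* Task times (seconds) copied from Output 5.2.2 */
   single = sum(62.78, 0.06, 0.03);
   /* Task times (seconds) copied from Output 5.2.4 */
   distributed = sum(0.00, 7.56, 0.84, 0.00, 0.00, 0.01, 0.09);
   /* Ratio of total single-machine time to total distributed time */
   ratio = single / distributed;
   /* Write the totals and the ratio to the SAS log */
   put single= 8.2 distributed= 8.2 ratio= 8.2;
run;
```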