The HPSUMMARY Procedure

PROC HPSUMMARY Features

PROC HPSUMMARY provides data summarization tools to compute descriptive statistics for variables across all observations and within groups of observations. For example, PROC HPSUMMARY does the following:

  • calculates descriptive statistics based on moments

  • calculates and estimates quantiles, including the median

  • calculates confidence limits for the mean

  • identifies extreme values

  • performs a $t$ test

PROC HPSUMMARY does not display output. You can use the OUTPUT statement to store the statistics in a SAS data set.
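As a minimal sketch of this workflow, the following step requests several of the statistics listed above and writes them to a data set with the OUTPUT statement. The input table sashelp.class and the output name work.class_stats are illustrative choices, not part of this section. The statistic keywords and the AUTONAME option are assumed to follow the same conventions as the OUTPUT statement in PROC MEANS.

```sas
/* Illustrative only: sashelp.class is a sample table that ships with SAS. */
proc hpsummary data=sashelp.class;
   class sex;                      /* summarize within groups of observations */
   var height weight;              /* analysis variables */
   /* Store moment-based statistics, the median, confidence limits for the
      mean, and the t statistic with its p-value. AUTONAME asks SAS to
      generate a distinct output variable name for each statistic. */
   output out=work.class_stats
          mean= std= median= lclm= uclm= t= probt= / autoname;
run;

/* PROC HPSUMMARY prints nothing itself, so display the stored results. */
proc print data=work.class_stats;
run;
```

Because the procedure produces no displayed output, a separate step such as PROC PRINT is needed to inspect the stored statistics.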

PROC HPSUMMARY can execute summarization in parallel in a distributed computing environment. The following list summarizes the basic features of PROC HPSUMMARY:

  • provides the ability to execute summarization in parallel

  • enables you to control the level of parallelism per execution node and the number of nodes to engage

  • is highly multithreaded

  • manages data migration to the location of execution and movement back to the client machine as needed
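The degree of parallelism is typically requested through the PERFORMANCE statement that high-performance procedures share. The following sketch assumes the NODES=, NTHREADS=, and DETAILS options of that statement. The specific values and the data set names are illustrative only.

```sas
/* Illustrative only: sashelp.class is a sample table that ships with SAS. */
proc hpsummary data=sashelp.class;
   /* NODES=0 requests single-machine mode, so this step does not need a grid.
      NTHREADS=4 caps the number of threads at four (an arbitrary value).
      DETAILS requests a table that breaks down where time was spent. */
   performance nodes=0 nthreads=4 details;
   var height weight;
   output out=work.class_perf mean= / autoname;
run;
```

In an environment that is configured for distributed execution, replacing NODES=0 with a positive node count would request that many nodes. That variant is not shown as runnable code because it depends on site-specific grid settings.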

Because the HPSUMMARY procedure is a high-performance analytical procedure, it also does the following:

  • enables you to run in distributed mode on a cluster of machines that distribute the data and the computations

  • enables you to run in single-machine mode on the server where SAS is installed

  • exploits all the available cores and concurrent threads, regardless of execution mode

For more information, see the section "Processing Modes" in Chapter 3, "Shared Concepts and Topics."