The HPSUMMARY Procedure


PROC HPSUMMARY provides data summarization tools to compute descriptive statistics for variables across all observations and within groups of observations. For example, PROC HPSUMMARY does the following:

  • calculates descriptive statistics based on moments

  • calculates and estimates quantiles, including the median

  • calculates confidence limits for the mean

  • identifies extreme values

  • performs a t test

PROC HPSUMMARY does not display output. You can use the OUTPUT statement to store the statistics in a SAS data set.
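As a minimal sketch of the capabilities listed above, the following step summarizes two variables within groups and writes the results to a data set. It assumes the SASHELP.CLASS sample data set is available; the output data set name work.class_stats is hypothetical, and the statistic keywords shown are one possible selection rather than a recommended set.

```sas
/* Sketch: summarize Height and Weight within each level of Sex.      */
/* SASHELP.CLASS is an assumed sample data set; work.class_stats is a */
/* hypothetical output data set name.                                 */
proc hpsummary data=sashelp.class;
   class sex;                       /* grouping variable               */
   var height weight;               /* analysis variables              */
   output out=work.class_stats
          n= mean= std=             /* moment-based statistics         */
          median=                   /* quantile                        */
          min= max=                 /* extreme values                  */
          lclm= uclm=               /* confidence limits for the mean  */
          t= probt=                 /* t statistic and its p-value     */
          / autoname;               /* generate unique variable names  */
run;

/* The procedure displays no results, so print the stored statistics. */
proc print data=work.class_stats;
run;
```

The AUTONAME option is used here because several statistics are requested for more than one analysis variable, and each output variable needs a distinct name.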

PROC HPSUMMARY provides a vehicle for the parallel execution of summarization in a distributed computing environment. The following list summarizes the basic features of PROC HPSUMMARY:

  • provides the ability to execute summarization in parallel

  • enables you to control the level of parallelism per execution node and the number of nodes to engage

  • is highly multithreaded

  • manages data migration to the location of execution and movement back to the client machine as needed
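The control over parallelism described above is typically expressed through the PERFORMANCE statement. The following sketch assumes single-machine execution and an arbitrary thread count of 4; the data set names are placeholders. Requesting a number of nodes with the NODES= option applies to distributed mode, which depends on a configured cluster that is not shown here.

```sas
/* Sketch: limit the run to 4 threads and request timing details.  */
/* The thread count and the data set names are illustrative only.  */
proc hpsummary data=sashelp.class;
   performance nthreads=4 details;  /* threads to use, plus a timing table */
   var height weight;
   output out=work.perf_stats mean= / autoname;
run;
```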

Because the HPSUMMARY procedure is a high-performance analytical procedure, it also does the following:

  • enables you to run in distributed mode on a cluster of machines that distribute the data and the computations

  • enables you to run in single-machine mode on the server where SAS is installed

  • exploits all the available cores and concurrent threads, regardless of execution mode

For more information, see the section Processing Modes in Chapter 2: Shared Concepts and Topics.