Previous Page | Next Page

Creating Summary Tables with the TABULATE Procedure

Creating More Sophisticated Summary Tables


Creating Hierarchical Tables to Report on Subgroups

You can create a hierarchical table to report on subgroups of your data by crossing elements within a dimension. Crossing elements is the operation that combines two or more elements, such as class variables, analysis variables, format modifiers, statistics, or styles. Dimensions are automatically crossed. When you cross variables in a single dimension expression, values for one variable are placed within the values for the other variable in the same dimension. This forms a hierarchy of variables and, therefore, a hierarchical table. The order in which variables are listed when they are crossed determines the order of the headings in the table. In the column dimension, variables are stacked top to bottom; in the row dimension, left to right; and in the page dimension, front to back. You cross elements in a dimension expression by putting an asterisk between them. Note that two analysis variables cannot be crossed. Also, because dimensions are automatically crossed, all analysis variables must occur in one dimension.

The PROC TABULATE step that follows creates a two-dimensional summary table that crosses two variables and that answers the question, "What was the amount sold of each type of coffee maker by each sales representative?"

options linesize=84 pageno=1 nodate;

proc tabulate data=year_sales format=comma10.;
   title1 'TruBlend Coffee Makers, Inc.';
   title2 'Amount Sold Per Item by Each Sales Representative'; 
   class SalesRep Type;
   var AmountSold;
   table SalesRep*Type,
         AmountSold;
run;

The expression SalesRep*Type in the row dimension uses the asterisk operator to cross the values of the variable SalesRep with the values of the variable Type. Because SalesRep is listed before Type when crossed, and because the elements are crossed in the row dimension, values for Type will be listed to the right of values of SalesRep. Values for Type will be repeated for each value of SalesRep.

The following summary table displays the results:

Crossing Variables

                            TruBlend Coffee Makers, Inc.                           1
                 Amount Sold Per Item by Each Sales Representative

                          --------------------------------
                          |                   |AmountSold|
                          |                   |----------|
                          |                   |   Sum    |
                          |-------------------+----------|
                          |SalesRep |Type     |          |
                          |---------+---------|          |
                          |Garcia   |Deluxe   |    46,778|
                          |         |---------+----------|
                          |         |Standard |   465,293|
                          |---------+---------+----------|
                          |Hollings-|Deluxe   |    37,620|
                          |worth    |---------+----------|
                          |         |Standard |   309,626|
                          |---------+---------+----------|
                          |Jensen   |Deluxe   |    40,590|
                          |         |---------+----------|
                          |         |Standard |   420,573|
                          --------------------------------

Notice the hierarchy of values that are created when the values for Type are repeated to the right of each value of SalesRep.


Formatting Output

You can override formats in summary table output by crossing variables with format modifiers. You cross a variable with a format modifier by putting an asterisk between them.

The PROC TABULATE step that follows creates a two-dimensional summary table that crosses a variable with a format modifier and that answers the question, "What was the amount sold of each type of coffee maker by each sales representative?"

options linesize=84 pageno=1 nodate;

proc tabulate data=year_sales format=comma10.;
   title1 'TruBlend Coffee Makers, Inc.';
   title2 'Amount Sold Per Item by Each Sales Representative'; 
   class SalesRep Type;
   var AmountSold;
   table SalesRep*Type,
         AmountSold*f=dollar16.2;
run;

The expression AmountSold*f=dollar16.2 in the column dimension uses the asterisk operator to cross the values of the variable AmountSold with the SAS format modifier f=dollar16.2 . The values for AmountSold will now display using the DOLLAR16.2 format. The DOLLAR16.2 format is better suited for dollar figures than the COMMA10. format, which is specified as the default in the PROC TABULATE statement.

The following summary table displays the results:

Crossing Variables with Format Modifiers

                            TruBlend Coffee Makers, Inc.                           1
                 Amount Sold Per Item by Each Sales Representative

                       --------------------------------------
                       |                   |   AmountSold   |
                       |                   |----------------|
                       |                   |      Sum       |
                       |-------------------+----------------|
                       |SalesRep |Type     |                |
                       |---------+---------|                |
                       |Garcia   |Deluxe   |      $46,777.50|
                       |         |---------+----------------|
                       |         |Standard |     $465,293.28|
                       |---------+---------+----------------|
                       |Hollings-|Deluxe   |      $37,620.00|
                       |worth    |---------+----------------|
                       |         |Standard |     $309,626.10|
                       |---------+---------+----------------|
                       |Jensen   |Deluxe   |      $40,590.00|
                       |         |---------+----------------|
                       |         |Standard |     $420,572.60|
                       --------------------------------------

Calculating Descriptive Statistics

You can request descriptive statistics for a variable by crossing that variable with the appropriate statistic keyword. Crossing either a class variable or an analysis variable with a statistic tells PROC TABULATE what type of calculations to perform. Note that two statistics cannot be crossed. Also, because dimensions are automatically crossed, all statistics must occur in one dimension.

The default statistic crossed with a class variable is the N statistic or frequency. Class variables can only be crossed with frequency and percent frequency statistics. The default statistic crossed with an analysis variable is the SUM statistic. Analysis variables can be crossed with any of the many descriptive statistics that are available with PROC TABULATE including commonly used statistics like MIN, MAX, MEAN, STD, and MEDIAN. For a complete list of statistics available for use with analysis variables, see "Statistics Available in PROC TABULATE" in the Base SAS Procedures Guide.

The PROC TABULATE step that follows creates a two-dimensional summary table that crosses elements with a statistic and that answers the question, "What was the average amount per sale of each type of coffee maker by each sales representative?"

options linesize=84 pageno=1 nodate;

proc tabulate data=year_sales format=comma10.;
   title1 'TruBlend Coffee Makers, Inc.';
   title2 'Average Amount Sold Per Item by Each Sales Representative';
   class SalesRep Type;
   var AmountSold;
   table SalesRep*Type,
         AmountSold*mean*f=dollar16.2;
run;

In this program, the column dimension crosses the variable AmountSold with the statistic mean and with the format modifier f=dollar16.2 . The MEAN statistic provides the arithmetic mean for AmountSold.

The following summary table displays the results:

Crossing a Variable with a Statistic

                            TruBlend Coffee Makers, Inc.                           1
             Average Amount Sold Per Item by Each Sales Representative

                       --------------------------------------
                       |                   |   AmountSold   |
                       |                   |----------------|
                       |                   |      Mean      |
                       |-------------------+----------------|
                       |SalesRep |Type     |                |
                       |---------+---------|                |
                       |Garcia   |Deluxe   |      $11,694.38|
                       |         |---------+----------------|
                       |         |Standard |      $12,924.81|
                       |---------+---------+----------------|
                       |Hollings-|Deluxe   |       $4,702.50|
                       |worth    |---------+----------------|
                       |         |Standard |      $12,901.09|
                       |---------+---------+----------------|
                       |Jensen   |Deluxe   |      $10,147.50|
                       |         |---------+----------------|
                       |         |Standard |      $12,369.78|
                       --------------------------------------

Reporting on Multiple Statistics

You can create summary tables that report on two or more statistics by concatenating variables. Concatenating is the operation that joins the information of two or more elements, such as class variables, analysis variables, or statistics, by placing the output of the second and subsequent elements immediately after the output of the first element. You concatenate elements in a dimension expression by putting a blank space between them.

The PROC TABULATE step that follows creates a two-dimensional summary table that uses concatenation and that answers the question, "How many sales were made, and what was the total sales figure for each type of coffee maker sold by each sales representative?"

options linesize=84 pageno=1 nodate;

proc tabulate data=year_sales format=comma10.;
   title1 'TruBlend Coffee Makers, Inc.';
   title2 'Sales Summary by Representative and Product';
   class SalesRep Type;
   var AmountSold;
   table SalesRep*Type,
         AmountSold*n AmountSold*f=dollar16.2;
run;

In this program, because the expressions AmountSold*n and AmountSold*f=dollar16.2 in the column dimension are separated by a blank space, their output will be concatenated.

The following summary table displays the results:

Concatenating Variables

                            TruBlend Coffee Makers, Inc.                           1
                    Sales Summary by Representative and Product

                                          1              2 
                 -------------------------------------------------
                 |                   |AmountSold|   AmountSold   |
                 |                   |----------+----------------|
                 |                   |    N     |      Sum       |
                 |-------------------+----------+----------------|
                 |SalesRep |Type     |          |                |
                 |---------+---------|          |                |
                 |Garcia   |Deluxe   |         4|      $46,777.50|
                 |         |---------+----------+----------------|
                 |         |Standard |        36|     $465,293.28|
                 |---------+---------+----------+----------------|
                 |Hollings-|Deluxe   |         8|      $37,620.00|
                 |worth    |---------+----------+----------------|
                 |         |Standard |        24|     $309,626.10|
                 |---------+---------+----------+----------------|
                 |Jensen   |Deluxe   |         4|      $40,590.00|
                 |         |---------+----------+----------------|
                 |         |Standard |        34|     $420,572.60|
                 -------------------------------------------------

In this summary table the frequency (N) of AmountSold [1] is shown in the same table as the SUM of AmountSold [2].


Reducing Code and Applying a Single Label to Multiple Elements

You can use parentheses to group concatenated elements (variables, formats, statistics, and so on) that are concatenated or crossed with a common element. This can reduce the amount of code used and can change how labels are displayed. The PROC TABULATE step that follows uses parentheses to group elements that are crossed with AmountSold and answers the question, "How many sales were made, and what was the total sales figure for each type of coffee maker sold by each sales representative?"

options linesize=84 pageno=1 nodate;

proc tabulate data=year_sales format=comma10.;
   title1 'TruBlend Coffee Makers, Inc.';
   title2 'Sales Summary by Representative and Product';
   class SalesRep Type;
   var AmountSold;
   table SalesRep*Type,
         AmountSold*(n sum*f=dollar16.2);
run;

In this program, AmountSold*(n sum*f=dollar16.2) takes the place of AmountSold*n AmountSold*f=dollar16.2 . Notice the default statistic SUM from AmountSold*f=dollar16.2 must now be included in the expression. This is because the format modifier must be crossed with a variable or a statistic. It cannot be in the expression by itself.

The following summary table displays the results:

Using Parentheses to Group Elements

                            TruBlend Coffee Makers, Inc.                           1
                    Sales Summary by Representative and Product

                 -------------------------------------------------
                 |                   |        AmountSold         |
                 |                   |---------------------------|
                 |                   |    N     |      Sum       |
                 |-------------------+----------+----------------|
                 |SalesRep |Type     |          |                |
                 |---------+---------|          |                |
                 |Garcia   |Deluxe   |         4|      $46,777.50|
                 |         |---------+----------+----------------|
                 |         |Standard |        36|     $465,293.28|
                 |---------+---------+----------+----------------|
                 |Hollings-|Deluxe   |         8|      $37,620.00|
                 |worth    |---------+----------+----------------|
                 |         |Standard |        24|     $309,626.10|
                 |---------+---------+----------+----------------|
                 |Jensen   |Deluxe   |         4|      $40,590.00|
                 |         |---------+----------+----------------|
                 |         |Standard |        34|     $420,572.60|
                 -------------------------------------------------

Note that the label, AmountSold, spans multiple columns rather than appearing twice in the summary table, as it does in Concatenating Variables.


Getting Summaries for All Variables

You can summarize all of the class variables in a dimension with the universal class variable ALL. ALL can be concatenated with each of the three dimensions of the TABLE statement and within groups of elements delimited by parentheses. The PROC TABULATE step that follows creates a two-dimensional summary table with the universal class variable ALL, and answers the question, "For each sales representative and for all of the sales representatives as a group, how many sales were made, what was the average amount per sale, and what was the amount sold?"

options linesize=84 pageno=1 nodate;

proc tabulate data=year_sales format=comma10.;
   title1 'TruBlend Coffee Makers, Inc.';
   title2 'Sales Report';
   class SalesRep Type;
   var AmountSold;
   table SalesRep*Type all,
         AmountSold*(n (mean sum)*f=dollar16.2);
run;

In this program, the TABLE statement now includes the universal class variable ALL in the row dimension. SalesRep and Type will be summarized.

The following summary table displays the results:

Crossing with the Universal Class Variable ALL

                            TruBlend Coffee Makers, Inc.                           1
                                    Sales Report

         ------------------------------------------------------------------
         |                   |                 AmountSold                 |
         |                   |--------------------------------------------|
         |                   |    N     |      Mean      |      Sum       |
         |-------------------+----------+----------------+----------------|
         |SalesRep |Type     |          |                |                |
         |---------+---------|          |                |                |
         |Garcia   |Deluxe   |         4|      $11,694.38|      $46,777.50|
         |         |---------+----------+----------------+----------------|
         |         |Standard |        36|      $12,924.81|     $465,293.28|
         |---------+---------+----------+----------------+----------------|
         |Hollings-|Deluxe   |         8|       $4,702.50|      $37,620.00|
         |worth    |---------+----------+----------------+----------------|
         |         |Standard |        24|      $12,901.09|     $309,626.10|
         |---------+---------+----------+----------------+----------------|
         |Jensen   |Deluxe   |         4|      $10,147.50|      $40,590.00|
         |         |---------+----------+----------------+----------------|
         |         |Standard |        34|      $12,369.78|     $420,572.60|
         |-------------------+----------+----------------+----------------|
         |All 1              |       110|      $12,004.36|   $1,320,479.48|
         ------------------------------------------------------------------

This summary table reports the frequency (N), the MEAN, and the SUM of AmountSold for each category of SalesRep and Type. This data has been summarized for all categories of SalesRep and Type in the row labeled All [1].


Defining Labels

You can add your own labels to a summary table or remove headings from a summary table by assigning labels to variables in the TABLE statement. Simply follow the variable with an equal sign (=) followed by either the desired label or by a blank space in quotation marks. A blank space in quotation marks removes the heading from the summary table. The PROC TABULATE step that follows creates a two-dimensional summary table that uses labels in the TABLE statement and that answers the question, "What is the percent of total sales and average amount sold by each sales representative of each type of coffee maker and all coffee makers?"

options linesize=84 pageno=1 nodate;

proc tabulate data=year_sales format=comma10.;
   title1 'TruBlend Coffee Makers, Inc.';
   title2 'Sales Performance';
   class SalesRep Type;
   var AmountSold;
   table SalesRep='Sales Representative'1 *
         (Type='Type of Coffee Maker'1  all) all,
         AmountSold=' '4 *
         (N='Sales'2  
         SUM='Amount'2 *f=dollar16.2 
         colpctsum='% Sales'3  
         mean='Average Sale'2 *f=dollar16.2);
run;

The numbered items in the previous program correspond to the following:

[1] The variables SalesRep and Type are assigned labels.

[2] The frequency statistic N, the statistic SUM, and the statistic MEAN are assigned labels.

[3] The statistic COLPCTSUM is used to calculate the percentage of the value in a single table cell in relation to the total of the values in the column and is assigned the label `% Sales'.

[4] The variable AmountSold is assigned a blank label. As a result, the heading for AmountSold does not appear in the summary table.

The following summary table displays the results:

Using Labels to Customize Summary Tables

                            TruBlend Coffee Makers, Inc.                           1
                                 Sales Performance
                                                   1 
   ----------------------------------------------------------------------------
   |                2 |  Sales   |     Amount     | % Sales  |  Averag Sale   |
   |-------------------+----------+----------------+----------+----------------|
 3 |Sales   |Type of   |          |                |          |                |
   |Represen-|Coffee   |          |                |          |                |
   |tative   |Maker    |          |                |          |                |
   |---------+---------|          |                |          |                |
   |Garcia   |Deluxe   |         4|      $46,777.50|         4|      $11,694.38|
   |         |---------+----------+----------------+----------+----------------|
   |         |Standard |        36|     $465,293.28|        35|      $12,924.81|
   |         |---------+----------+----------------+----------+----------------|
   |         |All      |        40|     $512,070.78|        39|      $12,801.77|
   |---------+---------+----------+----------------+----------+----------------|
   |Hollings-|Type of  |          |                |          |                |
   |worth    |Coffee   |          |                |          |                |
   |         |Maker    |          |                |          |                |
   |         |---------|          |                |          |                |
   |         |Deluxe   |         8|      $37,620.00|         3|       $4,702.50|
   |         |---------+----------+----------------+----------+----------------|
   |         |Standard |        24|     $309,626.10|        23|      $12,901.09|
   |         |---------+----------+----------------+----------+----------------|
   |         |All      |        32|     $347,246.10|        26|      $10,851.44|
   |---------+---------+----------+----------------+----------+----------------|
   |Jensen   |Type of  |          |                |          |                |
   |         |Coffee   |          |                |          |                |
   |         |Maker    |          |                |          |                |
   |         |---------|          |                |          |                |
   |         |Deluxe   |         4|      $40,590.00|         3|      $10,147.50|
   |         |---------+----------+----------------+----------+----------------|
   |         |Standard |        34|     $420,572.60|        32|      $12,369.78|
   |         |---------+----------+----------------+----------+----------------|
   |         |All      |        38|     $461,162.60|        35|      $12,135.86|
   |-------------------+----------+----------------+----------+----------------|
   |All                |       110|   $1,320,479.48|       100|      $12,004.36|
   -----------------------------------------------------------------------------

The numbered items in the previous SAS output correspond to the following:

[1] No heading for the variable AmountSold is displayed.

[2] The labels `Sales', `Amount', `% Sales', and `Average Sale' replace the frequency (N), SUM, COLPCTSUM, and MEAN respectively.

[3] labels replace the variables SalesRep and Type.


Using Styles and the Output Delivery System

If you use the Output Delivery System to create output from PROC TABULATE, for any destination other than Listing or Output destinations, you can do the following:

When it is used in a dimension expression, the STYLE= option must be enclosed within square brackets ([ and ])  or braces ({ and }). The PROC TABULATE step that follows creates a two-dimensional summary table that uses the STYLE= option in a CLASS statement and in the TABLE statement and that answers the question, "What is the percent of total sales and average amount sold by each sales representative of each type of coffee maker and all coffee makers?"

options linesize=84 pageno=1 nodate;

ods html file='summary-table.htm';1 
ods printer file='summary-table.ps';2 

proc tabulate data=year_sales format=comma10.;
   title1 'TruBlend Coffee Makers, Inc.';
   title2 'Sales Performance';
   class SalesRep;
   class Type / style=[font_style=italic]3 ;
   var AmountSold;
   table SalesRep='Sales Representative'*(Type='Type of Coffee Maker' 
         all*[style=[background=yellow font_weight=bold]]4 ) 
         all*[style=[font_weight=bold]]5 ,
         AmountSold=' '*(colpctsum='% Sales' mean='Average Sale'*
         f=dollar16.2);
run;

ods html close;6 
ods printer close;7 

The numbered items in the previous program correspond to the following:

[1] The ODS HTML statement opens the HTML destination and creates HTML output. FILE= identifies the file that contains the HTML output. Some browsers require an extension of HTM or HTML on the filename.

[2] The ODS PRINTER statement opens the Printer destination and creates Printer output. FILE= identifies the file that contains the Printer output.

[3] The STYLE= option is specified in the second CLASS statement, which sets the font style of the label for Type to italic. The label for SalesRep is not affected by the STYLE= option because it is in a separate CLASS statement.

[4] The universal class variable ALL is crossed with the STYLE= option, which sets the background for the table cells to yellow and the font weight for these cells to bold.

[5] The universal class variable ALL is crossed with the STYLE= option, which sets the font weight for the table cells to bold.

[6] The last ODS HTML statement closes the HTML destination and all of the files that are associated with it. You must close the HTML destination before you can view the HTML output with a browser.

[7] The last ODS PRINTER statement closes the Printer destination. You must close the Printer destination before you can print the output on a physical printer.

The following summary table displays the results:

Using Style Modifiers and the ODS HTML Statement

[Using Style Modifiers and the ODS HTML Statement]

This summary table shows the effects of the three uses of the STYLE= option with the ODS HTML statement in the previous SAS program:

The following summary table displays the results:

Using Style Modifiers and the ODS PRINTER Statement

[Using Style Modifiers and the ODS PRINTER Statement]

This summary table shows the effects of the three uses of the STYLE= option with the ODS PRINTER statement in the previous SAS program:


Ordering Class Variables

You can control the order in which class variable values and their headings display in a summary table with the ORDER= option. You can use the ORDER= option with the PROC TABULATE statement and with individual CLASS statements. The syntax is ORDER=sort-order . The four possible sort orders (DATA, FORMATTED, FREQ, and UNFORMATTED) are defined in Review of SAS Tools. The PROC TABULATE step that follows creates a two-dimensional summary table that uses the ORDER= option with the PROC TABULATE statement to order all class variables by frequency, and that answers the question, "Which quarter produced the greatest number of sales, and which sales representative made the most sales overall?"

options linesize=84 pageno=1 nodate;

proc tabulate data=year_sales format=comma10. order=freq;
   title1 'TruBlend Coffee Makers, Inc.';
   title2 'Quarterly Sales and Representative Sales by Frequency';
   class SalesRep Quarter;
   table SalesRep all,
         Quarter all;
run;

The following summary table displays the results of this program:

Ordering Class Variables

                            TruBlend Coffee Makers, Inc.                           1
               Quarterly Sales and Representative Sales by Frequency

    ----------------------------------------------------------------------------
    |                   |                  Quarter                  |          |
    |                   |-------------------------------------------|          |
    |                   |    3 1   |    1     |    2     |    4     |   All    | 3 
    |                   |----------+----------+----------+----------+----------|
    |                   |    N     |    N     |    N     |    N     |    N     |
    |-------------------+----------+----------+----------+----------+----------|
    |SalesRep           |          |          |          |          |          |
    |-------------------|          |          |          |          |          |
    |Garcia 2           |        21|         8|         6|         5|        40|
    |-------------------+----------+----------+----------+----------+----------|
    |Jensen             |        21|         5|         6|         6|        38|
    |-------------------+----------+----------+----------+----------+----------|
    |Hollingsworth      |        15|         5|         6|         6|        32|
    |-------------------+----------+----------+----------+----------+----------|
    |All 3              |        57|        18|        18|        17|       110|
    ----------------------------------------------------------------------------

The numbered items in the previous SAS output correspond to the following:

[1] The order of the values of the class variable Quarter shows that most sales occurred in quarter 3 followed by quarters 1, 2, and then 4.

[2] The order of the values of the class variable SalesRep shows that Garcia made the most sales overall, followed by Jensen and then Hollingsworth.

[3] The universal class variable ALL is included in both dimensions of this example to show the frequency data that SAS used to order the data when creating the summary table.

Previous Page | Next Page | Top of Page