Creating Summary Tables with the TABULATE Procedure |
Creating Hierarchical Tables to Report on Subgroups |
You can create a hierarchical table to report on subgroups of your data by crossing elements within a dimension. Crossing elements is the operation that combines two or more elements, such as class variables, analysis variables, format modifiers, statistics, or styles. Dimensions are automatically crossed. When you cross variables in a single dimension expression, values for one variable are placed within the values for the other variable in the same dimension. This forms a hierarchy of variables and, therefore, a hierarchical table. The order in which variables are listed when they are crossed determines the order of the headings in the table. In the column dimension, variables are stacked top to bottom; in the row dimension, left to right; and in the page dimension, front to back. You cross elements in a dimension expression by putting an asterisk between them. Note that two analysis variables cannot be crossed. Also, because dimensions are automatically crossed, all analysis variables must occur in one dimension.
The PROC TABULATE step that follows creates a two-dimensional summary table that crosses two variables and that answers the question, "What was the amount sold of each type of coffee maker by each sales representative?"
options linesize=84 pageno=1 nodate; proc tabulate data=year_sales format=comma10.; title1 'TruBlend Coffee Makers, Inc.'; title2 'Amount Sold Per Item by Each Sales Representative'; class SalesRep Type; var AmountSold; table SalesRep*Type, AmountSold; run;
The expression SalesRep*Type in the row dimension uses the asterisk operator to cross the values of the variable SalesRep with the values of the variable Type. Because SalesRep is listed before Type when crossed, and because the elements are crossed in the row dimension, values for Type will be listed to the right of values of SalesRep. Values for Type will be repeated for each value of SalesRep.
The following summary table displays the results:
TruBlend Coffee Makers, Inc. 1 Amount Sold Per Item by Each Sales Representative -------------------------------- | |AmountSold| | |----------| | | Sum | |-------------------+----------| |SalesRep |Type | | |---------+---------| | |Garcia |Deluxe | 46,778| | |---------+----------| | |Standard | 465,293| |---------+---------+----------| |Hollings-|Deluxe | 37,620| |worth |---------+----------| | |Standard | 309,626| |---------+---------+----------| |Jensen |Deluxe | 40,590| | |---------+----------| | |Standard | 420,573| --------------------------------
Notice the hierarchy of values that are created when the values for Type are repeated to the right of each value of SalesRep.
Formatting Output |
You can override formats in summary table output by crossing variables with format modifiers. You cross a variable with a format modifier by putting an asterisk between them.
The PROC TABULATE step that follows creates a two-dimensional summary table that crosses a variable with a format modifier and that answers the question, "What was the amount sold of each type of coffee maker by each sales representative?"
options linesize=84 pageno=1 nodate; proc tabulate data=year_sales format=comma10.; title1 'TruBlend Coffee Makers, Inc.'; title2 'Amount Sold Per Item by Each Sales Representative'; class SalesRep Type; var AmountSold; table SalesRep*Type, AmountSold*f=dollar16.2; run;
The expression AmountSold*f=dollar16.2 in the column dimension uses the asterisk operator to cross the values of the variable AmountSold with the SAS format modifier f=dollar16.2 . The values for AmountSold will now display using the DOLLAR16.2 format. The DOLLAR16.2 format is better suited for dollar figures than the COMMA10. format, which is specified as the default in the PROC TABULATE statement.
The following summary table displays the results:
Crossing Variables with Format Modifiers
TruBlend Coffee Makers, Inc. 1 Amount Sold Per Item by Each Sales Representative -------------------------------------- | | AmountSold | | |----------------| | | Sum | |-------------------+----------------| |SalesRep |Type | | |---------+---------| | |Garcia |Deluxe | $46,777.50| | |---------+----------------| | |Standard | $465,293.28| |---------+---------+----------------| |Hollings-|Deluxe | $37,620.00| |worth |---------+----------------| | |Standard | $309,626.10| |---------+---------+----------------| |Jensen |Deluxe | $40,590.00| | |---------+----------------| | |Standard | $420,572.60| --------------------------------------
Calculating Descriptive Statistics |
You can request descriptive statistics for a variable by crossing that variable with the appropriate statistic keyword. Crossing either a class variable or an analysis variable with a statistic tells PROC TABULATE what type of calculations to perform. Note that two statistics cannot be crossed. Also, because dimensions are automatically crossed, all statistics must occur in one dimension.
The default statistic crossed with a class variable is the N statistic or frequency. Class variables can only be crossed with frequency and percent frequency statistics. The default statistic crossed with an analysis variable is the SUM statistic. Analysis variables can be crossed with any of the many descriptive statistics that are available with PROC TABULATE including commonly used statistics like MIN, MAX, MEAN, STD, and MEDIAN. For a complete list of statistics available for use with analysis variables, see "Statistics Available in PROC TABULATE" in the Base SAS Procedures Guide.
The PROC TABULATE step that follows creates a two-dimensional summary table that crosses elements with a statistic and that answers the question, "What was the average amount per sale of each type of coffee maker by each sales representative?"
options linesize=84 pageno=1 nodate; proc tabulate data=year_sales format=comma10.; title1 'TruBlend Coffee Makers, Inc.'; title2 'Average Amount Sold Per Item by Each Sales Representative'; class SalesRep Type; var AmountSold; table SalesRep*Type, AmountSold*mean*f=dollar16.2; run;
In this program, the column dimension crosses the variable AmountSold with the statistic mean and with the format modifier f=dollar16.2 . The MEAN statistic provides the arithmetic mean for AmountSold.
The following summary table displays the results:
Crossing a Variable with a Statistic
TruBlend Coffee Makers, Inc. 1 Average Amount Sold Per Item by Each Sales Representative -------------------------------------- | | AmountSold | | |----------------| | | Mean | |-------------------+----------------| |SalesRep |Type | | |---------+---------| | |Garcia |Deluxe | $11,694.38| | |---------+----------------| | |Standard | $12,924.81| |---------+---------+----------------| |Hollings-|Deluxe | $4,702.50| |worth |---------+----------------| | |Standard | $12,901.09| |---------+---------+----------------| |Jensen |Deluxe | $10,147.50| | |---------+----------------| | |Standard | $12,369.78| --------------------------------------
Reporting on Multiple Statistics |
You can create summary tables that report on two or more statistics by concatenating variables. Concatenating is the operation that joins the information of two or more elements, such as class variables, analysis variables, or statistics, by placing the output of the second and subsequent elements immediately after the output of the first element. You concatenate elements in a dimension expression by putting a blank space between them.
The PROC TABULATE step that follows creates a two-dimensional summary table that uses concatenation and that answers the question, "How many sales were made, and what was the total sales figure for each type of coffee maker sold by each sales representative?"
options linesize=84 pageno=1 nodate; proc tabulate data=year_sales format=comma10.; title1 'TruBlend Coffee Makers, Inc.'; title2 'Sales Summary by Representative and Product'; class SalesRep Type; var AmountSold; table SalesRep*Type, AmountSold*n AmountSold*f=dollar16.2; run;
In this program, because the expressions AmountSold*n and AmountSold*f=dollar16.2 in the column dimension are separated by a blank space, their output will be concatenated.
The following summary table displays the results:
TruBlend Coffee Makers, Inc. 1 Sales Summary by Representative and Product 1 2 ------------------------------------------------- | |AmountSold| AmountSold | | |----------+----------------| | | N | Sum | |-------------------+----------+----------------| |SalesRep |Type | | | |---------+---------| | | |Garcia |Deluxe | 4| $46,777.50| | |---------+----------+----------------| | |Standard | 36| $465,293.28| |---------+---------+----------+----------------| |Hollings-|Deluxe | 8| $37,620.00| |worth |---------+----------+----------------| | |Standard | 24| $309,626.10| |---------+---------+----------+----------------| |Jensen |Deluxe | 4| $40,590.00| | |---------+----------+----------------| | |Standard | 34| $420,572.60| -------------------------------------------------
In this summary table the frequency (N) of AmountSold is shown in the same table as the SUM of AmountSold .
Reducing Code and Applying a Single Label to Multiple Elements |
You can use parentheses to group concatenated elements (variables, formats, statistics, and so on) that are concatenated or crossed with a common element. This can reduce the amount of code used and can change how labels are displayed. The PROC TABULATE step that follows uses parentheses to group elements that are crossed with AmountSold and answers the question, "How many sales were made, and what was the total sales figure for each type of coffee maker sold by each sales representative?"
options linesize=84 pageno=1 nodate; proc tabulate data=year_sales format=comma10.; title1 'TruBlend Coffee Makers, Inc.'; title2 'Sales Summary by Representative and Product'; class SalesRep Type; var AmountSold; table SalesRep*Type, AmountSold*(n sum*f=dollar16.2); run;
In this program, AmountSold*(n sum*f=dollar16.2) takes the place of AmountSold*n AmountSold*f=dollar16.2 . Notice the default statistic SUM from AmountSold*f=dollar16.2 must now be included in the expression. This is because the format modifier must be crossed with a variable or a statistic. It cannot be in the expression by itself.
The following summary table displays the results:
Using Parentheses to Group Elements
TruBlend Coffee Makers, Inc. 1 Sales Summary by Representative and Product ------------------------------------------------- | | AmountSold | | |---------------------------| | | N | Sum | |-------------------+----------+----------------| |SalesRep |Type | | | |---------+---------| | | |Garcia |Deluxe | 4| $46,777.50| | |---------+----------+----------------| | |Standard | 36| $465,293.28| |---------+---------+----------+----------------| |Hollings-|Deluxe | 8| $37,620.00| |worth |---------+----------+----------------| | |Standard | 24| $309,626.10| |---------+---------+----------+----------------| |Jensen |Deluxe | 4| $40,590.00| | |---------+----------+----------------| | |Standard | 34| $420,572.60| -------------------------------------------------
Note that the label, AmountSold, spans multiple columns rather than appearing twice in the summary table, as it does in Concatenating Variables.
Getting Summaries for All Variables |
You can summarize all of the class variables in a dimension with the universal class variable ALL. ALL can be concatenated with each of the three dimensions of the TABLE statement and within groups of elements delimited by parentheses. The PROC TABULATE step that follows creates a two-dimensional summary table with the universal class variable ALL, and answers the question, "For each sales representative and for all of the sales representatives as a group, how many sales were made, what was the average amount per sale, and what was the amount sold?"
options linesize=84 pageno=1 nodate; proc tabulate data=year_sales format=comma10.; title1 'TruBlend Coffee Makers, Inc.'; title2 'Sales Report'; class SalesRep Type; var AmountSold; table SalesRep*Type all, AmountSold*(n (mean sum)*f=dollar16.2); run;
In this program, the TABLE statement now includes the universal class variable ALL in the row dimension. SalesRep and Type will be summarized.
The following summary table displays the results:
Crossing with the Universal Class Variable ALL
TruBlend Coffee Makers, Inc. 1
Sales Report
------------------------------------------------------------------
| | AmountSold |
| |--------------------------------------------|
| | N | Mean | Sum |
|-------------------+----------+----------------+----------------|
|SalesRep |Type | | | |
|---------+---------| | | |
|Garcia |Deluxe | 4| $11,694.38| $46,777.50|
| |---------+----------+----------------+----------------|
| |Standard | 36| $12,924.81| $465,293.28|
|---------+---------+----------+----------------+----------------|
|Hollings-|Deluxe | 8| $4,702.50| $37,620.00|
|worth |---------+----------+----------------+----------------|
| |Standard | 24| $12,901.09| $309,626.10|
|---------+---------+----------+----------------+----------------|
|Jensen |Deluxe | 4| $10,147.50| $40,590.00|
| |---------+----------+----------------+----------------|
| |Standard | 34| $12,369.78| $420,572.60|
|-------------------+----------+----------------+----------------|
|All 1 | 110| $12,004.36| $1,320,479.48|
------------------------------------------------------------------
This summary table reports the frequency (N), the MEAN, and the SUM of AmountSold for each category of SalesRep and Type. This data has been summarized for all categories of SalesRep and Type in the row labeled All .
Defining Labels |
You can add your own labels to a summary table or remove headings from a summary table by assigning labels to variables in the TABLE statement. Simply follow the variable with an equal sign (=) followed by either the desired label or by a blank space in quotation marks. A blank space in quotation marks removes the heading from the summary table. The PROC TABULATE step that follows creates a two-dimensional summary table that uses labels in the TABLE statement and that answers the question, "What is the percent of total sales and average amount sold by each sales representative of each type of coffee maker and all coffee makers?"
options linesize=84 pageno=1 nodate; proc tabulate data=year_sales format=comma10.; title1 'TruBlend Coffee Makers, Inc.'; title2 'Sales Performance'; class SalesRep Type; var AmountSold; table SalesRep='Sales Representative'1 * (Type='Type of Coffee Maker'1 all) all, AmountSold=' '4 * (N='Sales'2 SUM='Amount'2 *f=dollar16.2 colpctsum='% Sales'3 mean='Average Sale'2 *f=dollar16.2); run;
The numbered items in the previous program correspond to the following:
The following summary table displays the results:
Using Labels to Customize Summary Tables
TruBlend Coffee Makers, Inc. 1 Sales Performance 1 ---------------------------------------------------------------------------- | 2 | Sales | Amount | % Sales | Averag Sale | |-------------------+----------+----------------+----------+----------------| 3 |Sales |Type of | | | | | |Represen-|Coffee | | | | | |tative |Maker | | | | | |---------+---------| | | | | |Garcia |Deluxe | 4| $46,777.50| 4| $11,694.38| | |---------+----------+----------------+----------+----------------| | |Standard | 36| $465,293.28| 35| $12,924.81| | |---------+----------+----------------+----------+----------------| | |All | 40| $512,070.78| 39| $12,801.77| |---------+---------+----------+----------------+----------+----------------| |Hollings-|Type of | | | | | |worth |Coffee | | | | | | |Maker | | | | | | |---------| | | | | | |Deluxe | 8| $37,620.00| 3| $4,702.50| | |---------+----------+----------------+----------+----------------| | |Standard | 24| $309,626.10| 23| $12,901.09| | |---------+----------+----------------+----------+----------------| | |All | 32| $347,246.10| 26| $10,851.44| |---------+---------+----------+----------------+----------+----------------| |Jensen |Type of | | | | | | |Coffee | | | | | | |Maker | | | | | | |---------| | | | | | |Deluxe | 4| $40,590.00| 3| $10,147.50| | |---------+----------+----------------+----------+----------------| | |Standard | 34| $420,572.60| 32| $12,369.78| | |---------+----------+----------------+----------+----------------| | |All | 38| $461,162.60| 35| $12,135.86| |-------------------+----------+----------------+----------+----------------| |All | 110| $1,320,479.48| 100| $12,004.36| -----------------------------------------------------------------------------
The numbered items in the previous SAS output correspond to the following:
The labels `Sales', `Amount', `% Sales', and `Average Sale' replace the frequency (N), SUM, COLPCTSUM, and MEAN respectively. | |
Using Styles and the Output Delivery System |
If you use the Output Delivery System to create output from PROC TABULATE, for any destination other than Listing or Output destinations, you can do the following:
Set certain style elements (such as font style, font weight, and color) that the procedure uses for various parts of the table.
Specify style elements for the labels for variables by adding the option to the CLASS statement.
Specify style elements for cells in the summary table by crossing the STYLE= option with an element of a dimension expression.
When it is used in a dimension expression, the STYLE= option must be enclosed within square brackets ([ and ]) or braces ({ and }). The PROC TABULATE step that follows creates a two-dimensional summary table that uses the STYLE= option in a CLASS statement and in the TABLE statement and that answers the question, "What is the percent of total sales and average amount sold by each sales representative of each type of coffee maker and all coffee makers?"
options linesize=84 pageno=1 nodate;
ods html file='summary-table.htm';1 ods printer file='summary-table.ps';2
proc tabulate data=year_sales format=comma10.; title1 'TruBlend Coffee Makers, Inc.'; title2 'Sales Performance'; class SalesRep; class Type / style=[font_style=italic]3 ; var AmountSold; table SalesRep='Sales Representative'*(Type='Type of Coffee Maker' all*[style=[background=yellow font_weight=bold]]4 ) all*[style=[font_weight=bold]]5 , AmountSold=' '*(colpctsum='% Sales' mean='Average Sale'* f=dollar16.2); run; ods html close;6 ods printer close;7
The numbered items in the previous program correspond to the following:
The following summary table displays the results:
Using Style Modifiers and the ODS HTML Statement
This summary table shows the effects of the three uses of the STYLE= option with the ODS HTML statement in the previous SAS program:
The subtotals for each value of sales representative are highlighted in a lighter color (yellow) and are bold.
The following summary table displays the results:
Using Style Modifiers and the ODS PRINTER Statement
This summary table shows the effects of the three uses of the STYLE= option with the ODS PRINTER statement in the previous SAS program:
Ordering Class Variables |
You can control the order in which class variable values and their headings display in a summary table with the ORDER= option. You can use the ORDER= option with the PROC TABULATE statement and with individual CLASS statements. The syntax is ORDER=sort-order . The four possible sort orders (DATA, FORMATTED, FREQ, and UNFORMATTED) are defined in Review of SAS Tools. The PROC TABULATE step that follows creates a two-dimensional summary table that uses the ORDER= option with the PROC TABULATE statement to order all class variables by frequency, and that answers the question, "Which quarter produced the greatest number of sales, and which sales representative made the most sales overall?"
options linesize=84 pageno=1 nodate; proc tabulate data=year_sales format=comma10. order=freq; title1 'TruBlend Coffee Makers, Inc.'; title2 'Quarterly Sales and Representative Sales by Frequency'; class SalesRep Quarter; table SalesRep all, Quarter all; run;
The following summary table displays the results of this program:
TruBlend Coffee Makers, Inc. 1 Quarterly Sales and Representative Sales by Frequency ---------------------------------------------------------------------------- | | Quarter | | | |-------------------------------------------| | | | 3 1 | 1 | 2 | 4 | All | 3 | |----------+----------+----------+----------+----------| | | N | N | N | N | N | |-------------------+----------+----------+----------+----------+----------| |SalesRep | | | | | | |-------------------| | | | | | |Garcia 2 | 21| 8| 6| 5| 40| |-------------------+----------+----------+----------+----------+----------| |Jensen | 21| 5| 6| 6| 38| |-------------------+----------+----------+----------+----------+----------| |Hollingsworth | 15| 5| 6| 6| 32| |-------------------+----------+----------+----------+----------+----------| |All 3 | 57| 18| 18| 17| 110| ----------------------------------------------------------------------------
The numbered items in the previous SAS output correspond to the following:
Copyright © 2012 by SAS Institute Inc., Cary, NC, USA. All rights reserved.