
The TABULATE Procedure

Example 13: Using Denominator Definitions to Display Basic Frequency Counts and Percentages


Procedure features:

   TABLE statement:
      ALL class variable
      denominator definitions (angle bracket operators)
      N statistic
      PCTN statistic

Other features:

   FORMAT procedure


Crosstabulation tables (also called contingency tables and stub-and-banner reports) show combined frequency distributions for two or more variables. This table shows frequency counts for females and males within each of four job classes. The table also shows the percentage that each frequency count represents of the total for its row (all employees in that job class), the total for its column (all employees of that gender), and the total for all employees.


Program

options nodate pageno=1 linesize=80 pagesize=60;
data jobclass;
   input Gender Occupation @@;
   datalines;
1 1  1 1  1 1  1 1  1 1  1 1  1 1
1 2  1 2  1 2  1 2  1 2  1 2  1 2
1 3  1 3  1 3  1 3  1 3  1 3  1 3
1 1  1 1  1 1  1 2  1 2  1 2  1 2
1 2  1 2  1 3  1 3  1 4  1 4  1 4
1 4  1 4  1 4  1 1  1 1  1 1  1 1
1 1  1 2  1 2  1 2  1 2  1 2  1 2
1 2  1 3  1 3  1 3  1 3  1 4  1 4
1 4  1 4  1 4  1 1  1 3  2 1  2 1
2 1  2 1  2 1  2 1  2 1  2 2  2 2
2 2  2 2  2 2  2 3  2 3  2 3  2 4
2 4  2 4  2 4  2 4  2 4  2 1  2 3
2 3  2 3  2 3  2 3  2 4  2 4  2 4
2 4  2 4  2 1  2 1  2 1  2 1  2 1
2 2  2 2  2 2  2 2  2 2  2 2  2 2
2 3  2 3  2 4  2 4  2 4  2 1  2 1
2 1  2 1  2 1  2 2  2 2  2 2  2 3
2 3  2 3  2 3  2 4
;
proc format;
   value gendfmt 1='Female'
                 2='Male'
             other='*** Data Entry Error ***';
   value occupfmt 1='Technical'
                  2='Manager/Supervisor'
                  3='Clerical'
                  4='Administrative'
              other='*** Data Entry Error ***';
run;
proc tabulate data=jobclass format=8.2;
   /* Gender and Occupation define the categories of the table */
   class gender occupation;
   /* Rows: each job class plus the ALL summary, crossed with N and three PCTN statistics */
   table (occupation='Job Class' all='All Jobs')
            *(n='Number of employees'*f=9.
              pctn<gender all>='Percent of row total'
              pctn<occupation all>='Percent of column total'
              pctn='Percent of total'),
   /* Columns: each gender plus the ALL summary; RTS= sets the width of the row title space */
         gender='Gender' all='All Employees' / rts=50;
   /* Display the numeric codes as descriptive text */
   format gender gendfmt. occupation occupfmt.;
   title 'Gender Distribution';
   title2 'within Job Classes';
run;

Output

                              Gender Distribution                              1
                               within Job Classes

--------------------------------------------------------------------------------
|                                                |      Gender       |         |
|                                                |-------------------|   All   |
|                                                | Female  |  Male   |Employees|
|------------------------------------------------+---------+---------+---------|
|Job Class              |                        |         |         |         |
|-----------------------+------------------------|         |         |         |
|Technical              |Number of employees     |       16|       18|       34|
|                       |------------------------+---------+---------+---------|
|                       |Percent of row total    |    47.06|    52.94|   100.00|
|                       |------------------------+---------+---------+---------|
|                       |Percent of column total |    26.23|    29.03|    27.64|
|                       |------------------------+---------+---------+---------|
|                       |Percent of total        |    13.01|    14.63|    27.64|
|-----------------------+------------------------+---------+---------+---------|
|Manager/Supervisor     |Number of employees     |       20|       15|       35|
|                       |------------------------+---------+---------+---------|
|                       |Percent of row total    |    57.14|    42.86|   100.00|
|                       |------------------------+---------+---------+---------|
|                       |Percent of column total |    32.79|    24.19|    28.46|
|                       |------------------------+---------+---------+---------|
|                       |Percent of total        |    16.26|    12.20|    28.46|
|-----------------------+------------------------+---------+---------+---------|
|Clerical               |Number of employees     |       14|       14|       28|
|                       |------------------------+---------+---------+---------|
|                       |Percent of row total    |    50.00|    50.00|   100.00|
|                       |------------------------+---------+---------+---------|
|                       |Percent of column total |    22.95|    22.58|    22.76|
|                       |------------------------+---------+---------+---------|
|                       |Percent of total        |    11.38|    11.38|    22.76|
|-----------------------+------------------------+---------+---------+---------|
|Administrative         |Number of employees     |       11|       15|       26|
|                       |------------------------+---------+---------+---------|
|                       |Percent of row total    |    42.31|    57.69|   100.00|
|                       |------------------------+---------+---------+---------|
|                       |Percent of column total |    18.03|    24.19|    21.14|
|                       |------------------------+---------+---------+---------|
|                       |Percent of total        |     8.94|    12.20|    21.14|
|-----------------------+------------------------+---------+---------+---------|
|All Jobs               |Number of employees     |       61|       62|      123|
|                       |------------------------+---------+---------+---------|
|                       |Percent of row total    |    49.59|    50.41|   100.00|
|                       |------------------------+---------+---------+---------|
|                       |Percent of column total |   100.00|   100.00|   100.00|
|                       |------------------------+---------+---------+---------|
|                       |Percent of total        |    49.59|    50.41|   100.00|
--------------------------------------------------------------------------------

A Closer Look

The part of the TABLE statement that defines the rows of the table uses the PCTN statistic to calculate three different percentages.

In all calculations of PCTN, the numerator is N, the frequency count for one cell of the table. The denominator for each occurrence of PCTN is determined by the denominator definition. The denominator definition appears in angle brackets after the keyword PCTN. It is a list of one or more expressions. The list tells PROC TABULATE which frequency counts to sum for the denominator.
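As a minimal sketch of a single denominator definition, the following step keeps only the crossing of Occupation and Gender and one PCTN statistic. It assumes that the JOBCLASS data set from the Program section has already been created; the label 'Percent of job class' is chosen here for illustration only.

```sas
/* Assumes the JOBCLASS data set from the Program section already exists. */
proc tabulate data=jobclass format=8.2;
   class gender occupation;
   /* <gender>: sum the frequency counts over all values of Gender */
   /* within the same job class to form the denominator            */
   table occupation*(n*f=9.
                     pctn<gender>='Percent of job class'),
         gender;
run;
```

Because this table contains no ALL summaries, it has only one subtable, and Gender contributes to it, so a single element in the denominator definition is enough. The percentages should agree with the 'Percent of row total' rows for the four job classes in the output above.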


Analyzing the Structure of the Table

Taking a close look at the structure of the table helps you understand how PROC TABULATE uses the denominator definitions. The following simplified version of the TABLE statement clarifies the basic structure of the table:

table occupation='Job Class' all='All Jobs',
      gender='Gender' all='All Employees';

The table is a concatenation of four subtables. In this report, each subtable is a crossing of one class variable in the row dimension and one class variable in the column dimension. Each crossing establishes one or more categories. A category is a combination of unique values of class variables, such as female, technical or all, clerical. The following table describes each subtable.

Contents of Subtables
Class variables contributing   Description of frequency counts    Number of
to the subtable                                                   categories
----------------------------   --------------------------------   ----------
Occupation and Gender          number of females in each job or   8
                               number of males in each job
All and Gender                 number of females or               2
                               number of males
Occupation and All             number of people in each job       4
All and All                    number of people in all jobs       1

The following figure highlights these subtables and the frequency counts for each category.

Illustration of the Four Subtables

[Illustration of the Four Subtables]
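One way to see the four subtables separately is to request each crossing in its own TABLE statement. This is only a sketch, assuming that the JOBCLASS data set from the Program section exists; it omits the formats and labels of the full program.

```sas
/* Assumes the JOBCLASS data set from the Program section already exists. */
proc tabulate data=jobclass;
   class gender occupation;
   table occupation, gender*n;   /* Subtable 1: Occupation and Gender, 8 categories */
   table all,        gender*n;   /* Subtable 2: All and Gender, 2 categories        */
   table occupation, all*n;      /* Subtable 3: Occupation and All, 4 categories    */
   table all,        all*n;      /* Subtable 4: All and All, 1 category             */
run;
```

Each TABLE statement produces its own table, so the number of cells in each one can be compared with the number of categories listed in the table above.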


Interpreting Denominator Definitions

The following fragment of the TABLE statement defines the denominator definitions for this report. The PCTN keyword and the denominator definitions are highlighted.

   table (occupation='Job Class' all='All Jobs')
            *(n='Number of employees'*f=9.
              pctn<gender all>='Percent of row total'
              pctn<occupation all>='Percent of column total'
              pctn='Percent of total'),

Each use of PCTN nests a row of statistics within each value of Occupation and All. Each denominator definition tells PROC TABULATE which frequency counts to sum for the denominators in that row. This section explains how PROC TABULATE interprets these denominator definitions.


Row Percentages

The part of the TABLE statement that calculates the row percentages and that labels the row is

   pctn<gender all>='Percent of row total'

Consider how PROC TABULATE interprets this denominator definition for each subtable.

Subtable 1: Occupation and Gender

[Subtable 1: Occupation and Gender]

PROC TABULATE looks at the first element in the denominator definition, Gender, and asks if Gender contributes to the subtable. Because Gender does contribute to the subtable, PROC TABULATE uses it as the denominator definition. This denominator definition tells PROC TABULATE to sum the frequency counts for all occurrences of Gender within the same value of Occupation.

For example, the denominator for the category female, technical is the sum of all frequency counts for all categories in this subtable for which the value of Occupation is technical. There are two such categories: female, technical and male, technical. The corresponding frequency counts are 16 and 18. Therefore, the denominator for this category is 16+18, or 34.

Subtable 2: All and Gender

[Subtable 2: All and Gender]

PROC TABULATE looks at the first element in the denominator definition, Gender, and asks if Gender contributes to the subtable. Because Gender does contribute to the subtable, PROC TABULATE uses it as the denominator definition. This denominator definition tells PROC TABULATE to sum the frequency counts for all occurrences of Gender in the subtable.

For example, the denominator for the category all, female is the sum of the frequency counts for all, female and all, male. The corresponding frequency counts are 61 and 62. Therefore, the denominator for cells in this subtable is 61+62, or 123.

Subtable 3: Occupation and All

[Subtable 3: Occupation and All]

PROC TABULATE looks at the first element in the denominator definition, Gender, and asks if Gender contributes to the subtable. Because Gender does not contribute to the subtable, PROC TABULATE looks at the next element in the denominator definition, which is All. The variable All does contribute to this subtable, so PROC TABULATE uses it as the denominator definition. All is a reserved class variable with only one category. Therefore, this denominator definition tells PROC TABULATE to use the frequency count of All as the denominator.

For example, the denominator for the category clerical, all is the frequency count for that category, 28.

Note: Because the numerator and the denominator are the same in these table cells, the row percentages in this subtable are all 100.

Subtable 4: All and All

[Subtable 4: All and All]

PROC TABULATE looks at the first element in the denominator definition, Gender, and asks if Gender contributes to the subtable. Because Gender does not contribute to the subtable, PROC TABULATE looks at the next element in the denominator definition, which is All. The variable All does contribute to this subtable, so PROC TABULATE uses it as the denominator definition. All is a reserved class variable with only one category. Therefore, this denominator definition tells PROC TABULATE to use the frequency count of All as the denominator.

There is only one category in this subtable: all, all. The denominator for this category is 123.

Note: Because the numerator and the denominator are the same in this table cell, the row percentage in this subtable is 100.
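The row-percentage denominators for Subtable 1 can also be checked outside PROC TABULATE. The following PROC SQL sketch assumes that the JOBCLASS data set from the Program section exists; the table name CELL_COUNTS and the column names N, ROW_TOTAL, and ROW_PCT are hypothetical names chosen for this illustration.

```sas
/* Assumes the JOBCLASS data set from the Program section already exists. */
proc sql;
   /* One row per Occupation-by-Gender category with its frequency count */
   create table cell_counts as
   select occupation, gender, count(*) as n
   from jobclass
   group by occupation, gender;

   /* Grouping by Occupation makes SUM(n) the total for the job class. */
   /* PROC SQL remerges that total with each category of the group.    */
   select occupation, gender, n,
          sum(n) as row_total,
          100 * n / sum(n) as row_pct format=8.2
   from cell_counts
   group by occupation
   order by occupation, gender;
quit;
```

ROW_TOTAL plays the role of the denominator that `pctn<gender all>` selects in Subtable 1. The SAS log is expected to contain a note that the query requires remerging summary statistics with the original data.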


Column Percentages

The part of the TABLE statement that calculates the column percentages and labels the row is

   pctn<occupation all>='Percent of column total'

Consider how PROC TABULATE interprets this denominator definition for each subtable.

Subtable 1: Occupation and Gender

[Subtable 1: Occupation and Gender]

PROC TABULATE looks at the first element in the denominator definition, Occupation, and asks if Occupation contributes to the subtable. Because Occupation does contribute to the subtable, PROC TABULATE uses it as the denominator definition. This denominator definition tells PROC TABULATE to sum the frequency counts for all occurrences of Occupation within the same value of Gender.

For example, the denominator for the category manager/supervisor, male is the sum of all frequency counts for all categories in this subtable for which the value of Gender is male. There are four such categories: technical, male; manager/supervisor, male; clerical, male; and administrative, male. The corresponding frequency counts are 18, 15, 14, and 15. Therefore, the denominator for this category is 18+15+14+15, or 62.

Subtable 2: All and Gender

[Subtable 2: All and Gender]

PROC TABULATE looks at the first element in the denominator definition, Occupation, and asks if Occupation contributes to the subtable. Because Occupation does not contribute to the subtable, PROC TABULATE looks at the next element in the denominator definition, which is All. Because the variable All does contribute to this subtable, PROC TABULATE uses it as the denominator definition. All is a reserved class variable with only one category. Therefore, this denominator definition tells PROC TABULATE to use the frequency count for All as the denominator.

For example, the denominator for the category all, female is the frequency count for that category, 61.

Note: Because the numerator and the denominator are the same in these table cells, the column percentages in this subtable are all 100.

Subtable 3: Occupation and All

[Subtable 3: Occupation and All]

PROC TABULATE looks at the first element in the denominator definition, Occupation, and asks if Occupation contributes to the subtable. Because Occupation does contribute to the subtable, PROC TABULATE uses it as the denominator definition. This denominator definition tells PROC TABULATE to sum the frequency counts for all occurrences of Occupation in the subtable.

For example, the denominator for the category technical, all is the sum of the frequency counts for technical, all; manager/supervisor, all; clerical, all; and administrative, all. The corresponding frequency counts are 34, 35, 28, and 26. Therefore, the denominator for this category is 34+35+28+26, or 123.

Subtable 4: All and All

[Subtable 4: All and All]

PROC TABULATE looks at the first element in the denominator definition, Occupation, and asks if Occupation contributes to the subtable. Because Occupation does not contribute to the subtable, PROC TABULATE looks at the next element in the denominator definition, which is All. Because the variable All does contribute to this subtable, PROC TABULATE uses it as the denominator definition. All is a reserved class variable with only one category. Therefore, this denominator definition tells PROC TABULATE to use the frequency count of All as the denominator.

There is only one category in this subtable: all, all. The frequency count for this category is 123.

Note: Because the numerator and the denominator are the same in this calculation, the column percentage in this subtable is 100.
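The column-percentage arithmetic for the male categories in Subtable 1 can be worked through in a short DATA step. This is an illustrative sketch only: the four counts are copied by hand from the Male column of the output, and the names MALE_N, DENOM, and PCT are hypothetical.

```sas
data _null_;
   /* Male frequency counts from the output, in job class order 1 through 4 */
   array male_n[4] _temporary_ (18 15 14 15);

   /* Denominator: sum over all values of Occupation for Gender = male */
   denom = sum(of male_n[*]);
   put denom=;

   /* Each column percentage is one count divided by that denominator */
   do i = 1 to 4;
      pct = 100 * male_n[i] / denom;
      put i= pct= 8.2;
   end;
run;
```

The values written to the log can be compared with the Male column of the 'Percent of column total' rows in the output.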


Total Percentages

The part of the TABLE statement that calculates the total percentages and labels the row is

   pctn='Percent of total'

If you do not specify a denominator definition, then PROC TABULATE obtains the denominator for a cell by totaling all the frequency counts in the subtable. The following table summarizes the process for all subtables in this example.

Denominators for Total Percentages
Class variables contributing   Frequency counts                  Total
to the subtable
----------------------------   -------------------------------   -----
Occupation and Gender          16, 18, 20, 15, 14, 14, 11, 15    123
Occupation and All             34, 35, 28, 26                    123
All and Gender                 61, 62                            123
All and All                    123                               123

Consequently, the denominator for total percentages is always 123.
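As an independent cross-check of all three kinds of percentages for the Occupation and Gender subtable, PROC FREQ can write them to a data set. The sketch below assumes that the JOBCLASS data set from the Program section exists; the output data set name CELL_PCTS is a hypothetical name chosen for this illustration.

```sas
/* Assumes the JOBCLASS data set from the Program section already exists. */
proc freq data=jobclass noprint;
   /* OUT= saves the cell counts; OUTPCT adds row and column percentages */
   tables occupation*gender / out=cell_pcts outpct;
run;

proc print data=cell_pcts noobs;
   /* COUNT   = frequency count for the category            */
   /* PERCENT = percent of total (denominator 123)          */
   /* PCT_ROW = percent within the same value of Occupation */
   /* PCT_COL = percent within the same value of Gender     */
   format percent pct_row pct_col 8.2;
run;
```

This covers only the eight categories of the Occupation and Gender crossing. The ALL summaries in the PROC TABULATE report have no counterpart in this data set.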
