Previous Page | Next Page

The TABULATE Procedure

Example 11: Reporting on Multiple-Choice Survey Data


Procedure features:

TABLE statement:

N statistic

Other features:

FORMAT procedure

TRANSPOSE procedure

Data set options:

RENAME=


This report of listener preferences shows how many listeners select each type of programming during each of seven time periods on a typical weekday. The data was collected by a survey, and the results were stored in a SAS data set. Although this data set contains all the information needed for this report, the information is not arranged in a way that PROC TABULATE can use.

To make this crosstabulation of time of day and choice of radio programming, you must have a data set that contains a variable for time of day and a variable for programming preference. PROC TRANSPOSE reshapes the data into a new data set that contains these variables. Once the data are in the appropriate form, PROC TABULATE creates the report.


Collecting the Data

The following figure shows the survey form that is used to collect data.

Completed Survey Form

[Completed Survey Form]

An external file contains the raw data for the survey. Several lines from that file appear here.

967 32 f 5 3 5
7 5 5 5 7 0 0 0 8 7 0 0 8 0
781 30 f 2 3 5
5 0 0 0 5 0 0 0 4 7 5 0 0 0
859 39 f 1 0 5
1 0 0 0 1 0 0 0 0 0 0 0 0 0

. . . more data lines . . .

859 32 m .25 .25 1
1 0 0 0 0 0 0 0 1 0 0 0 0 0


Program

 Note about code
options nodate pageno=1 linesize=132 pagesize=40;
 Note about code
data radio;
   infile 'input-file' missover;
 Note about code
   input /(Time1-Time7) ($1. +1);
   listener=_n_;
run;
 Note about code
proc format;
   value $timefmt 'Time1'='6-9 a.m.'
                  'Time2'='9 a.m. to noon'
                  'Time3'='noon to 1 p.m.'
                  'Time4'='1-4 p.m.'
                  'Time5'='4-6 p.m.'
                  'Time6'='6-10 p.m.'
                  'Time7'='10 p.m. to 2 a.m.'
                    other='*** Data Entry Error ***';
   value $pgmfmt      '0'="Don't Listen"
                  '1','2'='Rock and Top 40'
                      '3'='Country'
              '4','5','6'='Jazz, Classical, and Easy Listening'
                      '7'='News/ Information /Talk'
                      '8'='Other'
                    other='*** Data Entry Error ***';
run;
 Note about code
proc transpose data=radio
               out=radio_transposed(rename=(col1=Choice))
               name=Timespan;
   by listener;
   var time1-time7;
 Note about code
   format timespan $timefmt. choice $pgmfmt.;
run;
 Note about code
proc tabulate data=radio_transposed format=12.;
 Note about code
   class timespan choice;
 Note about code
   table timespan='Time of Day',
         choice='Choice of Radio Program'*n='Number of Listeners';
 Note about code
   title 'Listening Preferences on Weekdays';
run;

Output

                                                 Listening Preferences on Weekdays                                                 1

          ---------------------------------------------------------------------------------------------------------------
          |                               |                           Choice of Radio Program                           |
          |                               |-----------------------------------------------------------------------------|
          |                               |            |            |            |   Jazz,    |            |            |
          |                               |            |            |            | Classical, |   News/    |            |
          |                               |            |Rock and Top|            |  and Easy  |Information |            |
          |                               |Don't Listen|     40     |  Country   | Listening  |   /Talk    |   Other    |
          |                               |------------+------------+------------+------------+------------+------------|
          |                               | Number of  | Number of  | Number of  | Number of  | Number of  | Number of  |
          |                               | Listeners  | Listeners  | Listeners  | Listeners  | Listeners  | Listeners  |
          |-------------------------------+------------+------------+------------+------------+------------+------------|
          |Time of Day                    |            |            |            |            |            |            |
          |-------------------------------|            |            |            |            |            |            |
          |6-9 a.m.                       |          34|         143|           7|          39|          96|          17|
          |-------------------------------+------------+------------+------------+------------+------------+------------|
          |9 a.m. to noon                 |         214|          59|           5|          51|           3|           4|
          |-------------------------------+------------+------------+------------+------------+------------+------------|
          |noon to 1 p.m.                 |         238|          55|           3|          27|           9|           4|
          |-------------------------------+------------+------------+------------+------------+------------+------------|
          |1-4 p.m.                       |         216|          60|           5|          50|           2|           3|
          |-------------------------------+------------+------------+------------+------------+------------+------------|
          |4-6 p.m.                       |          56|         130|           6|          57|          69|          18|
          |-------------------------------+------------+------------+------------+------------+------------+------------|
          |6-10 p.m.                      |         202|          54|           9|          44|          20|           7|
          |-------------------------------+------------+------------+------------+------------+------------+------------|
          |10 p.m. to 2 a.m.              |         264|          29|           3|          36|           2|           2|
          ---------------------------------------------------------------------------------------------------------------

A Closer Look


Reshape the data

The original input data set has all the information that you need to make the crosstabular report, but PROC TABULATE cannot use the information in that form. PROC TRANSPOSE rearranges the data so that each observation in the new data set contains the variable Listener, a variable for time of day, and a variable for programming preference. The following figure illustrates the transposition. PROC TABULATE uses this new data set to create the crosstabular report.

PROC TRANSPOSE restructures data so that values that were stored in one observation are written to one variable. You can specify which variables you want to transpose. This section illustrates how PROC TRANSPOSE reshapes the data. The following section explains the PROC TRANSPOSE step in this example.

When you transpose with BY processing, as this example does, you create from each BY group one observation for each variable that you transpose. In this example, Listener is the BY variable. Each observation in the input data set is a BY group because the value of Listener is unique for each observation.

This example transposes seven variables, Time1 through Time7. Therefore, the output data set has seven observations from each BY group (each observation) in the input data set.

Transposing Two Observations

[Transposing Two Observations]


Understanding the PROC TRANSPOSE Step

Here is a detailed explanation of the PROC TRANSPOSE step that reshapes the data:

proc transpose data=radio  1  
               out=radio_transposed(rename=(col1=Choice))   2  
               name=Timespan;   3  
   by listener;   4  
   var time1-time7;    5 
   format timespan $timefmt. choice $pgmfmt.;   6  
run;

[1] The DATA= option specifies the input data set.

[2] The OUT= option specifies the output data set. The RENAME= data set option renames the transposed variable from COL1 (the default name) to Choice.

[3] The NAME= option specifies the name for the variable in the output data set that contains the name of the variable that is being transposed to create the current observation. By default, the name of this variable is _NAME_.

[4] The BY statement identifies Listener as the BY variable.

[5] The VAR statement identifies Time1 through Time7 as the variables to transpose.

[6] The FORMAT statement assigns formats to Timespan and Choice. The PROC TABULATE step that creates the report does not need to format Timespan and Choice because the formats are stored with these variables.

Previous Page | Next Page | Top of Page