TABULATE Procedure

Example 4: Using Multilabel Formats

Features:

CLASS statement options: MLF

PROC TABULATE statement options: FORMAT=

TABLE statement:
  • ALL class variable
  • concatenation (blank) operator
  • crossing (*) operator
  • grouping elements (parentheses) operator
  • label
  • variable list

Other features:

FORMAT procedure

FORMAT statement

VALUE statement options: MULTILABEL

Data set: CARSURVEY

Details

This example does the following:
  • shows how to specify a multilabel format in the VALUE statement of PROC FORMAT
  • shows how to activate multilabel format processing using the MLF option with the CLASS statement
  • demonstrates the behavior of the N statistic when multilabel format processing is activated

Program

data carsurvey;
   input Rater Age Progressa Remark Jupiter Dynamo;
   datalines;
1   38  94  98  84  80
2   49  96  84  80  77
3   16  64  78  76  73
4   27  89  73  90  92

 ... more data lines ...

77   61  92  88  77  85
78   24  87  88  88  91
79   18  54  50  62  74
80   62  90  91  90  86
;
proc format;
   value agefmt (multilabel notsorted)
         15 - 29 = 'Below 30 years'
         30 - 50 = 'Between 30 and 50'
       51 - high = 'Over 50 years'
         15 - 19 = '15 to 19'
         20 - 25 = '20 to 25'
         25 - 39 = '25 to 39'
         40 - 55 = '40 to 55'
       56 - high = '56 and above';
run;
proc tabulate data=carsurvey format=10.;
   class age / mlf;
   var progressa remark jupiter dynamo;
   table age all, n all='Potential Car Names'*(progressa remark
      jupiter dynamo)*mean;
   
   title1 "Rating Four Potential Car Names";
   title2 "Rating Scale 0-100 (100 is the highest rating)";
   format age agefmt.;
run;

Program Description

Create the CARSURVEY data set. CARSURVEY contains data from a survey that a car manufacturer distributed to a focus group of potential customers who were brought together to evaluate new car names. Each observation in the data set contains an identification number, the participant's age, and the participant's ratings of four car names. A DATA step creates the data set.
data carsurvey;
   input Rater Age Progressa Remark Jupiter Dynamo;
   datalines;
1   38  94  98  84  80
2   49  96  84  80  77
3   16  64  78  76  73
4   27  89  73  90  92

 ... more data lines ...

77   61  92  88  77  85
78   24  87  88  88  91
79   18  54  50  62  74
80   62  90  91  90  86
;
Create the AGEFMT. format. The FORMAT procedure creates a multilabel format for ages by using the MULTILABEL option. A multilabel format is one in which multiple labels can be assigned to the same value, in this case because of overlapping ranges. Each value is represented in the table for each range in which it occurs. The NOTSORTED option stores the ranges in the order in which they are defined.
proc format;
   value agefmt (multilabel notsorted)
         15 - 29 = 'Below 30 years'
         30 - 50 = 'Between 30 and 50'
       51 - high = 'Over 50 years'
         15 - 19 = '15 to 19'
         20 - 25 = '20 to 25'
         25 - 39 = '25 to 39'
         40 - 55 = '40 to 55'
       56 - high = '56 and above';
run;
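To confirm how the overlapping ranges were stored, the format definition can be listed with the FMTLIB option of PROC FORMAT. This is a minimal sketch that is not part of the original example; it assumes that AGEFMT. was created in the WORK library by the step above.

```sas
/* Sketch: print the stored definition of AGEFMT. so that the       */
/* overlapping ranges and their labels can be inspected.            */
/* Assumes the format is in the default WORK.FORMATS catalog.       */
proc format library=work fmtlib;
   select agefmt;
run;
```

The listing shows each range with its label, which makes it easy to see which ages fall into more than one range.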
Specify the table options. The FORMAT= option specifies 10. as the default format for the value in each table cell, which gives each cell a width of 10 with no decimal places.
proc tabulate data=carsurvey format=10.;
Specify subgroups for the analysis. The CLASS statement identifies Age as the class variable and uses the MLF option to activate multilabel format processing.
   class age / mlf;
Specify the analysis variables. The VAR statement specifies that PROC TABULATE calculate statistics on the Progressa, Remark, Jupiter, and Dynamo variables.
   var progressa remark jupiter dynamo;
Define the table rows and columns. The row dimension of the TABLE statement creates a row for each formatted value of Age. Multilabel formatting allows an observation to be included in multiple rows or age categories. The row dimension uses the ALL class variable to summarize information for all rows. The column dimension uses the N statistic to calculate the number of observations for each age group. Notice that the result of the N statistic crossed with the ALL class variable in the row dimension is the total number of observations instead of the sum of the N statistics for the rows. Because an observation can be counted in more than one row, the row counts add up to more than this total. The column dimension uses the ALL class variable at the beginning of a crossing to assign a label, Potential Car Names. The four nested columns calculate the mean ratings of the car names for each age group.
   table age all, n all='Potential Car Names'*(progressa remark
      jupiter dynamo)*mean;
   
Specify the titles.
   title1 "Rating Four Potential Car Names";
   title2 "Rating Scale 0-100 (100 is the highest rating)";
Format the output. The FORMAT statement assigns the user-defined format AGEFMT. to Age for this analysis.
   format age agefmt.;
run;
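
The behavior of the N statistic described above can be checked outside the table. The following is a hedged sketch, not part of the original example: it uses PROC MEANS, which also supports the MLF option in the CLASS statement, to write the per-label counts and the overall count to a data set. The names AGECOUNTS and N_OBS are hypothetical, and the step assumes that the CARSURVEY data set and the AGEFMT. format already exist.

```sas
/* Sketch: reproduce the per-label counts with PROC MEANS so they   */
/* can be compared with the overall count.                          */
/* AGECOUNTS and N_OBS are hypothetical names for illustration.     */
proc means data=carsurvey noprint;
   class age / mlf;            /* same multilabel processing as in PROC TABULATE */
   var progressa;
   format age agefmt.;
   output out=agecounts n=n_obs;
run;

/* _TYPE_=0 is the overall row; _TYPE_=1 rows are the age labels.   */
proc print data=agecounts;
   var _type_ age _freq_ n_obs;
run;
```

In the printed data set, the _FREQ_ value on the _TYPE_=0 row is the number of observations read. The _FREQ_ values on the _TYPE_=1 rows should add up to more than that, because each age that falls into several ranges is counted once per matching label.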

Output

Rating Four Potential Car Names