Frequency of Eye Color By Hair Color Crosstabulation

Overview

Use the Frequency transformations to produce one-way to n-way frequency and contingency (crosstabulation) tables. The Frequency transformations are based on the FREQ procedure, which generates frequency statistics. For more information about this procedure, see "The FREQ Procedure" section in Base SAS Procedures Guide.
There are two Frequency transformations: Frequency and One-Way Frequency. The Frequency transformation uses PROC FREQ to compute statistics for complex tests, measures of association, and stratified analysis of one-way to n-way tables. The One-Way Frequency transformation is used for simpler PROC FREQ analysis of one-way tables, each of which summarizes the distribution of a single classification variable. It can also compute tests for equal proportions, specified proportions, or the binomial proportion. The One-Way Frequency transformation offers only a subset of the options that are available for the Frequency transformation.
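As a rough illustration of the one-way tests mentioned above, the following sketch shows PROC FREQ requests for equal proportions, specified proportions, and a binomial proportion. The WORK.Color table, its column names (Eyes, Count), the data values, and the 40/45/15 test percentages are all hypothetical. The code that the One-Way Frequency transformation actually generates may differ.

```sas
/* Hypothetical sample data; not the COLOR table used in this topic. */
data work.Color;
   input Eyes $ Count @@;
   datalines;
blue 120  brown 133  green 47
;
run;

proc freq data=work.Color;
   weight Count;                          /* Count holds the cell frequencies */
   tables Eyes / chisq;                   /* test for equal proportions */
   tables Eyes / chisq testp=(40 45 15);  /* specified percentages, in level order */
   tables Eyes / binomial;                /* proportion of the first level (blue) */
run;
```

The TESTP= percentages must sum to 100 and are matched to the levels in their table order (here blue, brown, green).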
Both Frequency transformations control many aspects of the analysis, including the following:
  • grouping of rows by the values in one or more columns
  • how the rows appear in the report
  • which column or columns are analyzed
You can use the Frequency transformations to write frequency statistics to a target and to the Output tab of the Job Editor. ODS output in HTML, PDF, or RTF format can be sent to a folder on the SAS Application Server that executes the job, or to any folder that the SAS Application Server can access.
The target receives data only for the source columns that are involved in the analysis. The target requires two columns that either Frequency transformation populates: Count receives the total number of occurrences in a category, and Percent receives the percentages for each category.
You can specify grouping columns in the Frequency transformations. When you do this, a SAS BY statement orders target rows according to the values in the grouping columns. The Frequency transformations require that grouping columns be sorted in ascending order in the source. If you specify grouping columns, you can sort those columns before the Frequency transformation using a SAS Sort transformation.
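The following minimal sketch shows how the sorting requirement, the BY statement, and the Count and Percent target columns relate in plain SAS code. It assumes a hypothetical WORK.Color table with columns Region, Eyes, Hair, and Count and made-up values. It is not the code that the transformation generates.

```sas
/* Hypothetical, deliberately unsorted sample data. */
data work.Color;
   input Region Eyes $ Hair $ Count @@;
   datalines;
2 blue fair 46  1 blue fair 23  1 brown dark 40  2 brown dark 68
1 blue dark 11  2 blue dark 40  1 brown fair 7   2 brown fair 18
;
run;

/* Sort by the grouping column, as a Sort transformation would. */
proc sort data=work.Color;
   by Region;
run;

proc freq data=work.Color;
   by Region;                             /* one analysis per Region value */
   weight Count;
   tables Eyes*Hair / out=work.ColorOut;  /* OUT= data set receives COUNT and PERCENT */
run;

proc print data=work.ColorOut;
run;
```

The WORK.ColorOut data set contains only the BY variable, the two table variables, and the COUNT and PERCENT variables, which mirrors the target layout described above. Without the PROC SORT step, the BY statement stops with an error because the input is not in ascending Region order.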
For examples of how you can use the Frequency transformations, see Frequency of Eye Color By Hair Color Crosstabulation and One-Way Frequency of Eye Color By Region.

Problem

You want to generate frequency statistics.

Solution

You can use the Frequency transformation in a SAS Data Integration Studio job to produce one-way to n-way frequency and contingency (crosstabulation) tables. For example, you can create a job similar to the sample job featured in this topic. This sample job counts the individuals with each combination of hair and eye color, grouped by geographical region. The frequency statistics are sent to a target and to the Output tab in the Job Editor window. The sample job includes the following tasks:
  • Create and Populate the Job
  • Configure Analytical Options
  • Configure Reporting Options
  • Run the Job and View the Output

Tasks

Create and Populate the Job

Perform the following steps to create and populate the job:
  1. Create an empty SAS Data Integration Studio job.
  2. From the Analysis folder in the Transformations tree, select and drag a Frequency transformation and drop it in the empty job on the Diagram tab in the Job Editor window.
  3. Select and drag the source table from its folder and drop it before the Frequency transformation on the Diagram tab.
  4. Drag the cursor from the source table to the input port of the Frequency transformation. This action connects the transformation to the source.
  5. Right-click the Frequency transformation, select Ports, and then click Add Output Port in the pop-up menu. This step adds an output port to the transformation.
  6. Select and drag the target table from the Inventory tree. Then, drop it after the Frequency transformation on the Diagram tab.
  7. Drag the cursor from the Frequency transformation output port to the target table. This action connects the target to the transformation.
The following display shows a sample process flow diagram for a job that contains the Frequency transformation:
Sample Process Flow
Note that the source table for the sample job is named COLOR, and the target table is named COLOROUT.

Configure Analytical Options

Use the Options tab in the properties window for the Frequency transformation to configure the output for your analysis. Note that the Options tab is divided into two parts, with a list of categories on the left side and the options for the selected category on the right side.
Perform the following steps to set the options that you need for your job:
  1. In the Mappings tab, add the column Eye Color to the target table.
  2. In the Diagram tab of the Job Editor window, open the properties window for the Frequency transformation. Then, click the Options tab.
  3. Click Assign columns to access the Assign columns page. Use the column selection prompts to access the columns that you need for your job. For example, you can click Column Selection beside the Select columns for frequency distribution field to access the Select Data Source Items window, as shown in the following display:
    Sample Select Data Source Items Window
    In the sample job, the following column options are set on the Assign columns page:
    • In the Select columns for frequency distribution field, select the values of Eye Color and Hair Color.
    • To create a crosstabulation table, enter Eyes Hair Eyes*Hair in the Select frequency distribution tables field. The Eyes and Hair specifications each produce a one-way frequency table. The Eyes*Hair specification produces a crosstabulation table with eye color defining the table rows and hair color defining the table columns.
      Note: Any entry in the Select frequency distribution tables field overrides the values in the Select columns for frequency distribution field.
    • In the Select column that represents the frequency of observation (WEIGHT statement) field, select Count.
    These fields are shown in the following display:
    Frequency Column Options
  4. Set the Cell statistics to include in the output. The sample job selects the EXPECTED option, which displays expected cell frequencies in the table, and the CELLCHI2 option, which displays each cell's contribution to the chi-square statistic. The NOROW and NOCOL options suppress the display of row and column percentages in the table. (The CHISQ option, which produces the chi-square tests, is set with the table statistics in the next step.) These items are selected as shown in the following display:
    Cell statistics
  5. Set the options for the Table statistics in the appropriate fields. For this example, Perform Chi-square tests (TABLES CHISQ) and Order by values (ORDER) are set as shown in the following displays:
    Table Statistics Example
    Computation Options
  6. Set the options for your analysis in the appropriate fields. Note that these frequency options are set for the sample job in the Specify other options window:
    • Enter a value of ORDER=FREQ in the Specify other options for PROC FREQ statement field.
    • Enter a value of ChiSqData pchi lrchi n nmiss in the Specify other options for OUTPUT statement field. The OUTPUT statement creates the ChiSqData data set with eight variables: the N option stores the number of non-missing observations; the NMISS option stores the number of missing observations; and the PCHI and LRCHI options store Pearson and likelihood-ratio chi-square statistics, respectively, together with their degrees of freedom and p-values.
    • Select a value of Yes in the Display the "Number of Variables Levels" table (NLEVELS) field.
    These fields are shown in the following display:
    Frequency Options
Note: In the sample job, the COLOR source table is already sorted in ascending order according to the values of the Geographical Region column. The Frequency transformation requires that the source be sorted by the grouping columns. If COLOR is not sorted appropriately, add a SAS Sort transformation to the job before the Frequency transformation.
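Taken together, the options set in the steps above correspond roughly to the following PROC FREQ step. This is a hedged sketch, not the code that SAS Data Integration Studio generates: the WORK library, the Region variable name, the use of Region as the grouping column, and the data values are all assumptions made for illustration.

```sas
/* Hypothetical sample data, already sorted by Region. */
data work.Color;
   input Region Eyes $ Hair $ Count @@;
   datalines;
1 blue fair 23  1 blue dark 11  1 brown fair 7   1 brown dark 40
2 blue fair 46  2 blue dark 40  2 brown fair 18  2 brown dark 68
;
run;

proc freq data=work.Color order=freq nlevels;  /* ORDER=FREQ and NLEVELS options */
   by Region;                                  /* assumed grouping column */
   weight Count;                               /* WEIGHT statement column */
   /* Two one-way tables plus the Eyes-by-Hair crosstabulation. */
   tables Eyes Hair Eyes*Hair / chisq expected cellchi2 norow nocol;
   /* Statistics for the last table request (Eyes*Hair). */
   output out=work.ChiSqData pchi lrchi n nmiss;
run;

/* N, NMISS, _PCHI_, DF_PCHI, P_PCHI, _LRCHI_, DF_LRCHI, P_LRCHI,
   plus the BY variable Region because a BY statement is used. */
proc print data=work.ChiSqData;
run;
```

The OUTPUT statement applies only to the last table request, so placing Eyes*Hair last in the TABLES statement is what ties the ChiSqData statistics to the crosstabulation.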

Configure Reporting Options

Use the remaining option pages to create and save a report based on the analysis conducted in the job. Perform the following steps to set the reporting options:
  1. Click Titles and footnotes to access the Titles and footnotes page and enter up to three headings and two footnotes.
  2. Click ODS options to access the ODS options page. You can choose between HTML, RTF, and PDF output and enter appropriate settings for each. The sample job uses PDF output. When Use PDF is selected in the ODS Result field, new fields are displayed. These include Location, Author, Keywords, Subject, and Additional options for ODS PDF statement. (The path specified in the Location field is relative to the SAS Application Server that executes the job.) These fields are shown in the following display:
    Sample ODS Options
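The reporting options above map onto ordinary TITLE, FOOTNOTE, and ODS PDF statements. The sketch below assumes a hypothetical output file name, hypothetical metadata values, and made-up sample data. The statements that the transformation generates from the Location, Author, Keywords, and Subject fields may be arranged differently.

```sas
/* Hypothetical sample data. */
data work.Color;
   input Eyes $ Hair $ Count @@;
   datalines;
blue fair 69  blue dark 51  brown fair 25  brown dark 108
;
run;

/* A relative path is resolved by the SAS session that runs the code. */
ods pdf file="color_freq.pdf"
        author="Report Author"
        keywords="frequency, crosstabulation"
        subject="Eye color by hair color";

title1 "Frequency of Eye Color by Hair Color";
footnote1 "Sample data for illustration only";

proc freq data=work.Color;
   weight Count;
   tables Eyes*Hair / chisq;
run;

ods pdf close;   /* finish writing the PDF file */
title;           /* clear titles and footnotes */
footnote;
```

The PDF file is not complete until the ODS PDF CLOSE statement runs, which is why the report is opened only after the job finishes.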

Run the Job and View the Output

Perform the following steps to run the job and view the output:
  1. Right-click on an empty area of the job, and click Run in the pop-up menu. SAS Data Integration Studio generates code for the job and submits it to the SAS Application Server for execution. The following display shows a successful run of a sample job:
    Successfully Completed Sample Job
  2. If error messages are displayed on the Status tab, read and respond to the messages as needed.
  3. To view the frequency analysis, click the Output tab in the Job Editor window. The following display shows the analysis for the sample job:
    Sample Output in the Output Tab
  4. To view the target table, right-click the target and select Open. The following display shows the target table data for the sample job:
    Sample Target Table Data
  5. Open the PDF document that the job created in the location that you specified on the ODS options page. Part of the report matches the output of the One-Way Frequency example. The following display shows a portion of the sample report that the One-Way Frequency transformation cannot produce:
    Sample PDF Output