Example: Order Categories in a Bar Chart

To create a bar chart of the category variable in the Hurricanes data set:

  1. Open the Hurricanes data set.

    Note that the column heading for the category variable displays Nom to indicate that the variable is nominal.

  2. Create a bar chart of the category variable.

    The bar chart is shown in Figure 11.7. Note that the first category consists of missing values, and the other categories appear in standard ASCII order.

    Figure 11.7 Standard Ordering of the Category Data
    Standard Ordering of the Category Data

    When you explore data, it is useful to be able to reorder data categories. The next step arranges the bar chart categories according to frequency counts.

  3. Right-click in the data table on the column heading for the category variable. Select Ordering by Frequency, as shown in Figure 11.8.

    Figure 11.8 Ordering by Frequency Count
    Ordering by Frequency Count

    The bar chart automatically updates, as shown in Figure 11.9. Note that the first bar still represents missing values, but that the remaining bars are ordered by their frequency counts. This presentation of the plot makes it easier to compare the relative frequencies of categories.

    Note that the column heading for the category variable now displays Ord to indicate that this variable has a nonstandard ordering.

    Figure 11.9 The Category Data Ordered by Frequency Count
    The Category Data Ordered by Frequency Count

    The next step arranges the bar chart categories according to the data order of the seven nonmissing categories.

  4. Right-click in the data table on the column heading for the category variable. Select Ordering by Data, as shown in Figure 11.10.

    Figure 11.10 Ordering by Data Set Position
    Ordering by Data Set Position

    The bar chart automatically updates, as shown in Figure 11.11. As always, the first bar represents missing values. The TD category is ordered next, because TD is the first nonmissing value for the category variable. The next category is TS, because as you traverse the data (starting from the first observation) the next unique value you encounter is TS (the eighth observation). The remaining categories are Cat1 (the 72nd observation), Cat2 (the 148th observation), Cat3 (the 149th observation), Cat4 (the 155th observation), and Cat5 (the 157th observation).

Figure 11.11 The Category Data Ordered by Data Set Position
The Category Data Ordered by Data Set Position

Arranging values by their data order is sometimes useful when the values are inherently ordered. For example, suppose you have a variable Y with the values Low, Medium, and High. The ASCII order for these categories is {High, Low, Medium}. A plot that displays the categories in this order might be confusing.

To deal with this problem:

  1. Create a numerical indicator variable with the values {1, 2, 3} that corresponds to observations with the values {Low, Medium, High} for Y. The section Custom Transformations describes how to create an indicator variable.

  2. Sort the data by the indicator variable.

  3. Save the sorted data.

  4. Close your workspace.

  5. Open the sorted data.

  6. Right-click the column heading for the variable, and select Ordering by Data.

Plots of the Y variable display the categories in the order {Low, Medium, High}.

Although you can use the previous steps to order any single variable, you might not be able to order multiple variables simultaneously using this method. In that case, you should consult the online Help and read about the DataObject.SetVarValueOrder method.