Techniques for Exploring Data
Example: Order Categories in a Bar Chart
To create a bar chart of the category variable in the Hurricanes data set:
Open the Hurricanes data set.
Note that the column heading for the category variable displays Nom to indicate that the variable is nominal.
Create a bar chart of the category variable.
The bar chart is shown in Figure 11.7. Note that the first category consists of missing values, and the other categories appear in standard linguistic order.
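The default ordering described above can be sketched outside the product. The following Python fragment is only an illustration, not the software's own implementation: the sample values are hypothetical, `None` stands in for a missing value, and plain string sorting is assumed to approximate the standard linguistic order.

```python
# Hypothetical sample of category values; None represents a missing value.
values = ["TS", None, "Cat1", "TD", "TS", "Cat2", None, "TD", "TS"]


def linguistic_order(values):
    """Return the distinct categories: missing first, then sorted as strings."""
    present = sorted({v for v in values if v is not None})
    has_missing = any(v is None for v in values)
    return ([None] if has_missing else []) + present


order = linguistic_order(values)
print(order)  # [None, 'Cat1', 'Cat2', 'TD', 'TS']
```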
When you explore data, it is useful to be able to reorder data categories. The next step arranges the bar chart categories according to frequency counts.
Right-click in the data table on the column heading for the category variable. Select Ordering by Frequency, as shown in Figure 11.8.
Note that the column heading for the category variable now displays Ord to indicate that this variable has a nonstandard ordering.
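As a rough illustration of ordering by frequency, the Python sketch below counts each nonmissing category and ranks the categories by count. The sample values are hypothetical, and two choices are assumptions made for this sketch rather than documented behavior: the most frequent category is placed first, and ties are broken alphabetically.

```python
from collections import Counter

# Hypothetical sample of category values; None represents a missing value.
values = ["TS", None, "Cat1", "TD", "TS", "Cat2", None, "TD", "TS"]


def frequency_order(values, descending=True):
    """Return the distinct categories: missing first, then ranked by count."""
    counts = Counter(v for v in values if v is not None)
    # Assumption: ties are broken alphabetically so the result is deterministic.
    ranked = sorted(
        counts,
        key=lambda c: (-counts[c] if descending else counts[c], c),
    )
    has_missing = any(v is None for v in values)
    return ([None] if has_missing else []) + ranked


order = frequency_order(values)
print(order)  # [None, 'TS', 'TD', 'Cat1', 'Cat2']
```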
The next step arranges the bar chart categories according to the data order of the seven nonmissing categories.
Right-click in the data table on the column heading for the category variable. Select Ordering by Data, as shown in Figure 11.10.
The bar chart automatically updates, as shown in Figure 11.11. As always, the first bar represents missing values. The TD category is ordered next, because TD is the first nonmissing value for the category variable. The next category is TS, because as you traverse the data (starting from the first observation) the next unique value you encounter is TS (the eighth observation). The remaining categories are Cat1 (the 72nd observation), Cat2 (the 148th observation), Cat3 (the 149th observation), Cat4 (the 155th observation), and Cat5 (the 157th observation).
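The traversal described above (record each category the first time it is encountered) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the product's code: the sample values are hypothetical and much shorter than the Hurricanes data, and `None` stands in for a missing value.

```python
# Hypothetical sample of category values; None represents a missing value.
values = [None, "TD", "TD", "TS", "TD", "Cat1", "TS", "Cat2"]


def data_order(values):
    """Return (order, first_seen): categories in order of first appearance,
    with missing first, plus the 1-based observation where each first occurs."""
    first_seen = {}
    for obs, v in enumerate(values, start=1):
        if v is not None and v not in first_seen:
            first_seen[v] = obs
    has_missing = any(v is None for v in values)
    # A dict preserves insertion order, so its keys are already in data order.
    return ([None] if has_missing else []) + list(first_seen), first_seen


order, first_seen = data_order(values)
print(order)       # [None, 'TD', 'TS', 'Cat1', 'Cat2']
print(first_seen)  # {'TD': 2, 'TS': 4, 'Cat1': 6, 'Cat2': 8}
```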
Arranging values by their data order is sometimes useful when the values are inherently ordered. For example, suppose you have a variable Y with the values Low, Medium, and High. The linguistic order for these categories is {High, Low, Medium}. A plot that displays the categories in this order might be confusing.
To deal with this problem:
Create a numerical indicator variable with the values {1, 2, 3} that corresponds to observations with the values {Low, Medium, High} for Y. The section Custom Transformations describes how to create an indicator variable.
Sort the data by the indicator variable.
Save the sorted data.
Close your workspace.
Open the sorted data.
Right-click the column heading for the Y variable, and select Ordering by Data.
Plots of the Y variable display the categories in the order {Low, Medium, High}.
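The logic behind these steps can be sketched in Python. The fragment is a minimal illustration with hypothetical sample values. It does not use the product's interface; it only shows why sorting by an indicator variable makes the data order match the inherent order.

```python
# Hypothetical values of Y in their original (unsorted) data order.
y = ["High", "Low", "Medium", "Low", "High", "Medium"]

# The linguistic order is the one that might be confusing in a plot.
print(sorted(set(y)))  # ['High', 'Low', 'Medium']

# Step 1: numerical indicator variable, 1 = Low, 2 = Medium, 3 = High.
levels = {"Low": 1, "Medium": 2, "High": 3}
indicator = [levels[v] for v in y]

# Step 2: sort the observations by the indicator variable.
rows = sorted(zip(indicator, y), key=lambda r: r[0])
sorted_y = [v for _, v in rows]

# Final step: data order is the order of first appearance in the sorted data.
order = list(dict.fromkeys(sorted_y))
print(order)  # ['Low', 'Medium', 'High']
```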
Although you can use the previous steps to order any single variable, you might not be able to order multiple variables simultaneously using this method, because a single sort of the data fixes the order in which the values of every variable are first encountered. In that case, consult the online Help and read about the DataObject.SetVarValueOrder method.
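The idea of assigning an explicit value order to each variable, independent of how the rows are sorted, can be sketched in Python. This fragment does not call or reproduce the DataObject.SetVarValueOrder method, whose signature is documented in the online Help. The variable names, the sample rows, and the `ordered_counts` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical data with two variables that each have an inherent order.
rows = [
    {"Y": "High", "Size": "Small"},
    {"Y": "Low", "Size": "Large"},
    {"Y": "Medium", "Size": "Small"},
    {"Y": "Low", "Size": "Small"},
]

# Explicit order for each variable, independent of the row order.
value_order = {
    "Y": ["Low", "Medium", "High"],
    "Size": ["Small", "Large"],
}


def ordered_counts(rows, var, order):
    """Return (category, count) pairs for var in the requested order."""
    counts = Counter(r[var] for r in rows)
    return [(c, counts.get(c, 0)) for c in order]


y_counts = ordered_counts(rows, "Y", value_order["Y"])
size_counts = ordered_counts(rows, "Size", value_order["Size"])
print(y_counts)     # [('Low', 2), ('Medium', 1), ('High', 1)]
print(size_counts)  # [('Small', 3), ('Large', 1)]
```

Because each order is stored with its variable, neither ordering depends on how the observations happen to be sorted.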
Copyright © SAS Institute, Inc. All Rights Reserved.