Categorization Plots and Charts

About Categorization Plots and Charts

Categorization plots and charts produce a series of graph elements, one for each selected category of cases. For example, the relationship between age and the risk of a heart attack might differ between males and females. Categorization plots and charts can reveal patterns, complex interactions, exceptions, and anomalies.
You can use the SGPLOT and SGPANEL procedures to produce a variety of categorization plots and charts. The plot and chart statements include many options for controlling how the output is displayed. The options that are available depend on the plot type. The following sections describe each type and the options that are available.
If you run the examples, your output might differ somewhat depending on the size of your graphics. The examples here were sized using the following statement:
ods graphics on / width=4in;

About Bar Charts

Overview of Standard and Parameterized Bar Charts

Bar charts use bars to represent statistics based on the values of a category variable. Bar charts are useful for displaying magnitudes and emphasizing differences.
You can use the SGPLOT and SGPANEL procedures to create the following:
  • horizontal and vertical bar charts that summarize the values of a category variable.
  • parameterized horizontal and vertical bar charts that require a response variable in addition to the category variable. The response variable contains pre-summarized computed values such as a sum or a mean for each unique value of the category variable.
Options are available that enable you to customize both types of bar charts and enhance their appearance. For example, you can do the following:
  • control the visual attributes of the bars, such as bar width, fill color, fill skin, and outlines.
  • add data labels and specify font attributes for the labels.
  • group the data by the values of a variable. A separate plot is created for each unique value of the grouping variable. The plot elements for each group value are automatically distinguished by different visual attributes.
  • control the display of grouped bars. For example, you can specify the width of each cluster.
  • specify an amount to offset graph elements from the category midpoints or from the discrete axis tick marks.
  • specify legend labels, plot transparency, and URLs for Web pages to be displayed when parts of the plot are selected within an HTML page.
  • specify the value of an ID variable in an attribute map data set. You specify this option only if you are using an attribute map to control visual attributes of the graph.
Note: This list does not include all available options.
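As a minimal sketch of how several of these options combine, the following step clusters the bars by gender and adds data labels, a fill skin, and some transparency. The option values (0.8, GLOSS, 0.2) are illustrative choices rather than recommended settings, and the step assumes a release in which these options are available for the VBAR statement.

```sas
proc sgplot data=sashelp.heart;
  vbar smoking_status /
    response=ageatdeath
    stat=mean
    group=sex              /* one set of bars per value of SEX */
    groupdisplay=cluster   /* place the group bars side by side */
    clusterwidth=0.8       /* illustrative: fraction of midpoint spacing per cluster */
    dataskin=gloss         /* illustrative fill skin */
    datalabel              /* label each bar with its statistic */
    transparency=0.2;      /* illustrative: 0 is opaque, 1 is fully transparent */
run;
```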

Bar Chart Examples

The following examples show statistics for different categories of smokers. The examples use the SGPLOT procedure to create a horizontal and a vertical bar chart, respectively. By default, the charts show the frequency for each category. The examples specify an optional response variable to show the average age at death for each category rather than the frequency.
Horizontal bar chart
proc sgplot data=sashelp.heart;
  hbar smoking_status /
    response=ageatdeath
    stat=mean;
run;
Vertical bar chart
proc sgplot data=sashelp.heart;
  vbar smoking_status /
    response=ageatdeath
    stat=mean;
run;
The following two examples use the SGPANEL procedure to create a horizontal and a vertical chart, respectively. The bar charts are paneled by gender.
Horizontal bar chart panel
proc sgpanel data=sashelp.heart;
  panelby sex;
  hbar smoking_status /
    response=ageatdeath
    stat=mean;
run;
Vertical bar chart panel
proc sgpanel data=sashelp.heart;
  panelby sex;
  vbar smoking_status /
    response=ageatdeath
    stat=mean;
run;
Bar charts include options that are not applicable to parameterized bar charts. For example, you can do the following:
  • specify the response variable and the statistic to use for its axis
  • specify the order in which the response values are arranged
  • show limit lines, specify the statistic to use for the limit lines, and specify the confidence level
  • for grouped data, specify whether the bars are stacked or clustered
  • specify how many times observations are repeated for computational purposes
Note: This list does not include all available options.
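The options in this list can be sketched as follows. The step orders the categories by descending response value and adds confidence limits of the mean. The 95% confidence level (ALPHA=0.05) is an illustrative choice.

```sas
proc sgplot data=sashelp.heart;
  hbar smoking_status /
    response=ageatdeath
    stat=mean
    categoryorder=respdesc   /* arrange bars by descending response value */
    limits=both              /* draw both the upper and the lower limit lines */
    limitstat=clm            /* limits are confidence limits of the mean */
    alpha=0.05;              /* illustrative: 95% confidence level */
run;
```

Limit lines apply to the mean statistic, so this sketch keeps STAT=MEAN and does not use a group variable.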

Parameterized Bar Chart Examples

The following examples show height averages for a class of students. The averages are obtained using the following program.
proc means data=sashelp.class alpha=.05 clm mean std;
  class age sex;
  var height;
  output out=classMean uclm=uclm lclm=lclm mean=mean;
run;
The following two examples use the SGPLOT procedure to create a horizontal and a vertical chart, respectively. The response variable contains the computed mean values that were created with the MEANS procedure.
Parameterized horizontal bar chart
proc sgplot data=classMean;
  hbarparm category=age response=mean;
run;
Parameterized vertical bar chart
proc sgplot data=classMean;
  vbarparm category=age response=mean;
run;
The following two examples use the SGPANEL procedure to create horizontal and vertical bar charts, respectively. The charts are paneled by gender.
Parameterized horizontal bar chart panel
proc sgpanel data=classMean;
  panelby sex;
  hbarparm category=age response=mean;
run;
Parameterized vertical bar chart panel
proc sgpanel data=classMean;
  panelby sex;
  vbarparm category=age response=mean;
run;
You can also assign variables to the upper and lower limits of the bar chart. Parameterized bar charts enable you to pass in your own precomputed limits.
Parameterized bar chart with CLM limits
proc sgplot data=classMean;
  hbarparm category=age response=mean /
    limitlower=lclm
    limitupper=uclm;
run;
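Because the MEANS step shown earlier does not specify the NWAY option, the classMean data set contains summary rows for every combination of the class variables, distinguished by the automatic _TYPE_ variable. With CLASS AGE SEX, _TYPE_=2 identifies the rows summarized by age alone and _TYPE_=3 identifies the age-by-sex rows. If you want each chart to use exactly one row per bar, one possible approach (a sketch, not the only way to subset the data) is a WHERE= data set option:

```sas
/* One bar per age: keep only the rows summarized by AGE alone */
proc sgplot data=classMean(where=(_type_=2));
  vbarparm category=age response=mean /
    limitlower=lclm
    limitupper=uclm;
run;

/* One bar per age within each panel: keep the AGE-by-SEX rows */
proc sgpanel data=classMean(where=(_type_=3));
  panelby sex;
  vbarparm category=age response=mean;
run;
```

Confidence limits are missing for any cell that contains a single student, so some bars in the paneled chart might have no limits to display if limit options are added there.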

See Also

HBAR Statement (SGPANEL procedure)
VBAR Statement (SGPANEL procedure)
HBAR Statement (SGPLOT procedure)
VBAR Statement (SGPLOT procedure)
HBARPARM Statement (SGPANEL procedure)
VBARPARM Statement (SGPANEL procedure)
HBARPARM Statement (SGPLOT procedure)
VBARPARM Statement (SGPLOT procedure)

About Dot Plots

Dot plots summarize horizontally the values of a category variable. By default, each dot represents the frequency for each value of the category variable.
The following examples show the frequency of different weights of patients in a study. The examples use the SGPLOT and the SGPANEL procedures.
Dot plot
proc sgplot data=sashelp.heart;
  dot weight;
run;
Dot panel
proc sgpanel data=sashelp.heart;
  panelby sex;
  dot weight;
run;
Options are available that enable you to customize the dot plot and enhance its appearance. For example, you can do the following:
  • specify an optional response variable and show the mean, the sum, or the frequency for that variable. You can also specify the order in which the response values are arranged.
  • show limits for the plot. You can also specify the statistic for the limit lines and visual attributes of the lines.
  • specify the color, size, and symbol for the markers.
  • add data labels and specify font attributes for the labels.
  • control the display of grouped markers, lines, and bars. For example, you can specify whether the groups are overlaid or clustered, and the ordering of dots within a group.
  • specify an amount to offset graph elements from the category midpoints or from the discrete axis tick marks.
  • specify legend labels, plot transparency, and URLs for Web pages to be displayed when parts of the plot are selected within an HTML page.
Note: This list does not include all available options.
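As a sketch of the response, limit, and marker options, the following step plots mean cholesterol for each smoking category with limits of one standard deviation. The choice of CHOLESTEROL as the response variable and the marker attribute values are illustrative assumptions.

```sas
proc sgplot data=sashelp.heart;
  dot smoking_status /
    response=cholesterol     /* illustrative response variable */
    stat=mean
    limitstat=stddev         /* limits based on the standard deviation */
    numstd=1                 /* illustrative: one standard deviation each side */
    markerattrs=(symbol=circlefilled size=9 color=darkblue);
run;
```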

See Also

DOT Statement (SGPANEL procedure)
DOT Statement (SGPLOT procedure)

About Line Charts

Line charts display information as a series of data points connected by straight line segments. The SGPLOT and the SGPANEL procedures have separate statements for creating horizontal and vertical line charts.
The following examples show mean height values for a class. Examples are provided for the SGPLOT and the SGPANEL procedures. The examples specify an optional response variable and use the mean statistic for that variable. The examples also add data point markers.
These two examples use the SGPLOT procedure to create a horizontal and a vertical chart, respectively.
Horizontal line chart
proc sgplot data=sashelp.class;
  hline age /
    response=height
    stat=mean
    markers;
run;
Vertical line chart
proc sgplot data=sashelp.class;
  vline age /
    response=height
    stat=mean
    markers;
run;
The following two examples use the SGPANEL procedure to create panels of horizontal and vertical charts, respectively.
Horizontal line chart panel
proc sgpanel data=sashelp.class;
  panelby sex;
  hline age /
    response=height
    stat=mean
    markers;
run;
Vertical line chart panel
proc sgpanel data=sashelp.class;
  panelby sex;
  vline age /
    response=height
    stat=mean
    markers;
run;
Options are available that enable you to customize the line chart and enhance its appearance. For example, you can do the following:
  • specify an optional response variable and show the mean, the sum, or the frequency for that variable. You can also specify the order in which the response values are arranged.
  • show limits for the chart. You can also specify the statistic for the limit lines and visual attributes of the lines.
  • add data point markers and specify the color, size, and symbol for the markers.
  • add curve and data labels and specify font attributes for the labels.
  • control the display of grouped lines. For example, you can specify whether the groups are overlaid or clustered, the width of each cluster, and the ordering of lines within a group.
  • specify an amount to offset graph elements from the category midpoints or from the discrete axis tick marks.
  • specify legend labels, plot transparency, and URLs for Web pages to be displayed when parts of the plot are selected within an HTML page.
  • assign the category variable, the response variable, or both variables to the secondary axis (X2 or Y2). This option is available only for the SGPLOT procedure.
  • specify the value of an ID variable in an attribute map data set. You specify this option only if you are using an attribute map to control visual attributes of the graph.
Note: This list does not include all available options.
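The secondary-axis option can be sketched with two VLINE statements that share the category variable. Here mean height uses the primary Y axis and mean weight uses the Y2 axis, so that each response keeps its own scale. The legend labels and the dashed line pattern are illustrative choices.

```sas
proc sgplot data=sashelp.class;
  /* Mean height on the primary (left) Y axis */
  vline age /
    response=height
    stat=mean
    markers
    legendlabel="Mean height";
  /* Mean weight on the secondary (right) Y2 axis */
  vline age /
    response=weight
    stat=mean
    markers
    y2axis
    lineattrs=(pattern=dash)
    legendlabel="Mean weight";
run;
```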

See Also

HLINE Statement (SGPANEL procedure)
VLINE Statement (SGPANEL procedure)
HLINE Statement (SGPLOT procedure)
VLINE Statement (SGPLOT procedure)

About Waterfall Charts (Preproduction)

Waterfall charts show how the value of a variable increases or decreases until it reaches a final value. In the chart, bars represent an initial value of Y and a series of intermediate values identified by X leading to a final value of Y. Waterfall charts are available only for the SGPLOT procedure.
The following example shows average failure counts for capacitors.
Waterfall chart
proc sgplot data=sashelp.failure;
  waterfall category=cause response=count
   / stat=mean;
run;
Options are available that enable you to customize the waterfall chart and enhance its appearance. For example, you can do the following:
  • specify the statistic for the response variable.
  • specify an initial bar for the chart. You can also specify the tick value that is used for the initial bar and visual attributes of the bar.
  • control the appearance of the bars. For example, you can do the following:
    • show or hide the bar outline
    • show or hide the bar fill
    • use a special effect (data skin) for the fill
    • specify a variable to use for the bar colors
    • specify attributes separately for the final bar
  • add data labels and specify font attributes for the labels.
  • specify plot transparency and URLs for Web pages to be displayed when parts of the plot are selected within an HTML page.
Note: This list does not include all available options.
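A minimal sketch of some of these options follows. It adds data labels and a data skin and gives the final bar its own fill color. Because the WATERFALL statement is preproduction, option names and behavior might differ in your release. The SHEEN skin and the gray color are illustrative values.

```sas
proc sgplot data=sashelp.failure;
  waterfall category=cause response=count /
    stat=mean
    datalabel                     /* label each bar with its value */
    dataskin=sheen                /* illustrative special effect for the fill */
    finalbarattrs=(color=gray);   /* illustrative fill color for the final bar */
run;
```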

See Also