GCHART Procedure

Understanding Midpoints

About Midpoints

Midpoints are the values of the chart variable that identify categories of data. By default, midpoints are selected or calculated by the procedure. The way the procedure handles the midpoints depends on whether the values of the chart variable are character, discrete numeric, or continuous numeric.

Character Values

A character chart variable generates a midpoint for each unique value of the variable. For example, if the chart variable CITY contains the names of three different cities, each city is a midpoint, resulting in three midpoints for the chart:
Figure: Character Midpoints (simple vertical bar chart with character midpoints)
(In pie charts, midpoint values that make up a small percentage of the chart total might be placed in the OTHER slice instead of producing a separate midpoint.)
By default, character midpoints are arranged in alphabetic order. If a character variable has an associated format, the values are arranged in order of the formatted values.
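As a minimal sketch of the behavior described above, the following step charts a character variable. The data set WORK.CITYSALES, the variable SALES, and all data values are assumptions made up for illustration; only the CITY variable comes from the text.

```sas
/* Hypothetical data: data set name, SALES variable, and values are assumed. */
data work.citysales;
   input city $ sales;
   datalines;
Tokyo 120
Denver 80
Seattle 95
Denver 40
;
run;

/* CITY is a character variable, so each unique value (Denver, Seattle,  */
/* Tokyo) becomes one midpoint, arranged alphabetically by default.      */
proc gchart data=work.citysales;
   vbar city / sumvar=sales;
run;
quit;
```

The two Denver observations contribute to the same midpoint, because midpoints are formed from unique values rather than from individual observations.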

Discrete Numeric Values

A numeric chart variable used with the DISCRETE option generates a midpoint for each unique value of the chart variable. For example, the numeric variable YEAR used with the DISCRETE option produces one midpoint for each year:
Figure: Discrete Numeric Midpoints (vertical bar chart using discrete numeric midpoints)
By default, numeric midpoints are arranged in ascending order. The DISCRETE option is especially useful for dates and for numeric values that have user-defined formats that display text. If the numeric variable has an associated format, each formatted value generates a separate midpoint. Formatted numeric variables are arranged in ascending order according to their unformatted numeric values.
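The following sketch shows one way the DISCRETE option might be used with the YEAR variable from the example. The data set WORK.YEARSALES, the variable SALES, and the data values are hypothetical.

```sas
/* Hypothetical data: data set name, SALES variable, and values are assumed. */
data work.yearsales;
   input year sales;
   datalines;
2001 100
2002 150
2003 130
2002 20
;
run;

/* DISCRETE requests one midpoint per unique value of YEAR (2001, 2002,   */
/* 2003) instead of treating YEAR as a continuous variable with ranges.   */
proc gchart data=work.yearsales;
   vbar year / discrete sumvar=sales;
run;
quit;
```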

Continuous Numeric Values

A continuous numeric variable generates midpoints that represent ranges of values. By default, the GCHART procedure determines the ranges, calculates the median value of each range, and displays the appropriate median value at each midpoint on the chart. A value that falls exactly halfway between two midpoints is placed in the higher range.
For example, the numeric variable AGE produces four midpoints, each of which represents a ten-year age range; the median value of the range is displayed at each midpoint:
Figure: Continuous Numeric Midpoints (vertical bar chart generated from continuous numeric data)
By default, midpoints of ranges are arranged in ascending order.

Selecting and Ordering Midpoints

For character or discrete numeric values, you can use the MIDPOINTS= option to rearrange the midpoints or to exclude midpoints from the chart. For example, to change the default alphabetic order of the midpoints shown in the Character Midpoints figure, specify the following:
midpoints="Tokyo" "Denver" "Seattle"
To exclude the midpoint for Denver, specify the following:
midpoints="Tokyo" "Seattle"
In this case, values excluded by the option are not included in the calculation of the chart statistic.
You can order or select discrete numeric midpoint values just as you do character values, but you omit the quotation marks when specifying numeric values.
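To put these option fragments in context, here is a sketch of complete steps that reorder and exclude midpoints. The data set WORK.CITYSALES, the variables YEAR and SALES, and the data values are assumptions for illustration; the MIDPOINTS= lists for CITY are the ones given above.

```sas
/* Hypothetical data: data set name, YEAR and SALES variables, and values */
/* are assumed.                                                           */
data work.citysales;
   input city $ year sales;
   datalines;
Tokyo 2001 120
Denver 2002 80
Seattle 2003 95
Denver 2001 40
;
run;

proc gchart data=work.citysales;
   /* Reorder the character midpoints instead of using alphabetic order. */
   vbar city / sumvar=sales midpoints="Tokyo" "Denver" "Seattle";
run;
   /* Exclude Denver: its observations do not enter the chart statistic. */
   vbar city / sumvar=sales midpoints="Tokyo" "Seattle";
run;
   /* Discrete numeric values use the same syntax without quotation marks. */
   vbar year / discrete sumvar=sales midpoints=2003 2001 2002;
run;
quit;
```

Each VBAR statement followed by RUN produces its own chart, so the three variations can be compared from a single procedure invocation.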
For continuous numeric variables, use the LEVELS= or MIDPOINTS= option to do the following:
  • change the number of midpoints
  • control the range of values each midpoint represents
  • change the order of the midpoints
To control the range of values each midpoint represents, use the MIDPOINTS= option to specify the median value of each range. For example, to select the ranges 20–29, 30–39, and 40–49, specify the following:
midpoints=25 35 45
Alternatively, to select the number of midpoints that you want and let the procedure calculate the ranges and medians, use the LEVELS= option.
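The two approaches can be sketched side by side as follows. The data set WORK.PEOPLE and its values are hypothetical; AGE and the MIDPOINTS= list come from the text, and LEVELS=4 is chosen here only as an example value.

```sas
/* Hypothetical data: data set name and AGE values are assumed. */
data work.people;
   input age @@;
   datalines;
21 24 27 30 33 35 38 41 44 47
;
run;

proc gchart data=work.people;
   /* Explicit midpoints: three bars labeled 25, 35, and 45.              */
   /* Per the rule stated earlier, AGE=30 lies exactly halfway between    */
   /* 25 and 35 and is therefore counted in the higher range.             */
   vbar age / midpoints=25 35 45;
run;
   /* Let the procedure calculate ranges and midpoints for four levels. */
   vbar age / levels=4;
run;
quit;
```

No SUMVAR= option is specified, so each bar shows the default frequency statistic, that is, the number of observations in the range.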
You can also use formats to control the ranges of continuous numeric variables. In that case, however, the values are treated as discrete rather than continuous.
Note: You cannot use the MIDPOINTS= option to exclude continuous numeric values from the chart. Values below or above the ranges specified by the option are automatically included in the first and last midpoints, respectively. To exclude continuous numeric values from a chart, use a WHERE statement in a DATA step or the WHERE= data set option.
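As a sketch of the exclusion technique mentioned in the note, the WHERE= data set option can filter observations before the procedure reads them. The data set WORK.PEOPLE, its values, and the cutoff ages are assumptions for illustration.

```sas
/* Hypothetical data: data set name and AGE values are assumed. */
data work.people;
   input age @@;
   datalines;
12 18 21 27 33 38 41 47 55 63
;
run;

/* Without the WHERE= option, ages 12 and 18 would be counted in the      */
/* first midpoint (25) and ages 55 and 63 in the last midpoint (45).      */
/* The WHERE= data set option removes them before the chart is built.     */
proc gchart data=work.people(where=(age >= 20 and age < 50));
   vbar age / midpoints=25 35 45;
run;
quit;
```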
See also the description of the LEVELS= and MIDPOINTS= options for the appropriate statement.