GCHART Procedure

About Chart Statistics

Definition: Chart Statistics

The chart statistic is the statistical value calculated for the chart variable and represented by each block, bar, or slice. The GCHART procedure calculates six chart statistics; the default statistic is frequency.
The examples given in the descriptions of these statistics assume a data set with two variables, CITY and SALES. The values of CITY are Denver, Seattle, and Tokyo. There are 21 observations: seven for Denver, nine for Seattle, and five for Tokyo.

Frequency

The frequency statistic is the total number of observations in the data set for each midpoint. For example, seven observations of the chart variable, CITY, contain the value Denver, so the frequency for the Denver midpoint is 7.

Cumulative Frequency

The cumulative frequency statistic adds the frequency for the current midpoint to the frequency of all of the preceding midpoints. For example, the frequency for the Denver midpoint is 7, and the frequency for the next midpoint, Seattle, is 9, so the cumulative frequency for Seattle is 16.
You cannot request cumulative frequency with the DONUT, PIE, PIE3D, or STAR statements.

Percentage

The percentage statistic is calculated by dividing the frequency for each midpoint by the total frequency count for all midpoints in the chart or group and multiplying it by 100. For example, the frequency count for the Denver midpoint is 7 and the total frequency count for the chart is 21, so the percentage statistic for Denver is 33.3%.

Cumulative Percentage

The cumulative percentage statistic adds the percentage for the current midpoint to the percentage for all of the preceding midpoints in the chart or group. For example, the percentage for the Denver midpoint is 33.3, and the percentage for the next midpoint, Seattle, is 42.9, so the cumulative percentage for Seattle is 76.2.
You cannot request cumulative percentage with the DONUT, PIE, PIE3D, or STAR statements.

Sum

The sum statistic is the total of the values for the SUMVAR= variable for each midpoint. For example, if you specify SUMVAR=SALES, and the values of the SALES variable for the seven Denver observations are 8734, 982, 1504, 3207, 4502, 624, and 918, then the sum statistic for the Denver midpoint is 20,471.
You must use the SUMVAR= option to specify the variable for which you want the sum statistic.

Mean

The mean statistic is the average of the values for the SUMVAR= variable for each midpoint. For example, if TYPE=MEAN and SUMVAR=SALES, the mean statistic for the Denver midpoint is 2924.42.
You must use the SUMVAR= option to specify the variable for which you want the mean statistic.

Calculating Weighted Statistics

By default, each observation is counted only once in the calculation of the chart statistic. To calculate weighted statistics in which an observation can be counted more than once, use the FREQ= option. This option identifies a variable whose values are used as a multiplier for the observation in the calculation of the statistic. If the value of the FREQ= variable is missing, 0, or negative, the observation is excluded from the calculation.
If you use the SUMVAR= option, then the SUMVAR= variable value for an observation is multiplied by the FREQ= variable value for that observation when calculating the chart statistic.
For example, to use a variable called COUNT to produce weighted statistics, assign FREQ=COUNT. If you also assign the variable HEIGHT to the SUMVAR= option, then the following table shows how the values of COUNT and HEIGHT would affect the statistic calculation:
Value of COUNT
Value of HEIGHT
Number of Times the Observation is Used
Value Used for HEIGHT
1
55
1
55
5
65
5
325
.
63
0
-
-3
60
0
-
By default, the percentage and cumulative percentage statistics are calculated based on the frequency. If you want to chart a percentage or cumulative percentage based on a sum, you can use the FREQ= option to specify a variable to use for the “sum” calculation. You can also specify the PCT statistic, as shown in this example:
freq=count type=pct
Because the variable that is used by the FREQ= option determines the number of times an observation is counted, the value of COUNT is the equivalent of the sum statistic.
See also the descriptions of the TYPE=, SUMVAR=, and FREQ= options for the action statements.