Knowledge Base


TS-604

Producing error bars on a vertical or horizontal bar chart using PROC GCHART

The HBAR and VBAR statements support the following options in Release 6.07 and later of the SAS System.

CLM=confidence-level

The CLM= option draws confidence intervals (error bars) on a horizontal or vertical bar chart with the specified percentage confidence level. Values for confidence-level must be greater than or equal to 50 and less than 100. The default is 95. By default, CLM= draws the intervals using ERRORBAR=BOTH. See the ERRORBAR= option for details on how error bars are computed and drawn.

ERRORBAR=BARS | BOTH | TOP

The ERRORBAR= option draws confidence intervals on a horizontal or vertical bar chart for either:

- The mean of the SUMVAR= variable for each midpoint if you specify TYPE=MEAN.
- The percentage of observations assigned to each midpoint if you specify TYPE=PCT with no SUMVAR= option.

The ERRORBAR= option may not be used with values of the TYPE= option other than MEAN or PCT. Valid values for ERRORBAR= are:

BARS

draws error bars as bars half the width of the main bars.

BOTH

draws error bars as two ticks joined by a line (default).

TOP

draws the error bar as a tick for the upper confidence limit that is joined to the top of the bar by a line.
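As an illustration of how these options fit together, the following is a minimal sketch rather than a definitive program. The data set WORK.TRIAL and the variables TRT and RESPONSE are hypothetical names invented for this example, and the data are simulated with RANNOR.

```sas
/* Hypothetical sample data: 3 treatment groups, 8 responses each. */
DATA WORK.TRIAL;
   DO TRT = 'A', 'B', 'C';
      DO I = 1 TO 8;
         RESPONSE = 50 + 10 * RANNOR(12345);
         OUTPUT;
      END;
   END;
   DROP I;
RUN;

PROC GCHART DATA=WORK.TRIAL;
   /* Mean of RESPONSE per midpoint, 90% interval drawn as two ticks joined by a line. */
   VBAR TRT / TYPE=MEAN SUMVAR=RESPONSE ERRORBAR=BOTH CLM=90;
   /* Percentage of observations per midpoint (no SUMVAR=), upper limit only. */
   HBAR TRT / TYPE=PCT ERRORBAR=TOP;
RUN;
QUIT;
```

Because CLM= is omitted on the HBAR statement, that chart uses the default 95 percent confidence level.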

By default, ERRORBAR= uses a confidence level of 95 percent. You can specify different confidence levels with the CLM= option.

When you use ERRORBAR= with TYPE=PCT, the confidence interval is based on a normal approximation. Let TOTAL be the total number of observations, and PCT be the percentage assigned to a given midpoint.

The standard error of the percentage is approximated as:

APSTDERR = 100 * SQRT( (PCT/100) * (1-(PCT/100)) / TOTAL )

Let LEVEL be the confidence level specified using the CLM= option, with a default value of 95. The upper confidence limit for the percentage is computed as:

UCLP = PCT + APSTDERR * PROBIT( 1-(1-LEVEL/100)/2 )

The lower confidence limit for the percentage is computed as:

LCLP = PCT - APSTDERR * PROBIT( 1-(1-LEVEL/100)/2 )
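The percentage formulas above can be checked by hand in a DATA step. This is a sketch under assumed inputs: the values of TOTAL and PCT are made up for illustration, and Z is a helper variable introduced here, not a name used by the procedure.

```sas
DATA _NULL_;
   TOTAL = 200;   /* hypothetical total number of observations */
   PCT   = 25;    /* hypothetical percentage for one midpoint  */
   LEVEL = 95;    /* default confidence level for CLM=         */
   APSTDERR = 100 * SQRT( (PCT/100) * (1-(PCT/100)) / TOTAL );
   Z    = PROBIT( 1-(1-LEVEL/100)/2 );   /* normal critical value */
   UCLP = PCT + APSTDERR * Z;
   LCLP = PCT - APSTDERR * Z;
   PUT APSTDERR= Z= LCLP= UCLP=;
RUN;
```

With these inputs the approximate standard error is about 3.06 percentage points, so the interval runs from roughly 19 to 31 percent.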

When you use ERRORBAR= with TYPE=MEAN, the sum variable must have at least two non-missing values for each midpoint. If the GROUP= option is used, each midpoint within a group must also have at least two non-missing values.

Let N be the number of observations assigned to a midpoint, MEAN be the mean of those observations, and STD be the standard deviation of the observations. The standard error of the mean is computed as:

STDERR = STD / SQRT(N)

Let LEVEL be the confidence level specified using the CLM= option, with a default value of 95. The upper confidence limit for the mean is computed as:

UCLM = MEAN + STDERR * TINV( 1-(1-LEVEL/100)/2, N-1)

The lower confidence limit for the mean is computed as:

LCLM = MEAN - STDERR * TINV( 1-(1-LEVEL/100)/2, N-1)
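The same kind of hand check works for the mean. The sketch below assumes summary values for a single midpoint. The numbers for N, MEAN, and STD are invented for illustration, and T is a helper variable introduced here.

```sas
DATA _NULL_;
   N     = 10;    /* hypothetical number of observations at the midpoint */
   MEAN  = 50;    /* hypothetical mean of those observations             */
   STD   = 8;     /* hypothetical standard deviation                     */
   LEVEL = 95;    /* default confidence level for CLM=                   */
   STDERR = STD / SQRT(N);
   T    = TINV( 1-(1-LEVEL/100)/2, N-1 );   /* t critical value, N-1 d.f. */
   UCLM = MEAN + STDERR * T;
   LCLM = MEAN - STDERR * T;
   PUT STDERR= T= LCLM= UCLM=;
RUN;
```

With these inputs the standard error is about 2.53, and the 95 percent interval runs from roughly 44.3 to 55.7.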

If you want the error bars to represent a given number C of standard errors instead of a confidence interval, and if the number of observations assigned to each midpoint is the same, then you can find the appropriate value for the CLM= option by running a DATA step.

For example, if you want error bars that represent one standard error (C=1) with a sample size of N, you can run the following DATA step to compute the appropriate value for the CLM= option and assign that value to a macro variable &LEVEL:

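    DATA _NULL_;
    C = 1;    /* number of standard errors the error bars should span */
    N = 10;   /* number of observations assigned to each midpoint */
    /* two-sided confidence level whose t critical value equals C */
    LEVEL = 100 * (1 - 2 * (1 - PROBT( C, N-1)));
    PUT _ALL_;
    /* store the result in the macro variable &LEVEL */
    CALL SYMPUT('LEVEL',PUT(LEVEL,BEST12.));
    RUN;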
    
When you run the GCHART procedure, you can specify CLM=&LEVEL.
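A possible way to use the macro variable is sketched below. It assumes the DATA step that sets &LEVEL has already run in the same session. The data set WORK.EQUALN and the variables TRT and RESPONSE are hypothetical, and the data are simulated so that every midpoint has exactly 10 observations, matching N = 10.

```sas
/* Hypothetical data: 10 observations per midpoint, matching N = 10. */
DATA WORK.EQUALN;
   DO TRT = 'A', 'B', 'C';
      DO I = 1 TO 10;
         RESPONSE = 50 + 10 * RANNOR(6070);
         OUTPUT;
      END;
   END;
   DROP I;
RUN;

PROC GCHART DATA=WORK.EQUALN;
   /* Error bars extend C standard errors on either side of each mean. */
   HBAR TRT / TYPE=MEAN SUMVAR=RESPONSE ERRORBAR=BOTH CLM=&LEVEL;
RUN;
QUIT;
```

Substituting the computed level into the UCLM and LCLM formulas makes the TINV term equal to C, which is why the bars span C standard errors on each side of the mean.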

Note: This does not work precisely if different midpoints have different numbers of observations. However, choosing an average value for N may yield sufficiently accurate results for graphical purposes if the sample sizes are large or do not vary much.