The ANOVA Procedure

MEANS Statement

MEANS effects </ options> ;

PROC ANOVA can compute means of the dependent variables for any effect that appears on the right-hand side in the MODEL statement.

You can use any number of MEANS statements, provided that they appear after the MODEL statement. For example, suppose A and B each have two levels. Then, if you use the following statements

proc anova;
   class A B;
   model Y=A B A*B;
   means A B / tukey;
   means A*B;
run;

means, standard deviations, and Tukey’s multiple comparison tests are produced for each level of the main effects A and B, and just the means and standard deviations for each of the four combinations of levels for A*B. Since multiple comparisons options apply only to main effects, the single MEANS statement

means A B A*B / tukey;

produces the same results.

The multiple comparison options apply only to main effects in the model. PROC ANOVA does not perform multiple comparison tests for interaction terms; for multiple comparisons of interaction terms, see the LSMEANS statement in Chapter 42: The GLM Procedure.

Table 25.4 summarizes the options available in the MEANS statement.

Table 25.4: Options Available in the MEANS Statement

Option        Description

Perform multiple comparison tests

BON           Performs Bonferroni t tests of differences between means for all main effect means
DUNCAN        Performs Duncan’s multiple range test on all main effect means
DUNNETT       Performs Dunnett’s two-tailed t test
DUNNETTL      Performs Dunnett’s one-tailed t test, testing if any treatment is significantly less than the control
DUNNETTU      Performs Dunnett’s one-tailed t test, testing if any treatment is significantly greater than the control
GABRIEL       Performs Gabriel’s multiple-comparison procedure on all main effect means
REGWQ         Performs the Ryan-Einot-Gabriel-Welsch multiple range test
SCHEFFE       Performs Scheffé’s multiple-comparison procedure
SIDAK         Performs pairwise t tests on differences between means with levels adjusted according to Sidak’s inequality
SMM or GT2    Performs pairwise comparisons based on the studentized maximum modulus and Sidak’s uncorrelated-t inequality
SNK           Performs the Student-Newman-Keuls multiple range test
T or LSD      Performs pairwise t tests
TUKEY         Performs Tukey’s studentized range test (HSD)
WALLER        Performs the Waller-Duncan k-ratio t test

Specify additional details for multiple comparison tests

ALPHA=        Specifies the level of significance for comparisons among the means
CLDIFF        Presents results of options as confidence intervals for differences between means
CLM           Presents results of options as intervals for the mean of each level of the variables specified
E=            Specifies the error mean square used in the multiple comparisons
KRATIO=       Specifies the Type 1/Type 2 error seriousness ratio for the Waller-Duncan test
LINES         Presents results of options by listing the means in descending order and indicating nonsignificant subsets by line segments
NOSORT        Prevents the means from being sorted into descending order

Test for homogeneity of variances

HOVTEST       Requests a homogeneity of variance test

Compensate for heterogeneous variances

WELCH         Requests the Welch (1951) variance-weighted one-way ANOVA


Descriptions of these options follow. For a further discussion of these options, see the section Multiple Comparisons in Chapter 42: The GLM Procedure.

ALPHA=p

specifies the level of significance for comparisons among the means. By default, ALPHA=0.05. You can specify any value greater than 0 and less than 1.
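As a minimal sketch of how ALPHA= might be combined with a comparison option, the following statements request Tukey comparisons at the 0.01 level instead of the default 0.05. The data set name Trial and the variables A and Y are hypothetical and are used only for illustration.

```sas
/* Trial, A, and Y are hypothetical names used for illustration */
proc anova data=Trial;
   class A;
   model Y = A;
   /* Tukey comparisons at the 0.01 significance level */
   means A / tukey alpha=0.01;
run;
```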

BON

performs Bonferroni t tests of differences between means for all main effect means in the MEANS statement. See the CLDIFF and LINES options, which follow, for a discussion of how the procedure displays results.

CLDIFF

presents results of the BON, GABRIEL, SCHEFFE, SIDAK, SMM, GT2, T, LSD, and TUKEY options as confidence intervals for all pairwise differences between means, and the results of the DUNNETT, DUNNETTU, and DUNNETTL options as confidence intervals for differences with the control. The CLDIFF option is the default for unequal cell sizes unless the DUNCAN, REGWQ, SNK, or WALLER option is specified.

CLM

presents results of the BON, GABRIEL, SCHEFFE, SIDAK, SMM, T, and LSD options as intervals for the mean of each level of the variables specified in the MEANS statement. For all options except GABRIEL, the intervals are confidence intervals for the true means. For the GABRIEL option, they are comparison intervals for comparing means pairwise: in this case, if the intervals corresponding to two means overlap, the difference between them is not significant according to Gabriel’s method.
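The difference between the CLDIFF and CLM displays can be illustrated with a short sketch. The first MEANS statement below asks for Bonferroni intervals on pairwise differences, and the second asks for Bonferroni intervals on the individual level means. The data set Trial and the variables A and Y are hypothetical.

```sas
/* Trial, A, and Y are hypothetical names used for illustration */
proc anova data=Trial;
   class A;
   model Y = A;
   /* Intervals for all pairwise differences between level means */
   means A / bon cldiff;
   /* Intervals for the mean of each level of A */
   means A / bon clm;
run;
```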

DUNCAN

performs Duncan’s multiple range test on all main effect means given in the MEANS statement. See the LINES option for a discussion of how the procedure displays results.

DUNNETT <(formatted-control-values)>

performs Dunnett’s two-tailed t test, testing if any treatments are significantly different from a single control, for all main effect means in the MEANS statement.

To specify which level of the effect is the control, enclose the formatted value in quotes in parentheses after the keyword. If more than one effect is specified in the MEANS statement, you can use a list of control values within the parentheses. By default, the first level of the effect is used as the control. For example,

means a / dunnett('CONTROL');

where CONTROL is the formatted control value of A. As another example,

means a b c / dunnett('CNTLA' 'CNTLB' 'CNTLC');

where CNTLA, CNTLB, and CNTLC are the formatted control values for A, B, and C, respectively.

DUNNETTL <(formatted-control-value)>

performs Dunnett’s one-tailed t test, testing if any treatment is significantly less than the control. Control level information is specified as described previously for the DUNNETT option.

DUNNETTU <(formatted-control-value)>

performs Dunnett’s one-tailed t test, testing if any treatment is significantly greater than the control. Control level information is specified as described previously for the DUNNETT option.
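The one-tailed variants take the control level in the same way as DUNNETT. As a sketch, assuming a hypothetical data set DoseStudy in which the classification variable Dose has a level whose formatted value is Placebo, the following statements test whether any dose has a significantly greater mean response than the placebo group.

```sas
/* DoseStudy, Dose, Response, and the level 'Placebo' are hypothetical */
proc anova data=DoseStudy;
   class Dose;
   model Response = Dose;
   /* One-tailed Dunnett test: is any level greater than the control? */
   means Dose / dunnettu('Placebo');
run;
```

Replacing DUNNETTU with DUNNETTL would instead test whether any level is significantly less than the control.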

E=effect

specifies the error mean square used in the multiple comparisons. By default, PROC ANOVA uses the residual Mean Square (MS). The effect specified with the E= option must be a term in the model; otherwise, the procedure uses the residual MS.
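One setting where E= is typically useful is a split-plot layout, where a whole-plot factor is compared against a whole-plot error term rather than the residual. The following is a sketch only; the data set SplitPlot and the variables Block, A, B, and Y are hypothetical, and the choice of Block*A as the error term assumes that A is the whole-plot factor.

```sas
/* SplitPlot, Block, A, B, and Y are hypothetical names */
proc anova data=SplitPlot;
   class Block A B;
   model Y = Block A Block*A B A*B;
   /* Compare levels of A by using the Block*A mean square as error */
   means A / tukey e=Block*A;
   /* Compare levels of B by using the default residual mean square */
   means B / tukey;
run;
```

Because Block*A appears in the MODEL statement, it satisfies the requirement that the E= effect be a term in the model.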

GABRIEL

performs Gabriel’s multiple-comparison procedure on all main effect means in the MEANS statement. See the CLDIFF and LINES options for discussions of how the procedure displays results.

GT2

see the SMM option.

HOVTEST

HOVTEST=BARTLETT
HOVTEST=BF
HOVTEST=LEVENE <(TYPE=ABS | SQUARE)>
HOVTEST=OBRIEN <(W=number )>

requests a homogeneity of variance test for the groups defined by the MEANS effect. You can optionally specify a particular test; if you do not specify a test, Levene’s test (Levene, 1960) with TYPE=SQUARE is computed. Note that this option is ignored unless your MODEL statement specifies a simple one-way model.

The HOVTEST=BARTLETT option specifies Bartlett’s test (Bartlett, 1937), a modification of the normal-theory likelihood ratio test.

The HOVTEST=BF option specifies Brown and Forsythe’s variation of Levene’s test (Brown and Forsythe, 1974).

The HOVTEST=LEVENE option specifies Levene’s test (Levene, 1960), which is widely considered to be the standard homogeneity of variance test. You can use the TYPE= option in parentheses to specify whether to use the absolute residuals (TYPE=ABS) or the squared residuals (TYPE=SQUARE) in Levene’s test. The default is TYPE=SQUARE.

The HOVTEST=OBRIEN option specifies O’Brien’s test (O’Brien, 1979), which is basically a modification of HOVTEST=LEVENE(TYPE=SQUARE). You can use the W= option in parentheses to tune the variable to match the suspected kurtosis of the underlying distribution. By default, W=0.5, as suggested by O’Brien (1979, 1981).

See the section Homogeneity of Variance in One-Way Models in Chapter 42: The GLM Procedure, for more details on these methods. Example 42.10 in the same chapter illustrates the use of the HOVTEST and WELCH options in the MEANS statement in testing for equal group variances.
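As a brief sketch of the syntax forms listed above, the following statements fit a one-way model and request homogeneity of variance tests. The data set Scores and the variables Group and Y are hypothetical. Each MEANS statement requests one test.

```sas
/* Scores, Group, and Y are hypothetical names used for illustration */
proc anova data=Scores;
   class Group;
   model Y = Group;   /* HOVTEST requires a simple one-way model */
   /* Levene's test on absolute residuals */
   means Group / hovtest=levene(type=abs);
   /* Brown and Forsythe's variation, plus Welch's ANOVA */
   means Group / hovtest=bf welch;
run;
```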

KRATIO=value

specifies the Type 1/Type 2 error seriousness ratio for the Waller-Duncan test. Reasonable values for KRATIO are 50, 100, and 500, which roughly correspond for the two-level case to ALPHA levels of 0.1, 0.05, and 0.01, respectively. By default, KRATIO=100.
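For instance, a more conservative Waller-Duncan test might be requested as in the following sketch, where KRATIO=500 roughly corresponds to a 0.01 level in the two-level case. The data set Trial and the variables A and Y are hypothetical.

```sas
/* Trial, A, and Y are hypothetical names used for illustration */
proc anova data=Trial;
   class A;
   model Y = A;
   /* Waller-Duncan k-ratio t test with a seriousness ratio of 500 */
   means A / waller kratio=500;
run;
```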

LINES

presents results of the BON, DUNCAN, GABRIEL, REGWQ, SCHEFFE, SIDAK, SMM, GT2, SNK, T, LSD, TUKEY, and WALLER options by listing the means in descending order and indicating nonsignificant subsets by line segments beside the corresponding means. The LINES option is appropriate for equal cell sizes, for which it is the default. The LINES option is also the default if the DUNCAN, REGWQ, SNK, or WALLER option is specified, or if there are only two cells of unequal size. If the cell sizes are unequal, the harmonic mean of the cell sizes is used, which might lead to somewhat liberal tests if the cell sizes are highly disparate. The LINES option cannot be used in combination with the DUNNETT, DUNNETTL, or DUNNETTU option. In addition, the procedure has a restriction that no more than 24 overlapping groups of means can exist. If a mean belongs to more than 24 groups, the procedure issues an error message. In that case, you can either reduce the number of levels of the variable or use a multiple comparison test that allows the CLDIFF option rather than the LINES option.
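As a sketch, the LINES display can be requested explicitly when it would not be the default, for example with Tukey’s test on cells of unequal size. The data set Trial and the variables A and Y are hypothetical. As noted above, the harmonic mean of the cell sizes is used in this situation, so the result should be read with that approximation in mind.

```sas
/* Trial, A, and Y are hypothetical names used for illustration */
proc anova data=Trial;
   class A;
   model Y = A;
   /* Request the line-segment display instead of confidence intervals */
   means A / tukey lines;
run;
```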

LSD

see the T option.

NOSORT

prevents the means from being sorted into descending order when the CLDIFF or CLM option is specified.
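A minimal sketch of NOSORT with the CLDIFF display follows. The data set Trial and the variables A and Y are hypothetical.

```sas
/* Trial, A, and Y are hypothetical names used for illustration */
proc anova data=Trial;
   class A;
   model Y = A;
   /* Pairwise t-test intervals, without sorting the means */
   means A / t cldiff nosort;
run;
```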

REGWQ

performs the Ryan-Einot-Gabriel-Welsch multiple range test on all main effect means in the MEANS statement. See the LINES option for a discussion of how the procedure displays results.

SCHEFFE

performs Scheffé’s multiple-comparison procedure on all main effect means in the MEANS statement. See the CLDIFF and LINES options for discussions of how the procedure displays results.

SIDAK

performs pairwise t tests on differences between means with levels adjusted according to Sidak’s inequality for all main effect means in the MEANS statement. See the CLDIFF and LINES options for discussions of how the procedure displays results.

SMM
GT2

performs pairwise comparisons based on the studentized maximum modulus and Sidak’s uncorrelated-t inequality, yielding Hochberg’s GT2 method when sample sizes are unequal, for all main effect means in the MEANS statement. See the CLDIFF and LINES options for discussions of how the procedure displays results.

SNK

performs the Student-Newman-Keuls multiple range test on all main effect means in the MEANS statement. See the LINES option for a discussion of how the procedure displays results.

T
LSD

performs pairwise t tests, equivalent to Fisher’s least-significant-difference test in the case of equal cell sizes, for all main effect means in the MEANS statement. See the CLDIFF and LINES options for discussions of how the procedure displays results.

TUKEY

performs Tukey’s studentized range test (HSD) on all main effect means in the MEANS statement. (When the group sizes are different, this is the Tukey-Kramer test.) See the CLDIFF and LINES options for discussions of how the procedure displays results.

WALLER

performs the Waller-Duncan k-ratio t test on all main effect means in the MEANS statement. See the KRATIO= option for information about controlling details of the test, and see the LINES option for a discussion of how the procedure displays results.

WELCH

requests Welch’s (1951) variance-weighted one-way ANOVA. This alternative to the usual analysis of variance for a one-way model is robust to violations of the assumption of equal within-group variances. This option is ignored unless your MODEL statement specifies a simple one-way model.

Note that using the WELCH option merely produces one additional table consisting of Welch’s ANOVA. It does not affect any of the other tests displayed by the ANOVA procedure, which still require the assumption of equal variance for exact validity.

See the section Homogeneity of Variance in One-Way Models in Chapter 42: The GLM Procedure, for more details on Welch’s ANOVA. Example 42.10 in the same chapter illustrates the use of the HOVTEST and WELCH options in the MEANS statement in testing for equal group variances.