The GLM Procedure

MEANS Statement

MEANS effects </ options> ;

Within each group corresponding to each effect specified in the MEANS statement, PROC GLM computes the arithmetic means and standard deviations of all continuous variables in the model (both dependent and independent). You can specify only classification effects in the MEANS statement—that is, effects that contain only classification variables.

Note that the arithmetic means are not adjusted for other effects in the model; for adjusted means, see the section LSMEANS Statement.

If you use a WEIGHT statement, PROC GLM computes weighted means; see the section Weighted Means.

You can also specify options to perform multiple comparisons. However, the MEANS statement performs multiple comparisons only for main-effect means; for multiple comparisons of interaction means, see the section LSMEANS Statement.

You can use any number of MEANS statements, provided that they appear after the MODEL statement. For example, suppose A and B each have two levels. Then, if you use the statements

proc glm;
   class A B;
   model Y=A B A*B;
   means A B / tukey;
   means A*B;
run;

the means, standard deviations, and Tukey’s multiple comparisons tests are displayed for each level of the main effects A and B, and just the means and standard deviations are displayed for each of the four combinations of levels for A*B. Since multiple comparisons tests apply only to main effects, the single MEANS statement

means A B A*B / tukey;

produces the same results.

PROC GLM does not compute means for interaction effects containing continuous variables. Thus, if you have the model

class A;
model Y=A X A*X;

then the effects X and A*X cannot be used in the MEANS statement. However, if you specify the effect A in the MEANS statement

means A;

then PROC GLM, by default, displays within-A arithmetic means of both Y and X. You can use the DEPONLY option to display means of only the dependent variables:

means A / deponly;

If you use a WEIGHT statement, PROC GLM computes weighted means and estimates their variance as inversely proportional to the corresponding sum of weights (see the section Weighted Means). However, note that the statistical interpretation of multiple comparison tests for weighted means is not well understood. See the section Multiple Comparisons for formulas. Table 44.8 summarizes the options available in the MEANS statement.

Table 44.8: MEANS Statement Options

Option          Description

Modify output
DEPONLY         Displays only means for the dependent variables

Perform multiple comparison tests
BON             Performs Bonferroni t tests
DUNCAN          Performs Duncan’s multiple range test
DUNNETT         Performs Dunnett’s two-tailed t test
DUNNETTL        Performs Dunnett’s lower one-tailed t test
DUNNETTU        Performs Dunnett’s upper one-tailed t test
GABRIEL         Performs Gabriel’s multiple-comparison procedure
REGWQ           Performs the Ryan-Einot-Gabriel-Welsch multiple range test
SCHEFFE         Performs Scheffé’s multiple-comparison procedure
SIDAK           Performs pairwise t tests on differences between means, with levels adjusted according to Sidak’s inequality
SMM or GT2      Performs pairwise comparisons based on the studentized maximum modulus and Sidak’s uncorrelated-t inequality
SNK             Performs the Student-Newman-Keuls multiple range test
T or LSD        Performs pairwise t tests
TUKEY           Performs Tukey’s studentized range test (HSD)
WALLER          Performs the Waller-Duncan k-ratio t test

Specify additional details for multiple comparison tests
ALPHA=          Specifies the level of significance
CLDIFF          Presents confidence intervals for all pairwise differences between means
CLM             Presents results as intervals for the mean of each level of the variables
E=              Specifies the error mean square used in the multiple comparisons
ETYPE=          Specifies the type of mean square for the error effect
HTYPE=          Specifies the MS type for the hypothesis MS
KRATIO=         Specifies the Type 1/Type 2 error seriousness ratio
LINES           Lists the means in descending order and indicates nonsignificant subsets by line segments
NOSORT          Prevents the means from being sorted into descending order

Test for homogeneity of variances
HOVTEST         Requests a homogeneity of variance test

Compensate for heterogeneous variances
WELCH           Requests the variance-weighted one-way ANOVA of Welch (1951)


The options available in the MEANS statement are described in the following list.

ALPHA=

ALPHA=p specifies the level of significance for comparisons among the means. By default, p is equal to the value of the ALPHA= option in the PROC GLM statement or 0.05 if that option is not specified. You can specify any value greater than 0 and less than 1.
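As a minimal sketch of how the two ALPHA= settings interact, the following step sets a procedure-level default and then overrides it in one MEANS statement. The data set Trial and the variables Group and Y are hypothetical names with made-up values, used only for illustration.

```sas
/* Hypothetical data: three groups, four observations each */
data Trial;
   input Group $ Y @@;
   datalines;
Ctrl 10.1  Ctrl 11.4  Ctrl 9.8  Ctrl 10.9
LowD 12.0  LowD 12.7  LowD 11.5  LowD 13.1
HighD 14.2  HighD 13.6  HighD 15.0  HighD 14.4
;

proc glm data=Trial alpha=0.1;        /* procedure-level default */
   class Group;
   model Y = Group;
   means Group / tukey alpha=0.01;    /* overrides 0.1 for these comparisons */
run;
quit;
```

If the ALPHA= option were omitted from the MEANS statement here, the Tukey comparisons would be expected to use 0.1 rather than 0.05.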

BON

performs Bonferroni t tests of differences between means for all main-effect means in the MEANS statement. See the CLDIFF and LINES options for a discussion of how the procedure displays results.

CLDIFF

presents results of the BON, GABRIEL, SCHEFFE, SIDAK, SMM, GT2, T, LSD, and TUKEY options as confidence intervals for all pairwise differences between means, and the results of the DUNNETT, DUNNETTU, and DUNNETTL options as confidence intervals for differences with the control. The CLDIFF option is the default for unequal cell sizes unless the DUNCAN, REGWQ, SNK, or WALLER option is specified.

CLM

presents results of the BON, GABRIEL, SCHEFFE, SIDAK, SMM, T, and LSD options as intervals for the mean of each level of the variables specified in the MEANS statement. For all options except GABRIEL, the intervals are confidence intervals for the true means. For the GABRIEL option, they are comparison intervals for comparing means pairwise: in this case, if the intervals corresponding to two means overlap, then the difference between them is not significant according to Gabriel’s method.
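The difference between CLDIFF and CLM can be seen by requesting both for the same effect. The sketch below assumes a hypothetical data set Yield with a classification variable Variety and a response Y; the values are invented.

```sas
/* Hypothetical data: three varieties, four plots each */
data Yield;
   input Variety $ Y @@;
   datalines;
V1 41.2  V1 39.8  V1 42.5  V1 40.6
V2 44.9  V2 46.1  V2 45.3  V2 43.8
V3 38.0  V3 37.2  V3 39.1  V3 36.5
;

proc glm data=Yield;
   class Variety;
   model Y = Variety;
   means Variety / bon cldiff;   /* intervals for pairwise differences */
   means Variety / t clm;        /* intervals for each level's mean    */
run;
quit;
```

The first MEANS statement reports one interval per pair of varieties, and the second reports one interval per variety.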

DEPONLY

displays only means for the dependent variables. By default, PROC GLM produces means for all continuous variables, including continuous independent variables.

DUNCAN

performs Duncan’s multiple range test on all main-effect means given in the MEANS statement. See the LINES option for a discussion of how the procedure displays results.

DUNNETT <(formatted-control-values)>

performs Dunnett’s two-tailed t test, testing if any treatments are significantly different from a single control for all main-effect means in the MEANS statement.

To specify which level of the effect is the control, enclose the formatted value in quotes and parentheses after the keyword. If more than one effect is specified in the MEANS statement, you can use a list of control values within the parentheses. By default, the first level of the effect is used as the control. For example:

means A / dunnett('CONTROL');

where CONTROL is the formatted control value of A. As another example:

means A B C / dunnett('CNTLA' 'CNTLB' 'CNTLC');

where CNTLA, CNTLB, and CNTLC are the formatted control values for A, B, and C, respectively.

DUNNETTL <(formatted-control-value)>

performs Dunnett’s one-tailed t test, testing if any treatment is significantly less than the control. Control level information is specified as described for the DUNNETT option.

DUNNETTU <(formatted-control-value)>

performs Dunnett’s one-tailed t test, testing if any treatment is significantly greater than the control. Control level information is specified as described for the DUNNETT option.
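For instance, a one-sided comparison against a named control might be requested as in the following sketch. The data set Dose, its variables, and the level name 'Ctrl' are hypothetical.

```sas
/* Hypothetical data: a control group and two treatment groups */
data Dose;
   input Group $ Y @@;
   datalines;
Ctrl 10.1  Ctrl 11.4  Ctrl 9.8  Ctrl 10.9
LowD 12.0  LowD 12.7  LowD 11.5  LowD 13.1
HighD 14.2  HighD 13.6  HighD 15.0  HighD 14.4
;

proc glm data=Dose;
   class Group;
   model Y = Group;
   /* Upper one-tailed test: is any treatment greater than 'Ctrl'? */
   means Group / dunnettu('Ctrl');
run;
quit;
```

Replacing DUNNETTU with DUNNETTL tests in the opposite direction, with the control level specified in the same way.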

E=effect

specifies the error mean square used in the multiple comparisons. By default, PROC GLM uses the overall residual or error mean square (MS). The effect specified with the E= option must be a term in the model; otherwise, the procedure uses the residual MS.

ETYPE=n

specifies the type of mean square for the error effect. When you specify E=effect, you might need to indicate which type (1, 2, 3, or 4) of MS is to be used. The n value must be one of the types specified in or implied by the MODEL statement. The default MS type is the highest type used in the analysis.
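One situation where E= and ETYPE= are typically useful is a split-plot layout, in which a whole-plot factor is compared against a block-by-factor mean square rather than the residual. The following is only a sketch under that assumption; the data set SplitPlot, the variables Block, A, B, and Y, and all values are hypothetical.

```sas
/* Hypothetical split-plot data: 3 blocks, whole-plot factor A (3 levels), */
/* subplot factor B (2 levels)                                             */
data SplitPlot;
   input Block A $ B $ Y @@;
   datalines;
1 a1 b1 20.1  1 a1 b2 22.3  1 a2 b1 24.0
1 a2 b2 27.5  1 a3 b1 26.2  1 a3 b2 29.8
2 a1 b1 19.4  2 a1 b2 21.8  2 a2 b1 25.1
2 a2 b2 26.9  2 a3 b1 27.0  2 a3 b2 30.5
3 a1 b1 21.0  3 a1 b2 23.2  3 a2 b1 23.7
3 a2 b2 28.1  3 a3 b1 25.6  3 a3 b2 29.1
;

proc glm data=SplitPlot;
   class Block A B;
   model Y = Block A Block*A B A*B;
   /* Compare A means using the Type III Block*A mean square as error */
   means A / lsd e=Block*A etype=3;
run;
quit;
```

Because Block*A is a term in the MODEL statement, it is a valid value for E=; an effect that is not in the model would cause the procedure to fall back to the residual MS, as described above.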

GABRIEL

performs Gabriel’s multiple-comparison procedure on all main-effect means in the MEANS statement. See the CLDIFF and LINES options for discussions of how the procedure displays results.

GT2

See the SMM option.

HOVTEST

HOVTEST=BARTLETT
HOVTEST=BF
HOVTEST=LEVENE <( TYPE= ABS | SQUARE )>
HOVTEST=OBRIEN <( W=number )>

requests a homogeneity of variance test for the groups defined by the MEANS effect. You can optionally specify a particular test; if you do not specify a test, Levene’s test (Levene, 1960) with TYPE=SQUARE is computed. Note that this option is ignored unless your MODEL statement specifies a simple one-way model.

The HOVTEST=BARTLETT option specifies Bartlett’s test (Bartlett, 1937), a modification of the normal-theory likelihood ratio test.

The HOVTEST=BF option specifies Brown and Forsythe’s variation of Levene’s test (Brown and Forsythe, 1974).

The HOVTEST=LEVENE option specifies Levene’s test (Levene, 1960), which is widely considered to be the standard homogeneity of variance test. You can use the TYPE= option in parentheses to specify whether to use the absolute residuals (TYPE=ABS) or the squared residuals (TYPE=SQUARE) in Levene’s test. TYPE=SQUARE is the default.

The HOVTEST=OBRIEN option specifies O’Brien’s test (O’Brien, 1979), which is basically a modification of HOVTEST=LEVENE(TYPE=SQUARE). You can use the W= option in parentheses to tune the variable to match the suspected kurtosis of the underlying distribution. By default, W=0.5, as suggested by O’Brien (1979, 1981).

See the section Homogeneity of Variance in One-Way Models for more details on these methods. Example 44.10 illustrates the use of the HOVTEST and WELCH options in the MEANS statement in testing for equal group variances and adjusting for unequal group variances in a one-way ANOVA.
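As an illustration of the HOVTEST= syntax, the following sketch requests three of the tests for a simple one-way model by using separate MEANS statements. The data set Spread and its variables are hypothetical, with values chosen so that the groups differ visibly in spread.

```sas
/* Hypothetical data: three groups with increasingly variable responses */
data Spread;
   input Group $ Y @@;
   datalines;
G1 10.0  G1 10.3  G1 9.8  G1 10.1  G1 9.9
G2 12.5  G2 11.0  G2 13.4  G2 12.1  G2 10.6
G3 15.9  G3 11.2  G3 18.7  G3 13.0  G3 20.4
;

proc glm data=Spread;
   class Group;
   model Y = Group;                           /* one-way model, as HOVTEST requires */
   means Group / hovtest=bf;                  /* Brown and Forsythe                 */
   means Group / hovtest=levene(type=abs);    /* Levene, absolute residuals         */
   means Group / hovtest=obrien(w=0.5);       /* O'Brien with the default W         */
run;
quit;
```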

HTYPE=n

specifies the MS type for the hypothesis MS. The HTYPE= option is needed only when the WALLER option is specified. The default HTYPE= value is the highest type used in the model.

KRATIO=value

specifies the Type 1/Type 2 error seriousness ratio for the Waller-Duncan test. Reasonable values for the KRATIO= option are 50, 100, and 500, which roughly correspond for the two-level case to ALPHA levels of 0.1, 0.05, and 0.01, respectively. By default, the procedure uses the value of 100.
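The HTYPE= and KRATIO= options are used together with WALLER, as in the following sketch. The data set Feed and its variables are hypothetical.

```sas
/* Hypothetical data: four diets, four animals each */
data Feed;
   input Diet $ Gain @@;
   datalines;
D1 31.0  D1 29.4  D1 30.2  D1 32.1
D2 34.8  D2 35.5  D2 33.9  D2 36.0
D3 30.5  D3 31.7  D3 29.9  D3 30.8
D4 37.2  D4 36.4  D4 38.1  D4 35.9
;

proc glm data=Feed;
   class Diet;
   model Gain = Diet;
   /* A stricter k-ratio (500), using the Type III hypothesis mean square */
   means Diet / waller kratio=500 htype=3;
run;
quit;
```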

LINES

presents results of the BON, DUNCAN, GABRIEL, REGWQ, SCHEFFE, SIDAK, SMM, GT2, SNK, T, LSD, TUKEY, and WALLER options by listing the means in descending order and indicating nonsignificant subsets by line segments beside the corresponding means. The LINES option is appropriate for equal cell sizes, for which it is the default. The LINES option is also the default if the DUNCAN, REGWQ, SNK, or WALLER option is specified, or if there are only two cells of unequal size. The LINES option cannot be used in combination with the DUNNETT, DUNNETTL, or DUNNETTU option. In addition, the procedure has a restriction that no more than 24 overlapping groups of means can exist. If a mean belongs to more than 24 groups, the procedure issues an error message. You can either reduce the number of levels of the variable or use a multiple comparison test that allows the CLDIFF option rather than the LINES option.

Note: If the cell sizes are unequal, the harmonic mean of the cell sizes is used to compute the critical ranges. This approach is reasonable if the cell sizes are not too different, but it can lead to liberal tests if the cell sizes are highly disparate. In this case, you should not use the LINES option for displaying multiple comparisons results; use the TUKEY and CLDIFF options instead.
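With strongly unbalanced groups, the recommendation in the preceding note might be followed as in this sketch. The data set Unbal and its variables are hypothetical; the group sizes (2, 4, and 8) are deliberately disparate.

```sas
/* Hypothetical unbalanced data: group sizes 2, 4, and 8 */
data Unbal;
   input Group $ Y @@;
   datalines;
A 10.2  A 11.1
B 12.4  B 13.0  B 11.8  B 12.9
C 14.1  C 13.5  C 15.2  C 14.8  C 13.9  C 14.4  C 15.0  C 14.6
;

proc glm data=Unbal;
   class Group;
   model Y = Group;
   /* Tukey-Kramer intervals for pairwise differences instead of LINES output */
   means Group / tukey cldiff;
run;
quit;
```

CLDIFF is already the default for TUKEY when cell sizes are unequal, so stating it explicitly here mainly documents the intent.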

LSD

See the T option.

NOSORT

prevents the means from being sorted into descending order when the CLDIFF or CLM option is specified.

REGWQ

performs the Ryan-Einot-Gabriel-Welsch multiple range test on all main-effect means in the MEANS statement. See the LINES option for a discussion of how the procedure displays results.

SCHEFFE

performs Scheffé’s multiple-comparison procedure on all main-effect means in the MEANS statement. See the CLDIFF and LINES options for discussions of how the procedure displays results.

SIDAK

performs pairwise t tests on differences between means with levels adjusted according to Sidak’s inequality for all main-effect means in the MEANS statement. See the CLDIFF and LINES options for discussions of how the procedure displays results.

SMM
GT2

performs pairwise comparisons based on the studentized maximum modulus and Sidak’s uncorrelated-t inequality, yielding Hochberg’s GT2 method when sample sizes are unequal, for all main-effect means in the MEANS statement. See the CLDIFF and LINES options for discussions of how the procedure displays results.

SNK

performs the Student-Newman-Keuls multiple range test on all main-effect means in the MEANS statement. See the LINES option for a discussion of how the procedure displays results.

T
LSD

performs pairwise t tests, equivalent to Fisher’s least significant difference test in the case of equal cell sizes, for all main-effect means in the MEANS statement. See the CLDIFF and LINES options for discussions of how the procedure displays results.

TUKEY

performs Tukey’s studentized range test (HSD) on all main-effect means in the MEANS statement. (When the group sizes are different, this is the Tukey-Kramer test.) See the CLDIFF and LINES options for discussions of how the procedure displays results.

WALLER

performs the Waller-Duncan k-ratio t test on all main-effect means in the MEANS statement. See the KRATIO= and HTYPE= options for information about controlling details of the test, and the LINES option for a discussion of how the procedure displays results.

WELCH

requests the variance-weighted one-way ANOVA of Welch (1951). This alternative to the usual analysis of variance for a one-way model is robust to violations of the assumption of equal within-group variances. This option is ignored unless your MODEL statement specifies a simple one-way model.

Note that using the WELCH option merely produces one additional table consisting of Welch’s ANOVA. It does not affect any of the other tests displayed by the GLM procedure, which still require the assumption of equal variance for exact validity.

See the section Homogeneity of Variance in One-Way Models for more details on Welch’s ANOVA. Example 44.10 illustrates the use of the HOVTEST and WELCH options in the MEANS statement in testing for equal group variances and adjusting for unequal group variances in a one-way ANOVA.
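A minimal sketch of requesting Welch’s ANOVA for a one-way model follows. The data set Hetero and its variables are hypothetical, with invented values whose spread differs across groups.

```sas
/* Hypothetical data: three groups with unequal within-group variances */
data Hetero;
   input Group $ Y @@;
   datalines;
G1 20.1  G1 20.4  G1 19.8  G1 20.0  G1 20.3
G2 22.9  G2 21.0  G2 24.6  G2 22.2  G2 20.5
G3 25.7  G3 19.9  G3 29.8  G3 23.1  G3 31.2
;

proc glm data=Hetero;
   class Group;
   model Y = Group;          /* one-way model, as WELCH requires */
   means Group / welch;      /* adds the Welch ANOVA table       */
run;
quit;
```

The usual ANOVA table from the MODEL statement is still produced; the Welch table appears in addition to it.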