The GLMPOWER Procedure

Example 47.2 Two-Way ANOVA with Covariate

Suppose you can enhance the planned study discussed in Example 47.1 in two ways:

incorporate results from races at two different altitudes ("high" and "low")
measure the body mass index of each runner before the race

This is equivalent to adding a second fixed effect and a continuous covariate to your model.

Since lactic acid buildup is more pronounced at higher altitudes, you will include altitude as a factor in the model along with fluid, extending the one-way ANOVA to a two-way ANOVA. In doing so, you expect to lower the residual standard deviation from about 3.75 to 3.5 (in addition to generalizing the study results). You assume there is negligible interaction between fluid and altitude and plan to use a main-effects-only model. You conjecture that the mean lactic acid buildup follows Table 47.11.

Table 47.11: Mean Lactic Acid Buildup by Fluid and Altitude

	Fluid
Altitude	Water	EZD1	EZD2	LZ1	LZ2
High	36.9	35.0	31.5	30	27.1
Low	34.3	32.4	28.9	27	24.7

By including a measurement of body mass index as a covariate in the study, you hope to further reduce the error variability. The extent of this reduction in variability is commonly expressed in two alternative ways: (1) the correlation between the covariates and the response or (2) the proportional reduction in total variance achieved by the covariates. You prefer the former and guess that the correlation between body mass index and lactic acid buildup is between 0.2 and 0.3. You specify these estimates with the NCOVARIATES= and CORRXY= options in the POWER statement. The covariate is not included in the MODEL statement.
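The two ways of expressing the covariate's benefit can be related with a short DATA step. This is a minimal sketch, assuming a single covariate so that the proportional reduction in variance equals the squared correlation and the adjusted standard deviation is the unadjusted value times the square root of one minus that reduction. The data set name CovAdjust and its variable names are hypothetical and chosen only for illustration.

```sas
/* Sketch: convert each conjectured correlation into a proportional   */
/* variance reduction and an adjusted error standard deviation.       */
/* Assumes one covariate; CovAdjust and its variables are made up.    */
data CovAdjust;
   StdDev = 3.5;                        /* SD before covariate adjustment */
   do Corr = 0.2, 0.3;                  /* conjectured correlations       */
      PropVarReduction = Corr**2;       /* squared correlation            */
      AdjStdDev = StdDev * sqrt(1 - PropVarReduction);
      output;
   end;
run;

proc print data=CovAdjust noobs;
   format AdjStdDev 5.2;
run;
```

Under these assumptions the adjusted standard deviations come out to about 3.43 and 3.34, which agree with the Adj Std Dev column in Output 47.2.1.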

You are interested in the same four fluid comparisons as in Example 47.1, shown in Table 47.10, except this time you want to marginalize over the effect of altitude.

For each of these contrasts, you want to determine the sample size required to achieve a power of 0.9 to detect an effect with magnitude according to Table 47.11. You are not yet attempting to choose a single sample size for the study, but rather checking the range of sample sizes needed by individual contrasts. You plan to test each contrast at $\alpha$ = 0.025. You will provide twice as many runners with water as with any of the electrolytes, and you predict that you can study approximately two-thirds as many runners at high altitude as at low altitude. The resulting planned sample size weighting scheme is shown in Table 47.12. Since the scheme is only approximate, you use the NFRACTIONAL option in the POWER statement to disable the rounding of sample sizes up to integers satisfying the weights exactly.

Table 47.12: Approximate Sample Size Allocation Weights

	Fluid
Altitude	Water	EZD1	EZD2	LZ1	LZ2
High	4	2	2	2	2
Low	6	3	3	3	3

First, you create the exemplary data set to specify means and weights for the design profiles:

data Fluids2;
   input Altitude $ Fluid $ LacticAcid CellWgt;
   datalines;
         High       Water      36.9       4
         High       EZD1       35.0       2
         High       EZD2       31.5       2
         High       LZ1        30         2
         High       LZ2        27.1       2
         Low        Water      34.3       6
         Low        EZD1       32.4       3
         Low        EZD2       28.9       3
         Low        LZ1        27         3
         Low        LZ2        24.7       3
;

The variables Altitude, Fluid, and LacticAcid specify the factors and cell means in Table 47.11. The variable CellWgt contains the sample size allocation weights in Table 47.12.
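The cell weights follow from the two allocation rules stated earlier (water gets twice as many runners as each electrolyte, and high altitude gets two-thirds as many as low altitude). The following sketch rebuilds them as a product of marginal weights and compares the result with CellWgt. The data set name Fluids2Check and the variables FluidWgt, AltWgt, DerivedWgt, and Match are hypothetical names introduced only for this check.

```sas
/* Sketch: derive each cell weight as (fluid weight) x (altitude weight) */
/* and confirm that it equals the CellWgt value entered by hand.         */
data Fluids2Check;
   set Fluids2;
   FluidWgt   = ifn(Fluid = "Water", 2, 1);     /* water : electrolyte = 2 : 1 */
   AltWgt     = ifn(Altitude = "High", 2, 3);   /* high : low = 2 : 3          */
   DerivedWgt = FluidWgt * AltWgt;
   Match      = (DerivedWgt = CellWgt);         /* 1 if the weights agree      */
run;

proc print data=Fluids2Check noobs;
   var Altitude Fluid CellWgt DerivedWgt Match;
run;
```

Every row should show Match equal to 1, since the products 4, 2, 2, 2, 2 and 6, 3, 3, 3, 3 reproduce Table 47.12.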

Use the DATA= option in the PROC GLMPOWER statement to specify Fluids2 as the exemplary data set. The following statements perform the sample size analysis:

proc glmpower data=Fluids2;
   class Altitude Fluid;
   model LacticAcid = Altitude Fluid;
   weight CellWgt;
   contrast "Water vs. others" Fluid  -1 -1 -1 -1 4;
   contrast "EZD vs. LZ"       Fluid   1  1 -1 -1 0;
   contrast "EZD1 vs. EZD2"    Fluid   1 -1  0  0 0;
   contrast "LZ1 vs. LZ2"      Fluid   0  0  1 -1 0;
   power
      nfractional
      stddev      = 3.5
      ncovariates = 1
      corrxy      = 0.2 0.3 0
      alpha       = 0.025
      ntotal      = .
      power       = 0.9;
run;

The CLASS statement identifies Altitude and Fluid as classification variables. The MODEL statement specifies the model, and the WEIGHT statement identifies CellWgt as the weight variable. The CONTRAST statements specify the contrasts in Table 47.10. As in Example 47.1, the order of the contrast coefficients corresponds to the formatted class levels (EZD1, EZD2, LZ1, LZ2, Water). The POWER statement specifies total sample size as the result parameter and provides values for the other analysis parameters. The NCOVARIATES= option specifies the single covariate (body mass index), and the CORRXY= option specifies the two scenarios for its correlation with lactic acid buildup (0.2 and 0.3), along with a value of 0, which corresponds to no reduction in error variability. Output 47.2.1 displays the results.
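Because the contrast coefficients depend on the order of the class levels, it can help to list the levels of Fluid in sorted order before writing the CONTRAST statements. The following is a small sketch, assuming that Fluid has no format attached so that its formatted order coincides with its sorted order. The data set name FluidLevels is hypothetical.

```sas
/* Sketch: list the distinct Fluid levels in sorted order so that the   */
/* contrast coefficients can be lined up against them.                  */
proc sort data=Fluids2(keep=Fluid) out=FluidLevels nodupkey;
   by Fluid;
run;

proc print data=FluidLevels;
run;
```

The printed observations appear in the order EZD1, EZD2, LZ1, LZ2, Water, so the fifth coefficient in each CONTRAST statement applies to Water.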

Output 47.2.1: Sample Sizes for Two-Way ANOVA Contrasts

The GLMPOWER Procedure

Fixed Scenario Elements
Dependent Variable	LacticAcid
Weight Variable	CellWgt
Alpha	0.025
Number of Covariates	1
Std Dev Without Covariate Adjustment	3.5
Nominal Power	0.9

Computed Ceiling N Total
Index	Type	Source	Corr XY	Adj Std Dev	Test DF	Error DF	Fractional N Total	Actual Power	Ceiling N Total
1	Effect	Altitude	0.2	3.43	1	84	90.418451	0.902	91
2	Effect	Altitude	0.3	3.34	1	79	85.862649	0.901	86
3	Effect	Altitude	0.0	3.50	1	88	94.063984	0.903	95
4	Effect	Fluid	0.2	3.43	4	16	22.446173	0.912	23
5	Effect	Fluid	0.3	3.34	4	15	21.687544	0.908	22
6	Effect	Fluid	0.0	3.50	4	17	23.055716	0.919	24
7	Contrast	Water vs. others	0.2	3.43	1	15	21.720195	0.905	22
8	Contrast	Water vs. others	0.3	3.34	1	14	20.848805	0.903	21
9	Contrast	Water vs. others	0.0	3.50	1	16	22.422381	0.910	23
10	Contrast	EZD vs. LZ	0.2	3.43	1	35	41.657424	0.903	42
11	Contrast	EZD vs. LZ	0.3	3.34	1	33	39.674037	0.903	40
12	Contrast	EZD vs. LZ	0.0	3.50	1	37	43.246415	0.906	44
13	Contrast	EZD1 vs. EZD2	0.2	3.43	1	139	145.613657	0.901	146
14	Contrast	EZD1 vs. EZD2	0.3	3.34	1	132	138.173983	0.902	139
15	Contrast	EZD1 vs. EZD2	0.0	3.50	1	145	151.565917	0.901	152
16	Contrast	LZ1 vs. LZ2	0.2	3.43	1	268	274.055008	0.901	275
17	Contrast	LZ1 vs. LZ2	0.3	3.34	1	253	259.919126	0.900	260
18	Contrast	LZ1 vs. LZ2	0.0	3.50	1	279	285.363976	0.901	286

For the two conjectured correlation scenarios, the sample sizes in Output 47.2.1 range from 21 for the comparison of water versus electrolytes (assuming a correlation of 0.3 between body mass index and lactic acid buildup) to 275 for the comparison of LZ1 versus LZ2 (assuming a correlation of 0.2). The zero-correlation scenario requires slightly more, up to 286 for LZ1 versus LZ2. PROC GLMPOWER also includes the effect tests for Altitude and Fluid. Note that the required sample sizes for this study are lower than those for the study in Example 47.1.

Note that the error standard deviation has been reduced from 3.5 to 3.43 (when the correlation is 0.2) or 3.34 (when the correlation is 0.3) to approximate the effect of the body mass index covariate. The error degrees of freedom have also been automatically adjusted, lowered by 1 (the number of covariates).
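The degrees-of-freedom adjustment can be checked by hand against a few rows of Output 47.2.1. The sketch below assumes that the error degrees of freedom equal the ceiling sample size minus the model degrees of freedom (intercept, one for Altitude, four for Fluid) minus the number of covariates. This is an arithmetic check that is consistent with the displayed values, not a description of the procedure's internal computations, and the data set and variable names are hypothetical.

```sas
/* Sketch: compare the Error DF reported in Output 47.2.1 (rows 1, 7,   */
/* and 16) with ceiling N minus model DF minus number of covariates.    */
data DFCheck;
   input CeilingN ReportedErrorDF;
   ModelDF = 1 + 1 + 4;      /* intercept + Altitude (2-1) + Fluid (5-1) */
   NCov    = 1;              /* body mass index                          */
   ErrorDF = CeilingN - ModelDF - NCov;
   Agrees  = (ErrorDF = ReportedErrorDF);   /* 1 if the values match     */
   datalines;
91 84
22 15
275 268
;

proc print data=DFCheck noobs;
run;
```

For these three rows the computed values are 84, 15, and 268, matching the Error DF column.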

Suppose you want to plot the required sample size for the range of power values from 0.5 to 0.95. First, define the analysis by specifying the same statements as before, but add the PLOTONLY option to the PROC GLMPOWER statement to disable the nongraphical results. Next, specify the PLOT statement with X= POWER to request a plot with power on the X axis. Sample size is automatically placed on the Y axis. Use the MIN= and MAX= options in the PLOT statement to specify the power range. The following statements produce the plot:

ods graphics on;

proc glmpower data=Fluids2 plotonly;
   class Altitude Fluid;
   model LacticAcid = Altitude Fluid;
   weight CellWgt;
   contrast "Water vs. others" Fluid  -1 -1 -1 -1 4;
   contrast "EZD vs. LZ"       Fluid   1  1 -1 -1 0;
   contrast "EZD1 vs. EZD2"    Fluid   1 -1  0  0 0;
   contrast "LZ1 vs. LZ2"      Fluid   0  0  1 -1 0;
   power
      nfractional
      stddev      = 3.5
      ncovariates = 1
      corrxy      = 0.2 0.3 0
      alpha       = 0.025
      ntotal      = .
      power       = 0.9;
   plot x=power min=.5 max=.95;
run;

See Output 47.2.2 for the resulting plot.

Output 47.2.2: Plot of Sample Size versus Power for Two-Way ANOVA Contrasts

In Output 47.2.2, the line style identifies the test, and the plotting symbol identifies the scenario for the correlation between covariate and response. The plotting symbol locations identify actual computed powers; the curves are linear interpolations of these points. As in Example 47.1, the required sample size is highest for the test of LZ1 versus LZ2.

Finally, suppose you want to plot the power for the range of sample sizes you will likely consider for the study (the range of 21 to 275 that achieves 0.9 power for different comparisons). In the POWER statement, identify power as the result (POWER= .), and specify NTOTAL= 21. Specify the PLOT statement with X= N to request a plot with sample size on the X axis.

The following statements produce the plot:

proc glmpower data=Fluids2 plotonly;
   class Altitude Fluid;
   model LacticAcid = Altitude Fluid;
   weight CellWgt;
   contrast "Water vs. others" Fluid  -1 -1 -1 -1 4;
   contrast "EZD vs. LZ"       Fluid   1  1 -1 -1 0;
   contrast "EZD1 vs. EZD2"    Fluid   1 -1  0  0 0;
   contrast "LZ1 vs. LZ2"      Fluid   0  0  1 -1 0;
   power
      nfractional
      stddev      = 3.5
      ncovariates = 1
      corrxy      = 0.2 0.3 0
      alpha       = 0.025
      ntotal      = 21
      power       = .;
   plot x=n min=21 max=275;
run;

ods graphics off;

The MIN= 21 and MAX= 275 options in the PLOT statement set the range of sample sizes on the X axis. If you omit the MIN= option, it automatically defaults to the value of 21 from the NTOTAL= option in the POWER statement.

See Output 47.2.3 for the plot.

Output 47.2.3: Plot of Power versus Sample Size for Two-Way ANOVA Contrasts

Although Output 47.2.2 and Output 47.2.3 surface essentially the same computations for practical power ranges, they each provide a different quick visual assessment. Output 47.2.2 reveals the range of required sample sizes for powers of interest, and Output 47.2.3 reveals the range of powers achieved for sample sizes of interest.