Power and Sample Size

Power and sample size analysis optimizes the resource usage and design of a study, improving the chances of conclusive results with maximum efficiency. The standard statistical testing paradigm implicitly assumes that Type I errors (mistakenly concluding significance when there is no true effect) are more costly than Type II errors (failing to detect an effect that truly exists). This may be appropriate for your situation, or the relative costs of the two types of error may be reversed. Power and sample size analysis can help you achieve your desired balance between Type I and Type II errors. With optimal designs and sample sizes, you can improve your chances of detecting effects that might otherwise have been ignored, save money and time, and perhaps minimize risks to subjects.

The SAS/STAT power and sample size procedures include the following:

GLMPOWER Procedure


The GLMPOWER procedure performs prospective power and sample size analysis for linear models.

The following are highlights of the GLMPOWER procedure's features:

  • covers Type III F tests and contrasts of fixed effects in univariate and multivariate linear models, optionally with covariates
  • for multivariate models, you can choose from the following tests:
    • Wilks' lambda
    • Hotelling-Lawley trace
    • Pillai's trace
  • for the univariate approach to repeated measures, you can choose from the following types of F tests:
    • uncorrected
    • Greenhouse-Geisser
    • Huynh-Feldt
    • Box conservative
  • supports BY group processing, which enables you to obtain separate analyses for grouped observations
  • creates a SAS data set that corresponds to any output table
  • automatically creates graphs by using ODS Graphics
For further details, see GLMPOWER Procedure

POWER Procedure


The POWER procedure performs prospective power and sample size analyses for a variety of goals, such as the following:

  • determining the sample size required to get a significant result with adequate probability (power)
  • characterizing the power of a study to detect a meaningful effect
  • conducting what-if analyses to assess sensitivity of the power or required sample size to other factors

The following are highlights of the POWER procedure's features:

  • provides analysis for the following:
    • t tests, equivalence tests, and confidence intervals for means
    • tests, equivalence tests, and confidence intervals for binomial proportions
    • multiple regression
    • tests of correlation and partial correlation
    • one-way analysis of variance
    • rank tests for comparing two survival curves
    • logistic regression with binary response
    • Wilcoxon-Mann-Whitney (rank-sum) test
    • Cox proportional hazards regression
    • Farrington-Manning noninferiority tests of relative risk
  • creates a SAS data set that corresponds to any output table
  • automatically creates graphs by using ODS Graphics
For further details, see POWER Procedure
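
The sample size, power, and what-if goals listed for the POWER procedure can be illustrated with a small calculation. The following Python sketch is not SAS code and does not reproduce the procedure's algorithms; it assumes a two-sided, two-sample z test with a known common standard deviation and equal group sizes, and all function names and input values are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def power_two_sample_z(mean_diff, std_dev, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z test (equal group sizes)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Mean of the standardized test statistic under the alternative.
    shift = abs(mean_diff) / (std_dev * sqrt(2 / n_per_group))
    # Probability that the statistic falls in either rejection tail.
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def n_per_group_two_sample_z(mean_diff, std_dev, power=0.9, alpha=0.05):
    """Per-group sample size from the usual normal-approximation formula."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * std_dev / abs(mean_diff)) ** 2)


# What-if analysis: how the required sample size reacts to the assumed
# standard deviation (all input values are hypothetical).
for sd in (8, 10, 12):
    n = n_per_group_two_sample_z(mean_diff=5, std_dev=sd)
    print(sd, n, round(power_two_sample_z(5, sd, n), 3))
```

Because the sketch uses a normal approximation, its sample sizes can be slightly smaller than those from an exact t test calculation.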

SEQDESIGN Procedure


The purpose of the SEQDESIGN procedure is to design interim analyses for group sequential clinical trials. A group sequential trial provides for interim analyses before the formal completion of the trial while maintaining the specified overall Type I and Type II error probability levels.

The SEQDESIGN procedure assumes that the standardized test statistics for the null hypothesis at the stages have the joint canonical distribution with the information levels at the stages for the parameter. This implies that these test statistics are normally distributed. If the test statistic is not normally distributed, then it is assumed that the test statistic is computed from a large sample such that the statistic has an approximately normal distribution.
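
As a sketch of what the canonical joint distribution implies, the following Python fragment computes the means and covariances of the standardized statistics from the information levels. It assumes the standard textbook form, in which the statistic at stage k has mean theta times the square root of the information level I_k and unit variance, and the covariance between stages j and k (j <= k) is the square root of I_j / I_k. The function name and the example values are hypothetical.

```python
from math import sqrt


def canonical_joint_moments(theta, information_levels):
    """Means and covariance matrix of the standardized statistics Z_1..Z_K."""
    infos = list(information_levels)
    increasing = all(a < b for a, b in zip(infos, infos[1:]))
    if not infos or infos[0] <= 0 or not increasing:
        raise ValueError("information levels must be positive and increasing")
    # E[Z_k] = theta * sqrt(I_k)
    means = [theta * sqrt(info) for info in infos]
    # Cov(Z_j, Z_k) = sqrt(I_j / I_k) for I_j <= I_k; the diagonal is 1.
    cov = [[sqrt(min(a, b) / max(a, b)) for b in infos] for a in infos]
    return means, cov


# Hypothetical three-stage design with equally spaced information levels.
means, cov = canonical_joint_moments(theta=2.0, information_levels=[2.0, 4.0, 6.0])
print(means)
print(cov)
```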

You can use the SEQDESIGN procedure to compute required sample sizes for commonly used hypothesis tests.

The applicable tests include tests for binomial proportions and the log-rank test for two survival distributions.

Output from the SEQDESIGN Procedure

In addition to computing the boundary values for a group sequential design, the SEQDESIGN procedure computes the following quantities:
  • maximum sample size (as a percentage of the corresponding fixed-sample size) if the trial does not stop at an interim stage
  • average sample sizes (as a percentage of the corresponding fixed-sample size) under various hypothetical references, including the null and alternative references
  • stopping probabilities at each stage under various hypothetical references to indicate how likely it is that the trial will stop at that stage
  • sample sizes required at each stage for the specified hypothesis test with nonsurvival data
  • numbers of events required at each stage for the specified hypothesis test with survival data

You can create more than one design with multiple DESIGN statements in the SEQDESIGN procedure and then choose the design with the most desirable features.

Group Sequential Methods

For a group sequential design, there are two possible boundaries for a one-sided test and four possible boundaries for a two-sided test. Each boundary consists of one boundary value (critical value) for each stage. The SEQDESIGN procedure provides the following methods for computing the boundary values:
  • fixed boundary shape methods, which derive boundaries with specified boundary shapes
  • Whitehead methods, which adjust boundaries derived for continuous monitoring so that they apply to discrete monitoring
  • error spending methods, which derive boundaries from the amount of error probability to be spent at each stage
For further details, see SEQDESIGN Procedure
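
To make the idea of error spending concrete, the following Python sketch evaluates two widely used Lan-DeMets spending functions, one of O'Brien-Fleming type and one of Pocock type, at a set of information fractions. It only shows how much Type I error is spent by each stage; it does not compute boundary values, and it is an assumption that these forms correspond to any particular SEQDESIGN option. All names and values are hypothetical.

```python
from math import e, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def obf_type_spending(t, alpha):
    """O'Brien-Fleming-type cumulative alpha spent at information fraction t."""
    if not 0 <= t <= 1:
        raise ValueError("information fraction must lie in [0, 1]")
    if t == 0:
        return 0.0
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 - 2 * STD_NORMAL.cdf(z / sqrt(t))


def pocock_type_spending(t, alpha):
    """Pocock-type cumulative alpha spent at information fraction t."""
    if not 0 <= t <= 1:
        raise ValueError("information fraction must lie in [0, 1]")
    return alpha * log(1 + (e - 1) * t)


def incremental_spending(spending_function, information_fractions, alpha):
    """Alpha newly spent at each stage (differences of the cumulative values)."""
    cumulative = [spending_function(t, alpha) for t in information_fractions]
    return [now - before for before, now in zip([0.0] + cumulative, cumulative)]


fractions = [0.25, 0.5, 0.75, 1.0]  # hypothetical four-stage design
print(incremental_spending(obf_type_spending, fractions, alpha=0.05))
print(incremental_spending(pocock_type_spending, fractions, alpha=0.05))
```

Both functions spend the full alpha by the final stage; the O'Brien-Fleming-type function spends very little at early stages, whereas the Pocock-type function spends it more evenly.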

SEQTEST Procedure


The purpose of the SEQTEST procedure is to perform interim analyses for group sequential clinical trials. A group sequential trial provides for interim analyses before the formal completion of the trial while maintaining the specified overall Type I and Type II error probability levels.

Features of the SEQTEST Procedure

At each stage, the data are analyzed with a statistical procedure such as the REG procedure, and a test statistic and its associated information level are computed. The information level is the amount of information available about the unknown parameter. For a maximum likelihood statistic, the information level is the inverse of its variance.
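
The relationship between the variance of an estimate and its information level can be sketched as follows. The Python example assumes the simplest case, a difference between two group means with a known common standard deviation, for which the variance of the estimate is available in closed form. The function names and input values are hypothetical.

```python
from math import sqrt


def information_two_sample_mean(std_dev, n1, n2):
    """Information level for a difference in means: inverse of its variance."""
    variance = std_dev ** 2 * (1 / n1 + 1 / n2)
    return 1 / variance


def standardized_statistic(estimate, information):
    """Z statistic: the estimate divided by its standard error."""
    return estimate * sqrt(information)


# Hypothetical interim data: 50 subjects per group, standard deviation 2,
# observed mean difference 0.9.
info = information_two_sample_mean(std_dev=2.0, n1=50, n2=50)
print(info, standardized_statistic(0.9, info))
```

Doubling both group sizes doubles the information level, which is why information levels are a natural scale for spacing interim analyses.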

You then use the SEQTEST procedure to compare the test statistic with the corresponding boundary values obtained with the SEQDESIGN procedure.
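
The comparison of a test statistic with its boundary values can be sketched as a simple decision rule. The following Python example assumes a one-sided test with an upper alternative, a rejection boundary at every stage, and an optional acceptance boundary. The boundary values are illustrative placeholders, not output from the SEQDESIGN procedure, and the function names are hypothetical.

```python
def stage_decision(z, reject_boundary, accept_boundary=None):
    """Decision at one stage of a one-sided test with an upper alternative."""
    if z >= reject_boundary:
        return "reject H0"
    if accept_boundary is not None and z <= accept_boundary:
        return "accept H0"
    return "continue"


def monitor(z_statistics, reject_boundaries, accept_boundaries):
    """Return (stage, decision) for the first stage at which the trial stops."""
    stages = zip(z_statistics, reject_boundaries, accept_boundaries)
    for stage, (z, upper, lower) in enumerate(stages, start=1):
        decision = stage_decision(z, upper, lower)
        if decision != "continue":
            return stage, decision
    # No boundary has been crossed by the stages observed so far.
    return len(z_statistics), "continue"


# Illustrative placeholder boundaries for a four-stage design; there is no
# acceptance boundary at stage 1, and the boundaries meet at the final stage.
reject = [4.05, 2.86, 2.34, 2.02]
accept = [None, 0.0, 1.0, 2.02]
print(monitor([1.1, 2.2, 2.5], reject, accept))
```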

If the information levels do not match the information levels specified in the design, the SEQTEST procedure modifies the boundary values to adjust for new information levels.

At the end of a trial, the parameter estimate is computed. The median unbiased estimate, confidence limits, and p-value depend on the specified sample space ordering. A sample space ordering specifies how the test statistics that result in the stopping of a trial (that is, all the statistics in the rejection region and in the acceptance region) are ordered. The SEQTEST procedure provides three different sample space orderings: the stage-wise ordering uses counterclockwise ordering around the continuation region, the LR ordering uses the distance between the observed Z statistic, z, and its hypothetical value, and the MLE ordering uses the observed maximum likelihood estimate.

Note that for some clinical trials, the information levels computed from the number of individuals specified in the design plan might not reach the target information levels. Thus, instead of specifying the number of individuals in the protocol, you can specify the information levels and then adjust the sample sizes to achieve those information levels for the trial.

Output from the SEQTEST Procedure

In addition to the boundary values and test statistics for the group sequential trial, the SEQTEST procedure also computes the following quantities:
  • average sample sizes (as a percentage of the corresponding fixed-sample size) under various hypothetical references, including the null and alternative references
  • stopping probabilities at each stage under various hypothetical references to indicate how likely it is that the trial will stop at that stage
  • conditional power given the most recently observed statistic under specified hypothetical references
  • predictive power given the most recently observed statistic
  • repeated confidence intervals for the parameter from the observed statistic at each stage
  • parameter estimate, p-value for hypothesis testing, and median and confidence limits for the parameter at the conclusion of a sequential trial
For further details, see SEQTEST Procedure
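
As an illustration of conditional power, the following Python sketch uses the standard simplified formula for a one-sided test with an upper alternative: given the statistic observed at the current stage, it computes the probability that the final-stage statistic exceeds the final critical value under a hypothetical parameter value. It ignores any interim analyses between the current stage and the final stage, and it is an assumption, not a description of the SEQTEST procedure's computation. The function name and input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def conditional_power(z_k, info_k, info_max, final_critical_value, theta):
    """P(final Z >= critical value | current Z = z_k) under parameter theta."""
    if not 0 < info_k < info_max:
        raise ValueError("need 0 < info_k < info_max")
    remaining = info_max - info_k
    # The score-type statistic Z * sqrt(I) gains an independent normal
    # increment with mean theta * remaining and variance remaining.
    numerator = (z_k * sqrt(info_k)
                 - final_critical_value * sqrt(info_max)
                 + theta * remaining)
    return NormalDist().cdf(numerator / sqrt(remaining))


# Hypothetical interim look at half of the maximum information.
for theta in (0.0, 1.5, 3.0):
    print(theta, round(conditional_power(2.0, 0.5, 1.0, 1.96, theta), 3))
```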