PROC SEQDESIGN: Sample Size Computation :: SAS/STAT(R) 9.2 User's Guide, Second Edition

The SEQDESIGN Procedure

Sample Size Computation

The SEQDESIGN procedure assumes that the data are from a multivariate normal distribution and the sequence of the standardized test statistics $\text{[math]}$ has the following canonical joint distribution:

$\text{[math]}$ is multivariate normal
$\text{[math]}$
$\text{[math]}$ , $\text{[math]}$

where $\text{[math]}$ is the total number of stages and $\text{[math]}$ is the information available at stage $\text{[math]}$ .
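For orientation, the canonical joint distribution is usually written in the group sequential literature in the form sketched below. The symbol names ($Z_k$ for the standardized statistic at stage $k$, $\theta$ for the parameter, $I_k$ for the information, $K$ for the number of stages) are assumptions made for this sketch, because the typeset formulas on this page did not survive.

```latex
\begin{aligned}
(Z_1, Z_2, \ldots, Z_K) &\ \text{is multivariate normal} \\
Z_k &\sim N\!\left(\theta \sqrt{I_k},\; 1\right), \quad 1 \le k \le K \\
\operatorname{Cov}(Z_{k_1}, Z_{k_2}) &= \sqrt{I_{k_1} / I_{k_2}}, \quad 1 \le k_1 \le k_2 \le K
\end{aligned}
```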

If the test statistic is computed from data that are not normally distributed, such as binomial data, the sample is assumed to be large enough that the statistic has an approximately normal distribution.

In a typical clinical trial, the sample size required depends on the Type I error probability level $\text{[math]}$ , alternative reference $\text{[math]}$ , power $\text{[math]}$ , and variance of the response variable. Given a one-sided null hypothesis $\text{[math]}$ with an upper alternative hypothesis $\text{[math]}$ , the information required for a fixed-sample test is given by

$\text{[math]}$
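As a rough numerical illustration, the fixed-sample information for a one-sided test is commonly computed as $(z_{1-\alpha} + z_{1-\beta})^2 / \theta_1^2$, where $z_p$ is the standard normal quantile. The sketch below assumes that form. The function name and the design values are hypothetical and are not part of the SEQDESIGN procedure.

```python
from statistics import NormalDist


def fixed_sample_information(alpha: float, beta: float, theta1: float) -> float:
    """Information needed by a one-sided fixed-sample test (assumed formula).

    alpha  : one-sided Type I error probability
    beta   : Type II error probability (power is 1 - beta)
    theta1 : alternative reference, must be nonzero
    """
    if theta1 == 0:
        raise ValueError("the alternative reference must be nonzero")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_alpha = std_normal.inv_cdf(1.0 - alpha)
    z_beta = std_normal.inv_cdf(1.0 - beta)
    return (z_alpha + z_beta) ** 2 / theta1 ** 2


# Hypothetical design values, chosen only for illustration.
info0 = fixed_sample_information(alpha=0.025, beta=0.10, theta1=0.5)
print(f"fixed-sample information: {info0:.4f}")
```

Halving the alternative reference quadruples the required information, which is why small effects are expensive to detect.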

The parameter $\text{[math]}$ and the subsequent alternative reference $\text{[math]}$ depend on the test specified in the clinical trial. For example, suppose you are comparing two binomial populations $\text{[math]}$ ; then $\text{[math]}$ is the difference between two proportions if the proportion difference statistic is used, and $\text{[math]}$ , the log odds ratio for the two proportions if the log odds ratio statistic is used.

If the maximum likelihood estimate $\text{[math]}$ from the likelihood function can be derived, then the asymptotic variance for $\text{[math]}$ is $\text{[math]}$ , where $\text{[math]}$ is Fisher information for $\text{[math]}$ . The resulting statistic $\text{[math]}$ corresponds to the MLE statistic scale as specified in the BOUNDARYSCALE=MLE option in the PROC SEQDESIGN statement, $\text{[math]}$ corresponds to the standardized $\text{[math]}$ scale (BOUNDARYSCALE=STDZ), and $\text{[math]}$ corresponds to the score statistic scale (BOUNDARYSCALE=SCORE).

Alternatively, if the score statistic $\text{[math]}$ is derived in a statistical procedure, it can be used as the test statistic and its asymptotic variance is given by Fisher information, $\text{[math]}$ . In this case, $\text{[math]}$ corresponds to the standardized $\text{[math]}$ scale and $\text{[math]}$ corresponds to the MLE statistic scale.

For a group sequential trial, the maximum information $\text{[math]}$ is derived in the SEQDESIGN procedure with the specified $\text{[math]}$ , $\text{[math]}$ , and $\text{[math]}$ . With the maximum information

$\text{[math]}$

the sample size required for a specified test statistic in the trial can be evaluated or estimated from the known or estimated variance of the response variable. Note that different designs might produce different maximum information levels for the same hypothesis, and this in turn might require a different number of observations for the trial.

If each observation in the data set provides one unit of information for the hypothesis test, as in a one-sample test for the mean, the required sample size for the sequential design can be derived from the maximum information. In a survival analysis, however, an individual in the survival time data might provide only partial information because of censoring. In this case, the required number of events can be derived from the maximum information. With additional accrual information, the sample size can also be computed.
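To make the link between information and sample size concrete, consider a one-sample test for a normal mean with known standard deviation $\sigma$. The information carried by $n$ observations is $n / \sigma^2$, so a maximum information level translates into $\sigma^2$ times that level. The sketch below assumes this simple case. The function name and the input values are hypothetical.

```python
import math


def one_sample_mean_size(max_information: float, sigma: float) -> float:
    """Sample size for a one-sample mean test with known standard deviation.

    Uses information = n / sigma**2, so n = sigma**2 * information.
    The result is left fractional; round up before recruiting.
    """
    if max_information <= 0 or sigma <= 0:
        raise ValueError("information and sigma must be positive")
    return sigma ** 2 * max_information


# Hypothetical maximum information and standard deviation.
n_required = one_sample_mean_size(max_information=45.0, sigma=2.0)
print("fractional sample size:", n_required)
print("rounded up:", math.ceil(n_required))
```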

The SEQDESIGN procedure provides sample size computation for some one-sample and two-sample tests in the SAMPLESIZE statement. It also provides sample size computation for tests of a parameter in regression models such as normal regression, logistic regression, and proportional hazards regression. In addition, the procedure can compute the required sample size or number of events from the corresponding number in the fixed-sample design.

Table 77.11 lists the options available in the SAMPLESIZE statement.

Table 77.11 SAMPLESIZE Statement Options
Option	Description
Fixed-Sample Models
INPUTNOBS	specifies sample size for fixed-sample design
INPUTNEVENTS	specifies number of events for fixed-sample design
One-Sample Models
ONESAMPLEMEAN	specifies one-sample $\text{[math]}$ test for mean
ONESAMPLEFREQ	specifies one-sample test for binomial proportion
Two-Sample Models
TWOSAMPLEMEAN	specifies two-sample $\text{[math]}$ test for mean difference
TWOSAMPLEFREQ	specifies two-sample test for binomial proportions
TWOSAMPLESURVIVAL	specifies log-rank test for two survival distributions
Regression Models
REG	specifies test for a regression parameter
LOGISTIC	specifies test for a logistic regression parameter
PHREG	specifies test for a proportional hazards regression parameter

The MODEL=INPUTNOBS and MODEL=INPUTNEVENTS options are described next, and the remaining options are described in the next three sections.

Input Sample Size for Fixed-Sample Design

The MODEL=INPUTNOBS option derives the sample size required for a group sequential trial from the sample size $\text{[math]}$ for the corresponding fixed-sample design. With the N= $\text{[math]}$ option specifying the sample size $\text{[math]}$ for a fixed-sample design, the sample size required for a group sequential trial is then computed as

$\text{[math]}$

where $\text{[math]}$ is the maximum information for the group sequential design and $\text{[math]}$ is the information for the corresponding fixed-sample design. The information ratio between $\text{[math]}$ and $\text{[math]}$ is derived in the SEQDESIGN procedure.

The SAMPLE=ONE option specifies a one-sample test, and the SAMPLE=TWO option specifies a two-sample test. For a two-sample test, the WEIGHT= option specifies the sample size allocation weights for the two groups.
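The scaling described in this section can be sketched as follows. The fixed-sample size is multiplied by the information ratio, and for a two-sample test the total is split in proportion to the allocation weights. The information ratio is derived by the procedure itself. The value used here is a hypothetical stand-in, as are the function name and the weights.

```python
def sequential_sample_size(n_fixed, info_ratio, weights=None):
    """Scale a fixed-sample size by the information ratio.

    n_fixed    : sample size of the corresponding fixed-sample design
    info_ratio : maximum information divided by fixed-sample information
    weights    : optional allocation weights for the two groups
    Returns the total size and a tuple with the size of each group.
    """
    if n_fixed <= 0 or info_ratio <= 0:
        raise ValueError("n_fixed and info_ratio must be positive")
    total = n_fixed * info_ratio
    if weights is None:
        return total, (total,)  # one-sample test: a single group
    weight_sum = sum(weights)
    return total, tuple(total * w / weight_sum for w in weights)


# Hypothetical inputs: 200 subjects in the fixed design, ratio 1.05,
# and a 2:1 allocation between the two groups.
total, per_group = sequential_sample_size(200, 1.05, weights=(2, 1))
print("total:", total, "per group:", per_group)
```

The sizes are left fractional so that rounding can be applied once, at the end, in whatever way the trial protocol requires.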

Input Number of Events for Fixed-Sample Design

The MODEL=INPUTNEVENTS option derives the number of events required for a group sequential trial from the number of events $\text{[math]}$ for the corresponding fixed-sample design. With the D= $\text{[math]}$ option specifying the number of events $\text{[math]}$ for a fixed-sample survival analysis, the number of events required for a group sequential trial is then computed as

$\text{[math]}$

With the computed number of events $\text{[math]}$ for a group sequential survival design, the required total sample size and sample size at each stage can be derived with specifications of hazard rates, accrual rate, and accrual time.

For a study group, if the hazard rate $\text{[math]}$ is constant, corresponding to an exponential survival distribution, and the individual accrual is uniform in the accrual time $\text{[math]}$ with a constant accrual rate $\text{[math]}$ , Kim and Tsiatis (1990, pp. 83–84) show that the expected number of events by time $\text{[math]}$ is given by

$\text{[math]}$
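Under the stated assumptions (constant hazard, uniform accrual at a constant rate), the expected number of events can be obtained by integrating the event probability $1 - e^{-h(t-u)}$ over entry times $u$. The sketch below implements the resulting closed form. The function and argument names are hypothetical, and the numeric inputs are illustrative only.

```python
import math


def expected_events(t, hazard, accrual_rate, accrual_time):
    """Expected number of events by time t for one study group.

    Assumes exponential survival with constant hazard and uniform
    accrual at a constant rate over [0, accrual_time].
    """
    if hazard <= 0 or accrual_rate <= 0 or accrual_time <= 0:
        raise ValueError("hazard, accrual rate and accrual time must be positive")
    if t <= 0.0:
        return 0.0
    if t <= accrual_time:
        # Accrual still running: r * (t - (1 - exp(-h t)) / h)
        return accrual_rate * (t + math.expm1(-hazard * t) / hazard)
    # Accrual finished: r * (Ta - exp(-h t) * (exp(h Ta) - 1) / h)
    tail = math.exp(-hazard * t) * math.expm1(hazard * accrual_time) / hazard
    return accrual_rate * (accrual_time - tail)


# Hypothetical inputs: hazard 0.2, 10 subjects per time unit for 2 time units.
for time_point in (1.0, 2.0, 3.0, 10.0):
    print(time_point, round(expected_events(time_point, 0.2, 10.0, 2.0), 4))
```

As the time grows, the expected count approaches the total number accrued (accrual rate times accrual time), since every subject eventually has an event under an exponential model.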

For a one-sample design, such as a proportional hazards regression, the expected number of events by time $\text{[math]}$ is $\text{[math]}$ , where $\text{[math]}$ is the hazard rate for the group. For a two-sample design, such as a log-rank test for two survival distributions, the expected number of events by time $\text{[math]}$ is

$\text{[math]}$

where $\text{[math]}$ and $\text{[math]}$ are hazard rates in groups A and B, respectively, and $\text{[math]}$ is the ratio of the sample size allocation weights $\text{[math]}$ .
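For the two-sample case, the expected count can be sketched as a weighted combination of the single-group counts, with each group receiving a share of the accrual proportional to its allocation weight. The code takes the two weights directly, because the direction in which the ratio of weights is defined is not recoverable from this page. All names and numeric values are hypothetical.

```python
import math


def expected_events(t, hazard, accrual_rate, accrual_time):
    """Single-group expected events (uniform accrual, exponential survival)."""
    if t <= 0.0:
        return 0.0
    if t <= accrual_time:
        return accrual_rate * (t + math.expm1(-hazard * t) / hazard)
    tail = math.exp(-hazard * t) * math.expm1(hazard * accrual_time) / hazard
    return accrual_rate * (accrual_time - tail)


def expected_events_two_groups(t, hazard_a, hazard_b, accrual_rate,
                               accrual_time, weight_a=1.0, weight_b=1.0):
    """Expected events when accrual is split between groups A and B.

    accrual_rate is the overall rate; group A receives the share
    weight_a / (weight_a + weight_b) and group B receives the rest.
    """
    share_a = weight_a / (weight_a + weight_b)
    events_a = expected_events(t, hazard_a, accrual_rate, accrual_time)
    events_b = expected_events(t, hazard_b, accrual_rate, accrual_time)
    return share_a * events_a + (1.0 - share_a) * events_b


# Hypothetical inputs: hazards 0.3 and 0.2, 2:1 allocation.
print(expected_events_two_groups(3.0, 0.3, 0.2, 12.0, 2.0, 2.0, 1.0))
```

With equal hazards in both groups the result reduces to the single-group count, whatever the weights.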

If the accrual rate $\text{[math]}$ is specified without the accrual time $\text{[math]}$ , follow-up time $\text{[math]}$ , and total study time $\text{[math]}$ , the SEQDESIGN procedure computes the minimum and maximum accrual times from the following equation, as described in Kim and Tsiatis (1990, p. 85):

$\text{[math]}$
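One way to read these bounds, offered here as an assumption rather than as the procedure's documented algorithm: the accrual time must exceed the time needed to enroll as many subjects as there are required events (which would need unbounded follow-up), and it need not exceed the time at which the required events are reached at the very moment accrual stops (no follow-up). The sketch below computes both for a single group. All names and values are hypothetical.

```python
import math


def events_at_end_of_accrual(accrual_time, hazard, accrual_rate):
    """Expected events at the moment accrual stops (no follow-up)."""
    return accrual_rate * (accrual_time
                           + math.expm1(-hazard * accrual_time) / hazard)


def accrual_time_bounds(required_events, hazard, accrual_rate):
    """Return (lower bound, upper bound) for the accrual time."""
    if required_events <= 0 or hazard <= 0 or accrual_rate <= 0:
        raise ValueError("all inputs must be positive")
    # Lower bound: enroll exactly as many subjects as required events.
    t_min = required_events / accrual_rate
    # Upper bound: bisection on the no-follow-up equation.
    lo, hi = t_min, 2.0 * t_min
    while events_at_end_of_accrual(hi, hazard, accrual_rate) < required_events:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if events_at_end_of_accrual(mid, hazard, accrual_rate) < required_events:
            lo = mid
        else:
            hi = mid
    return t_min, 0.5 * (lo + hi)


# Hypothetical inputs: 60 required events, hazard 0.3, 40 subjects per time unit.
t_lower, t_upper = accrual_time_bounds(60.0, 0.3, 40.0)
print("accrual time between", round(t_lower, 4), "and", round(t_upper, 4))
```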

If the accrual rate $\text{[math]}$ is specified with one of the three time parameters—the accrual time, follow-up time, and total study time—then the other two time parameters are computed in the SEQDESIGN procedure. Similarly, if the accrual rate $\text{[math]}$ is not specified, but two of the three time parameters are specified, then the accrual rate is derived in the SEQDESIGN procedure.

With the accrual rate $\text{[math]}$ and the accrual time $\text{[math]}$ , the total sample size is

$\text{[math]}$

At each stage $\text{[math]}$ , the number of events is given by

$\text{[math]}$

The corresponding time $\text{[math]}$ can be derived from the equation for the expected number of events, $\text{[math]}$ , and the resulting sample size is computed as

$\text{[math]}$
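The stage-by-stage computation can be sketched as follows for a single group. The number of events at a stage is taken as that stage's information fraction times the total number of events, the stage time is found by solving the expected-events equation numerically, and the stage sample size is taken as the number accrued by that time. These readings, the bisection solver, and every name and value below are assumptions made for illustration.

```python
import math


def expected_events(t, hazard, accrual_rate, accrual_time):
    """Expected events by time t (uniform accrual, exponential survival)."""
    if t <= 0.0:
        return 0.0
    if t <= accrual_time:
        return accrual_rate * (t + math.expm1(-hazard * t) / hazard)
    tail = math.exp(-hazard * t) * math.expm1(hazard * accrual_time) / hazard
    return accrual_rate * (accrual_time - tail)


def time_for_events(target, hazard, accrual_rate, accrual_time):
    """Solve expected_events(t) = target for t by bisection."""
    if not 0.0 < target < accrual_rate * accrual_time:
        raise ValueError("target must be positive and below the number accrued")
    lo, hi = 0.0, accrual_time
    while expected_events(hi, hazard, accrual_rate, accrual_time) < target:
        hi *= 2.0  # extend the bracket into the follow-up period
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if expected_events(mid, hazard, accrual_rate, accrual_time) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def stage_plan(info_fractions, total_events, hazard, accrual_rate, accrual_time):
    """Return (events, time, sample size) for each stage."""
    plan = []
    for fraction in info_fractions:
        events_k = fraction * total_events
        time_k = time_for_events(events_k, hazard, accrual_rate, accrual_time)
        size_k = accrual_rate * min(time_k, accrual_time)  # accrued so far
        plan.append((events_k, time_k, size_k))
    return plan


# Hypothetical four-stage design with equally spaced information levels.
for events_k, time_k, size_k in stage_plan(
        [0.25, 0.5, 0.75, 1.0], 60.0, 0.3, 40.0, 2.5):
    print(round(events_k, 2), round(time_k, 4), round(size_k, 2))
```

Once a stage falls after the end of accrual, its sample size stays at the total (accrual rate times accrual time) and only the number of events keeps growing.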

The following three sections describe examples of test statistics with their resulting information levels, which can then be used to derive the required sample size. The maximum likelihood estimators are used for all tests except the comparison of two survival distributions with a log-rank test, where a score statistic is used.
