The SEQDESIGN Procedure

Applicable One-Sample Tests and Sample Size Computation

The SEQDESIGN procedure provides sample size computation for two one-sample tests: a test for a normal mean and a test for a binomial proportion. The required sample size depends on the variance of the response variable, which for a binomial proportion test is determined by the proportion.

In a typical clinical trial, a test is designed to reject, rather than accept, the null hypothesis in order to provide evidence for the alternative hypothesis. Thus, in most cases, the proportion under the alternative hypothesis is used to derive the required sample size. For a test of a binomial proportion, the REF=NULLPROP and REF=PROP options use the proportions under the null and alternative hypotheses, respectively.

Test for a Normal Mean

The MODEL=ONESAMPLEMEAN option in the SAMPLESIZE statement derives the sample size required to test a normal mean by using the sample mean statistic for the null hypothesis $\mu = \mu _{0}$. At stage k, the sample mean is computed as

\[  {\overline{y}}_{k} = \frac{1}{N_{k}} \sum _{j=1}^{N_{k}} {{y}_{kj}}  \]

where ${y}_{kj}$ is the value of the jth observation available in the kth stage and $N_{k}$ is the cumulative sample size at stage k.

An equivalent hypothesis is $H_{0}: \theta = 0$, where $\theta = \mu - \mu _{0}$.

The MLE statistic for $\theta $,

\[  \hat{\theta }_ k= {\overline{y}}_{k} - \mu _{0} \sim N \left( \,  \theta , \,  {I_{k}}^{-1} \right)  \]

where the information

\[  I_{k} = \frac{1}{\mathrm{Var}(\hat{\theta }_{k})} = \frac{1}{\mathrm{Var}({\overline{y}}_{k})} = \frac{N_{k}}{{\sigma }^{2}}  \]

is the inverse of the variance.

That is, the standardized statistic

\[  Z_{k} = \hat{\theta }_ k \sqrt {I_ k} = ({\overline{y}}_{k} - \mu _{0}) \sqrt {I_{k}} \sim N \left( \,  \theta \sqrt {I_{k}}, \,  1 \right)  \]
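The quantities above can be traced numerically with a small sketch. It is only an illustration in Python, not part of the SEQDESIGN procedure: the function name `stagewise_z`, the data values, and the assumption that $\sigma$ is known are all hypothetical.

```python
import math

def stagewise_z(stages, mu0, sigma):
    """Return (N_k, ybar_k, I_k, Z_k) for each stage.

    `stages` holds the NEW observations collected at each stage; the
    statistics at stage k use all observations accumulated so far.
    """
    results = []
    cumulative = []
    for new_obs in stages:
        cumulative.extend(new_obs)
        n_k = len(cumulative)                       # cumulative sample size N_k
        ybar_k = sum(cumulative) / n_k              # sample mean at stage k
        info_k = n_k / sigma ** 2                   # information I_k = N_k / sigma^2
        z_k = (ybar_k - mu0) * math.sqrt(info_k)    # standardized statistic Z_k
        results.append((n_k, ybar_k, info_k, z_k))
    return results

# Hypothetical two-stage data with mu0 = 2 and known sigma = 1.
stages = [[1.0, 3.0, 2.0, 2.0], [3.0, 3.0, 2.0, 4.0]]
results = stagewise_z(stages, mu0=2.0, sigma=1.0)
for n_k, ybar_k, info_k, z_k in results:
    print(n_k, ybar_k, info_k, round(z_k, 4))
```

With these made-up values, the first stage gives a sample mean equal to $\mu_0$ and therefore $Z_1 = 0$, and the second stage gives $Z_2 = 0.5 \sqrt{8}$.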

Thus, to test the hypothesis $H_{0}: \theta =0$ against a two-sided alternative $H_{1}: \theta =\theta _1$, $H_{0}$ is rejected at stage k if the statistic $Z_{k}$ is less than or equal to the lower $\alpha $ boundary value or if $Z_{k}$ is greater than or equal to the upper $\alpha $ boundary value at stage k.
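As a hedged illustration of this rejection rule, the following Python sketch compares a stage statistic with a pair of boundary values. The helper `two_sided_reject` and the boundary values are hypothetical placeholders; actual boundaries depend on the design and are not taken from this section.

```python
def two_sided_reject(z_k, lower_alpha, upper_alpha):
    """Return True if H0 is rejected at this stage.

    H0 is rejected when Z_k is at or below the lower alpha boundary
    or at or above the upper alpha boundary.
    """
    return z_k <= lower_alpha or z_k >= upper_alpha

# Hypothetical boundary values for one stage (not computed by this sketch).
lower_alpha, upper_alpha = -2.8, 2.8

print(two_sided_reject(1.41, lower_alpha, upper_alpha))   # inside the boundaries
print(two_sided_reject(-3.05, lower_alpha, upper_alpha))  # below the lower boundary
```

The comparison is inclusive on both sides, matching the "less than or equal to" and "greater than or equal to" wording of the rule.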

If the variance ${\sigma }^{2}$ is unknown, the sample variance can be used in its place, provided that the sample is large enough for the test statistic to have an approximately normal distribution.

The maximum information is needed to derive the required sample size. If the maximum information is not specified or derived with the ALTREF= option in the procedure, the MEAN=$\theta _1$ option in the SAMPLESIZE statement is used to specify the alternative reference and thus to derive the maximum information.

In the SEQDESIGN procedure, the computed total sample size is

\[  N_{K} = {\sigma }^{2} \, \,  I_{X}  \]

where $I_{X}$ is the maximum information and ${\sigma }$ is the specified standard deviation. When the maximum information is available, you can specify the MODEL=ONESAMPLEMEAN(STDDEV=${\sigma }$) option in the SAMPLESIZE statement to compute the required total sample size and the individual sample size at each stage. A procedure such as PROC MEANS can be used to derive a one-sample Z test for a normal mean.
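The arithmetic behind the total sample size can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure's algorithm: the information value used here is the standard fixed-sample quantity $((z_{1-\alpha /2} + z_{1-\beta }) / \theta _1)^2$, used only as a stand-in, whereas the maximum information of a group sequential design is generally larger and depends on the design. The function names, the numeric inputs, and the equally spaced information fractions are hypothetical.

```python
from statistics import NormalDist

def fixed_sample_information(theta1, alpha, beta, two_sided=True):
    """Information for a fixed-sample Z test (stand-in for the maximum information)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(1 - beta)
    return ((z_alpha + z_beta) / theta1) ** 2

def one_sample_mean_sizes(max_info, sigma, info_fractions):
    """Total size N = sigma^2 * I_X and cumulative sizes at each stage."""
    n_total = sigma ** 2 * max_info
    # Because I_k = N_k / sigma^2, the cumulative size at stage k is the
    # information fraction at that stage times the total size.
    return n_total, [f * n_total for f in info_fractions]

# Hypothetical inputs: theta1 = 0.5, sigma = 2, alpha = 0.05, beta = 0.10.
info = fixed_sample_information(theta1=0.5, alpha=0.05, beta=0.10)
n_total, stage_sizes = one_sample_mean_sizes(info, sigma=2.0,
                                             info_fractions=[0.25, 0.5, 0.75, 1.0])
print(round(info, 2), round(n_total, 2), [round(n, 2) for n in stage_sizes])
```

The sizes are left fractional in this sketch. In practice they are rounded up to whole observations.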

Test for a Binomial Proportion

The MODEL=ONESAMPLEFREQ option in the SAMPLESIZE statement derives the sample size required to test a binomial proportion by using the null hypothesis $p= p_{0}$, where p is the proportion of a binomial population. At stage k, the MLE for p is computed as

\[  \hat{p}_{k} = \frac{1}{N_{k}} \sum _{j=1}^{N_{k}} {{y}_{kj}}  \]

where ${y}_{kj}$ is the value of the jth observation available in the kth stage and $N_{k}$ is the cumulative sample size at stage k.

An equivalent hypothesis is $H_{0}: \theta = 0$, where $\theta = p - p_{0}$. If $p_{0}$ is not close to 0 or 1, then for a large sample, $\hat{\theta }_ k= \hat{p}_{k} - p_{0}$ has an approximately normal distribution

\[  \hat{\theta }_ k \sim N \left( \,  \theta , \,  I^{-1}_{k} \right)  \]

where the information $I_{k}= ( p \,  (1-p) \,  / \,  N_{k} )^{-1}$ is the inverse of the variance $\mathrm{Var}(\hat{\theta }_{k})$.

Then the standardized statistic

\[  Z_{k}= \hat{\theta }_ k \sqrt {I_{k}} \sim N \left( \,  \theta \sqrt {I_{k}}, \,  1 \right)  \]

In practice, the estimated sample proportion $\hat{p}_{k}$ at stage k can be used to derive the information $I_{k}$ and the test statistic $Z_{k}$. Thus, to test the hypothesis $H_{0}$ against an upper alternative $H_{1}: \theta = \theta _1 > 0$, $H_{0}$ is rejected at stage k if the statistic $Z_{k}$ is greater than or equal to the upper $\alpha $ boundary at stage k.
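A minimal Python sketch of this computation follows, assuming the estimated proportion is plugged into the information as described. The function name `binomial_z`, the counts, and the upper boundary value are hypothetical.

```python
import math

def binomial_z(successes, n_k, p0):
    """Return (p_hat_k, I_k, Z_k) using the estimated proportion in I_k."""
    p_hat = successes / n_k
    if p_hat in (0.0, 1.0):
        # The normal approximation breaks down and I_k is undefined.
        raise ValueError("estimated proportion must be strictly between 0 and 1")
    info_k = n_k / (p_hat * (1 - p_hat))      # I_k = (p(1-p)/N_k)^(-1)
    z_k = (p_hat - p0) * math.sqrt(info_k)    # Z_k = theta_hat_k * sqrt(I_k)
    return p_hat, info_k, z_k

# Hypothetical stage: 60 successes out of 100 cumulative observations, p0 = 0.5.
p_hat, info_k, z_k = binomial_z(successes=60, n_k=100, p0=0.5)

upper_alpha = 2.5  # hypothetical upper alpha boundary for this stage
reject = z_k >= upper_alpha
print(round(p_hat, 3), round(info_k, 2), round(z_k, 3), reject)
```

With these made-up counts, $Z_{k}$ is a little above 2, so the hypothetical boundary of 2.5 is not crossed at this stage.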

The maximum information $I_{X}$ is needed to derive the required sample size. If the maximum information is not specified or derived with the ALTREF= option in the procedure, the PROP= option in the SAMPLESIZE statement is used to specify the alternative reference and to derive the maximum information for the sample size calculation.

It is assumed that the sample size is sufficiently large such that the test statistic has an approximately normal distribution. With the hypotheses $H_{0}: p = p_{0}$ and $H_{1}: p = p_{1}$, the SEQDESIGN procedure derives the total sample size

\[  N_{K} = p^{*} \,  (1-p^{*}) \,  I_{X}  \]

where $p^{*}= p_{0}$ if REF=NULLPROP is specified. Otherwise, $p^{*}= p_{1}$.

If the PROP= option in the SAMPLESIZE statement is not specified, then the alternative reference $\theta _1$ derived in the SEQDESIGN procedure is used to compute $p_{1}= p_{0} + \theta _1$.

The ALTREF= option in the PROC statement can be used to specify $\theta _{1}$. Otherwise, the PROP= option in the SAMPLESIZE statement must be specified.

For example, with $H_{0}: p= 0.5$, $H_{1}: p= 0.6$, and REF=PROP (which is the default),

\[  N_{K} = p^{*} (1-p^{*}) \,  I_{X} = (0.6 \times 0.4) \,  I_{X} = 0.24 \,  I_{X}  \]
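The effect of the reference proportion on the total sample size can be sketched in Python. This is an illustration only: `one_sample_freq_size` is a hypothetical helper, and the maximum information of 1050 is an assumed value chosen for round numbers, not a value produced by the procedure.

```python
def one_sample_freq_size(max_info, p0, p1=None, theta1=None, ref="PROP"):
    """Total sample size N = p*(1 - p*) * I_X for a one-sample proportion test.

    If p1 is not given, it is derived from the alternative reference
    as p1 = p0 + theta1.
    """
    if p1 is None:
        if theta1 is None:
            raise ValueError("either p1 or theta1 must be supplied")
        p1 = p0 + theta1
    # NULLPROP uses the null proportion; otherwise the alternative is used.
    p_star = p0 if ref.upper() == "NULLPROP" else p1
    return p_star * (1 - p_star) * max_info

max_info = 1050.0  # hypothetical maximum information I_X

n_prop = one_sample_freq_size(max_info, p0=0.5, p1=0.6, ref="PROP")
n_null = one_sample_freq_size(max_info, p0=0.5, p1=0.6, ref="NULLPROP")
n_from_theta = one_sample_freq_size(max_info, p0=0.5, theta1=0.1)
print(round(n_prop, 2), round(n_null, 2), round(n_from_theta, 2))
```

Under these assumptions the alternative proportion gives $0.24 \, I_{X}$, or about 252, while the null proportion gives $0.25 \, I_{X}$, or about 262.5. The null reference is slightly more conservative here because $p(1-p)$ is largest at $p = 0.5$.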

You can specify the MODEL=ONESAMPLEFREQ option in the SAMPLESIZE statement to compute the required total sample size and individual sample size at each stage. A procedure such as PROC GENMOD with the default DIST=NORMAL option in the MODEL statement can be used to derive the Z test for a binomial proportion.