The SURVEYSELECT Procedure

Optimal Allocation

When you specify the ALLOC=OPTIMAL option in the STRATA statement, PROC SURVEYSELECT allocates the total sample size among the strata according to stratum sizes, stratum variances, and stratum costs: a stratum receives a larger share of the sample when it is larger, more variable, or less costly to sample. You provide the stratum costs and variances by using the COST= and VAR= options, respectively.

Optimal allocation minimizes the overall variance for a specified cost, or equivalently minimizes the overall cost for a specified variance. For details, see Lohr (2010), Cochran (1977), and Kish (1965). For optimal allocation, PROC SURVEYSELECT computes the proportion of the total sample size for stratum h as

\[  f_h^{*} = \frac{N_h S_h / \sqrt{C_h}}{\sum_{i=1}^{H} N_i S_i / \sqrt{C_i}}  \]

where $N_h$ is the number of sampling units in stratum h, $S_h$ is the standard deviation within stratum h (the square root of the stratum variance), $C_h$ is the unit cost within stratum h, and H is the total number of strata.

If you specify the total sample size n in the SAMPSIZE= option in the PROC SURVEYSELECT statement, the procedure computes the target sample size for stratum h as

\[  n_h^{*} = f_h^{*} \times n  \]
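
The two formulas above can be checked by hand with a short calculation. The following Python sketch is only an illustration of the arithmetic, not the procedure's implementation; the function names and the three strata (sizes, variances, and costs) are hypothetical values chosen for this example. It takes stratum variances as input, as the VAR= option does, and uses their square roots as $S_h$.

```python
from math import sqrt

def optimal_fractions(sizes, variances, costs):
    """Return f_h* = (N_h S_h / sqrt(C_h)) / sum_i (N_i S_i / sqrt(C_i)).

    S_h is taken as the square root of the supplied stratum variance.
    """
    weights = [
        n_h * sqrt(var_h) / sqrt(c_h)
        for n_h, var_h, c_h in zip(sizes, variances, costs)
    ]
    total = sum(weights)
    return [w / total for w in weights]

def target_sizes(fractions, n):
    """Return the unrounded stratum targets n_h* = f_h* * n."""
    return [f * n for f in fractions]

# Hypothetical strata (illustrative values only)
sizes = [1000, 2000, 3000]       # N_h
variances = [100.0, 25.0, 4.0]   # stratum variances, so S_h = 10, 5, 2
costs = [1.0, 4.0, 4.0]          # C_h

fractions = optimal_fractions(sizes, variances, costs)
targets = target_sizes(fractions, n=180)
print([round(f, 4) for f in fractions])
print([round(t, 2) for t in targets])
```

For these values the terms $N_h S_h / \sqrt{C_h}$ are 10000, 5000, and 3000, so the first stratum receives 10000/18000 of the sample even though it is the smallest, because it is the most variable and the cheapest to sample. With n = 180 the targets are 100, 50, and 30.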

As described in the section Proportional Allocation, the values of $n_h^{*}$ are converted to integer sample sizes $n_h$ by using a rounding algorithm that requires the sum of the stratum sample sizes to equal n. The final stratum sample sizes $n_h$ are also required to be at least 1, or at least $n_{\mathrm{min}}$ if you specify a minimum stratum sample size in the ALLOCMIN= option in the STRATA statement. For without-replacement selection methods, the final sample sizes cannot exceed the stratum sizes.
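
To make the three constraints concrete (the sizes sum to n, each is at least the minimum, and none exceeds its stratum size), the following Python sketch rounds a set of targets with a simple largest-remainder adjustment. This is an assumed, illustrative scheme and not the rounding algorithm that PROC SURVEYSELECT uses; the function name `round_allocation` and its parameters are hypothetical.

```python
from math import floor

def round_allocation(targets, n, stratum_sizes=None, n_min=1):
    """Convert real-valued targets n_h* to integers n_h that sum to n.

    Illustrative largest-remainder scheme, NOT the procedure's algorithm.
    Enforces n_h >= n_min, and n_h <= N_h when stratum_sizes is given
    (the without-replacement case).
    """
    H = len(targets)
    upper = list(stratum_sizes) if stratum_sizes is not None else [n] * H
    if n_min * H > n or sum(upper) < n or any(u < n_min for u in upper):
        raise ValueError("constraints cannot be satisfied")

    # Start from the floored targets, clamped into [n_min, upper bound].
    alloc = [min(max(floor(t), n_min), u) for t, u in zip(targets, upper)]

    # Move one unit at a time until the total equals n.
    while sum(alloc) != n:
        if sum(alloc) < n:
            # Add a unit to the eligible stratum furthest below its target.
            eligible = [h for h in range(H) if alloc[h] < upper[h]]
            h = max(eligible, key=lambda i: targets[i] - alloc[i])
            alloc[h] += 1
        else:
            # Remove a unit from the eligible stratum furthest above its target.
            eligible = [h for h in range(H) if alloc[h] > n_min]
            h = max(eligible, key=lambda i: alloc[i] - targets[i])
            alloc[h] -= 1
    return alloc

# Hypothetical targets that sum to n = 100
targets = [1000 / 18, 500 / 18, 300 / 18]   # about 55.56, 27.78, 16.67
print(round_allocation(targets, n=100))
```

In this example the floored targets 55, 27, and 16 sum to 98, and the two remaining units go to the strata with the largest fractional parts, which gives 55, 28, and 17. Passing `stratum_sizes` caps each stratum at its size, and `n_min` plays the role of a minimum stratum sample size.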