The SURVEYSELECT Procedure

Proportional Allocation

When you specify the ALLOC=PROPORTIONAL option in the STRATA statement, PROC SURVEYSELECT allocates the total sample size among the strata in proportion to the stratum sizes, where the stratum size is the number of sampling units in the stratum. The allocation proportion of the total sample size for stratum h is

\[ f_h^{*} = N_h / N \]

where $N_h$ is the number of sampling units in stratum $h$ and $N$ is the total number of sampling units for all strata. If you specify the total sample size $n$ in the SAMPSIZE= option in the PROC SURVEYSELECT statement, the procedure computes the target sample size for stratum $h$ as

\[ n_h^{*} = f_h^{*} \times n \]
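The two formulas above can be checked with a few lines of arithmetic. The following Python sketch is only an illustration, not part of the procedure: the function name `proportional_targets`, the stratum labels, and the stratum sizes are hypothetical. It uses exact fractions so that the non-integer targets are visible without floating-point noise.

```python
from fractions import Fraction

def proportional_targets(stratum_sizes, n):
    """Return {stratum: (f_h*, n_h*)} for stratum sizes N_h and total sample size n."""
    N = sum(stratum_sizes.values())          # total number of sampling units
    targets = {}
    for h, N_h in stratum_sizes.items():
        f_star = Fraction(N_h, N)            # allocation proportion f_h* = N_h / N
        targets[h] = (f_star, f_star * n)    # target sample size n_h* = f_h* * n
    return targets

# Hypothetical stratum sizes N_h and total sample size n.
sizes = {"A": 500, "B": 300, "C": 200}
for h, (f_star, n_star) in proportional_targets(sizes, n=25).items():
    print(h, float(f_star), float(n_star))
```

With these assumed inputs the targets for strata A and B are 12.5 and 7.5, which are not integers, so they cannot be used directly as stratum sample sizes.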

The target sample sizes $n_h^{*}$ might not be integers, but the stratum sample sizes must be integers. PROC SURVEYSELECT uses a rounding algorithm to convert the $n_h^{*}$ to integer values $n_h$ while maintaining the requested total sample size $n$. The rounding algorithm requires every $n_h$ to be at least 1, so that at least one unit is selected from each stratum. If you specify a minimum stratum sample size $n_{\mathit{min}}$ in the ALLOCMIN= option in the STRATA statement, every $n_h$ must be at least $n_{\mathit{min}}$.

For without-replacement selection methods, PROC SURVEYSELECT also requires that each stratum sample size not exceed the number of sampling units in the stratum, $n_h \leq N_h$. If a target stratum sample size exceeds the number of units in the stratum, PROC SURVEYSELECT allocates the maximum number of units, $N_h$, to that stratum and then allocates the remaining total sample size proportionally among the remaining strata.
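The constraints described above can be made concrete with a small sketch. The text does not specify the rounding algorithm that PROC SURVEYSELECT uses, so the Python code below substitutes a common technique, largest-remainder rounding, combined with a "pin to the bound and re-spread" loop. The function name `allocate_proportional`, its parameters, the tie-breaking rule, and the input checks are all assumptions made for illustration; the results need not match the procedure's output.

```python
from fractions import Fraction
import math

def allocate_proportional(stratum_sizes, n, n_min=1, without_replacement=True):
    """Illustrative integer allocation; NOT the procedure's actual rounding algorithm."""
    if n < n_min * len(stratum_sizes):
        raise ValueError("n is too small to give every stratum n_min units")
    if without_replacement:
        if n > sum(stratum_sizes.values()):
            raise ValueError("n exceeds the total number of sampling units")
        if any(N_h < n_min for N_h in stratum_sizes.values()):
            # Simplification of this sketch: such inputs are rejected.
            raise ValueError("a stratum has fewer than n_min units")

    fixed = {}                     # strata pinned to a bound
    active = dict(stratum_sizes)   # strata still allocated proportionally
    remaining = n                  # sample size left for the active strata
    targets = {}
    while active:
        N_active = sum(active.values())
        targets = {h: Fraction(remaining * N_h, N_active)
                   for h, N_h in active.items()}
        # Cap rule from the text. For purely proportional targets with n <= N
        # this does not trigger (N_h * n / N <= N_h); kept to mirror the rule.
        over = [h for h in active if without_replacement and targets[h] > active[h]]
        under = [h for h in active if targets[h] < n_min]
        if over:
            pinned = {h: active[h] for h in over}
        elif under:
            pinned = {h: n_min for h in under}
        else:
            break
        for h, n_h in pinned.items():
            fixed[h] = n_h
            remaining -= n_h
            del active[h]

    # Largest-remainder rounding: floor every target, then give the leftover
    # units to the strata with the largest fractional parts (ties: input order).
    alloc = {h: math.floor(t) for h, t in targets.items()}
    leftover = remaining - sum(alloc.values())
    by_remainder = sorted(targets, key=lambda h: targets[h] - alloc[h], reverse=True)
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    alloc.update(fixed)
    return {h: alloc[h] for h in stratum_sizes}

# Hypothetical inputs: stratum C would get a target of 0.5, below the minimum of 1.
print(allocate_proportional({"A": 700, "B": 250, "C": 50}, n=10))
```

In this sketch, stratum C is first raised to the minimum of 1, and the remaining 9 units are then spread proportionally over A and B before rounding, so the allocated sizes still sum to the requested total.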

PROC SURVEYSELECT provides the target allocation proportions $f_h^{*}$ in the output data set variable AllocProportion. The variable ActualProportion contains the actual proportions for the allocated sample sizes $n_h$. For stratum $h$, the actual proportion is computed as

\[ f_h = n_h / n \]

where $n_h$ is the allocated sample size for stratum $h$ and $n$ is the total sample size. The actual proportions $f_h$ can differ from the target allocation proportions $f_h^{*}$ because of rounding, the requirement that $n_h \geq 1$ (or $n_h \geq n_{\mathit{min}}$), and the requirement that $n_h \leq N_h$ for without-replacement selection methods.
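To see how the two proportions can diverge, the following Python sketch computes both for a hypothetical allocation. The function name `proportion_table`, the stratum sizes, and the allocated sample sizes are assumptions chosen for illustration; only the dictionary keys mirror the output variable names AllocProportion and ActualProportion.

```python
def proportion_table(stratum_sizes, allocated):
    """Return target (f_h*) and actual (f_h) proportions for each stratum."""
    N = sum(stratum_sizes.values())   # total number of sampling units
    n = sum(allocated.values())       # total allocated sample size
    return {
        h: {
            "AllocProportion": stratum_sizes[h] / N,   # f_h* = N_h / N
            "ActualProportion": allocated[h] / n,      # f_h  = n_h / n
        }
        for h in stratum_sizes
    }

# Hypothetical stratum sizes N_h and integer allocation n_h (each n_h >= 1).
sizes = {"A": 700, "B": 250, "C": 50}
allocated = {"A": 7, "B": 2, "C": 1}
for h, row in proportion_table(sizes, allocated).items():
    print(h, row["AllocProportion"], row["ActualProportion"])
```

In this assumed example, stratum C has a target proportion of 0.05 but an actual proportion of 0.1, because its target sample size of 0.5 was raised to the minimum of one unit.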