The SURVEYSELECT Procedure

Example 99.4 Proportional Allocation

This example uses the Customers data set from the section Getting Started: SURVEYSELECT Procedure. The data set Customers contains an Internet service provider’s current subscribers, and the service provider wants to select a sample from this population for a customer satisfaction survey. This example illustrates proportional allocation, which allocates the total sample size among the strata in proportion to the stratum sizes.

The section Getting Started: SURVEYSELECT Procedure gives an example of stratified sampling, where the list of customers is stratified by State and Type. Figure 99.4 displays the strata in a table of State by Type for the 13,471 customers. There are four states and two levels of Type, forming a total of eight strata. A sample of 15 customers was selected from each stratum by using the following PROC SURVEYSELECT statements:

title1 'Customer Satisfaction Survey';
title2 'Stratified Sampling';
proc surveyselect data=Customers method=srs n=15
                  seed=1953 out=SampleStrata;
   strata State Type;
run;

The STRATA statement names the stratification variables State and Type. In the PROC SURVEYSELECT statement, the N= option specifies a sample size of 15 customers for each stratum.
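As a minimal sketch of the preparation this step relies on: the STRATA statement processes the input data set in groups, so the statements below assume that Customers is first sorted by State and Type and then tabulate the strata with PROC FREQ to show the stratum counts. The table options are only one way to suppress the percentages.

```sas
/* Sketch: sort by the strata variables, then tabulate the strata. */
proc sort data=Customers;
   by State Type;
run;

proc freq data=Customers;
   tables State*Type / norow nocol nopercent;   /* frequency counts only */
run;
```

If Customers is already sorted by State and Type, the PROC SORT step leaves the order unchanged.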

Instead of specifying the number of customers to select from each stratum, you can specify the total sample size and request allocation of the total sample size among the strata. The following PROC SURVEYSELECT statements request proportional allocation, which allocates the total sample size in proportion to the stratum sizes:

title1 'Customer Satisfaction Survey';
title2 'Proportional Allocation';
proc surveyselect data=Customers n=1000
                  out=SampleSizes;
   strata State Type / alloc=prop nosample;
run;

The STRATA statement names the stratification variables State and Type. In the STRATA statement, the ALLOC=PROP option requests proportional allocation. The NOSAMPLE option requests that no sample be selected after the procedure computes the sample size allocation. In the PROC SURVEYSELECT statement, the N= option specifies a total sample size of 1000 customers to be allocated among the strata.
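If you omit the NOSAMPLE option, the procedure should go on to select the sample after it computes the allocation. The following is a sketch of that one-step variant, not part of the original example. The output data set name SampleProp is hypothetical, and the METHOD= and SEED= values are carried over from the earlier stratified sampling statements as an assumption.

```sas
/* Sketch: allocate proportionally and select the sample in one step. */
title1 'Customer Satisfaction Survey';
title2 'Proportional Allocation and Selection';
proc surveyselect data=Customers method=srs n=1000
                  seed=1953 out=SampleProp;   /* SampleProp is a hypothetical name */
   strata State Type / alloc=prop;            /* no NOSAMPLE: a sample is selected */
run;
```

Keeping NOSAMPLE, as in the example above, is useful when you want to review the allocation before you draw the sample.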

Output 99.4.1 displays the output from PROC SURVEYSELECT, which summarizes the sample allocation. The total sample size of 1000 is allocated among the eight strata by using proportional allocation. The allocated sample sizes are stored in the SAS data set SampleSizes.

Output 99.4.1: Proportional Allocation Summary

Customer Satisfaction Survey
Proportional Allocation

The SURVEYSELECT Procedure

Allocation                    Proportional
Strata Variables              State
                              Type

Input Data Set                CUSTOMERS
Number of Strata              8
Total Sample Size             1000
Allocation Output Data Set    SAMPLESIZES


The following PROC PRINT statements display the allocation output data set SampleSizes, which is shown in Output 99.4.2:

title1 'Customer Satisfaction Survey';
title2 'Proportional Allocation';
proc print data=SampleSizes;
run;

Output 99.4.2: Stratum Sample Sizes

Customer Satisfaction Survey
Proportional Allocation

Obs  State  Type  Total  AllocProportion  SampleSize  ActualProportion
  1  AL     New    1238          0.09190          92             0.092
  2  AL     Old     706          0.05241          52             0.052
  3  FL     New    2170          0.16109         161             0.161
  4  FL     Old    1370          0.10170         102             0.102
  5  GA     New    3488          0.25893         259             0.259
  6  GA     Old    1940          0.14401         144             0.144
  7  SC     New    1684          0.12501         125             0.125
  8  SC     Old     875          0.06495          65             0.065


The output data set SampleSizes includes one observation for each of the eight strata, which are identified by the stratification variables State and Type. The variable Total contains the number of sampling units in the stratum, and the variable AllocProportion contains the proportion of the total sample size to allocate to the stratum. The variable SampleSize contains the allocated stratum sample size. For the first stratum (State='AL' and Type='New'), the total number of sampling units is 1238 customers, the allocation proportion is 0.09190, and the allocated sample size is 92 customers. The sum of the allocated sample sizes equals the requested total sample size of 1000 customers.
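Under proportional allocation, each allocation proportion is the stratum size divided by the population size. For the first stratum, that is 1238 / 13,471, or approximately 0.09190. The following PROC SQL statements are a rough sketch of how you might recompute the proportions and confirm the totals. The column names CheckProportion, TotalSampleSize, and PopulationSize are introduced here for illustration only.

```sas
/* Sketch: recompute the allocation proportions and check the totals. */
proc sql;
   /* Stratum size divided by population size (summary value is remerged) */
   select State, Type, Total,
          Total / sum(Total) as CheckProportion format=7.5,
          AllocProportion format=7.5,
          SampleSize
      from SampleSizes
      order by State, Type;

   /* Total allocated sample size and total population size */
   select sum(SampleSize) as TotalSampleSize,
          sum(Total)      as PopulationSize
      from SampleSizes;
quit;
```

The first query causes PROC SQL to write a note to the log about remerging summary statistics, which is expected for this kind of calculation.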

The output data set also includes the variable ActualProportion, which contains the actual stratum proportions of the total sample size. The actual proportion for a stratum equals the stratum sample size divided by the total sample size. For the first stratum (State='AL' and Type='New'), the actual proportion is 0.092, while the allocation proportion is 0.09190. The target sample sizes computed from the allocation proportions are often not integers, so PROC SURVEYSELECT uses a rounding algorithm to obtain integer sample sizes while maintaining the requested total sample size. Because of rounding and other restrictions, the actual proportions can differ from the target allocation proportions. See the section Sample Size Allocation for details.
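To see the effect of rounding, you can compare each allocated sample size with its unrounded target. For the first stratum, the target is about 1000 × 0.09190 = 91.9 customers, and the allocated size is 92. The following DATA step is a sketch only. The data set CheckRounding and the variables TargetSize, Difference, and CheckActual are hypothetical names, and the total sample size of 1000 is written in directly.

```sas
/* Sketch: compare the integer allocation with the unrounded targets. */
data CheckRounding;
   set SampleSizes;
   TargetSize  = 1000 * AllocProportion;   /* unrounded target sample size */
   Difference  = SampleSize - TargetSize;  /* change caused by rounding    */
   CheckActual = SampleSize / 1000;        /* should match ActualProportion */
run;

proc print data=CheckRounding;
   var State Type AllocProportion TargetSize SampleSize
       Difference ActualProportion CheckActual;
run;
```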

If you want to use the allocated sample sizes in a later invocation of PROC SURVEYSELECT, you can name the allocation data set in the N=SAS-data-set option, as shown in the following PROC SURVEYSELECT statements:

title1 'Customer Satisfaction Survey';
title2 'Stratified Sampling';
proc surveyselect data=Customers method=srs n=SampleSizes
                  seed=1953 out=SampleStrata;
   strata State Type;
run;
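Here the stratum sample sizes are read from the variable SampleSize in the data set SampleSizes, whose observations are in the same State and Type order as the strata in Customers. As a quick check, sketched below under the assumption that the preceding step ran successfully, you can tabulate the selected sample by stratum and compare the counts with the allocated sample sizes in Output 99.4.2.

```sas
/* Sketch: count the selected customers in each stratum. */
proc freq data=SampleStrata;
   tables State*Type / norow nocol nopercent;   /* frequency counts only */
run;
```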