The SURVEYSELECT Procedure

Example 115.4 Proportional Allocation

This example uses the `Customers` data set from the section Getting Started: SURVEYSELECT Procedure. The data set `Customers` contains an Internet service provider’s current subscribers, and the service provider wants to select a sample from this population for a customer satisfaction survey. This example illustrates proportional allocation, which allocates the total sample size among the strata in proportion to the stratum sizes.

The section Getting Started: SURVEYSELECT Procedure gives an example of stratified sampling, where the list of customers is stratified by `State` and `Type`. Figure 115.4 displays the strata in a table of `State` by `Type` for the 13,471 customers. There are four states and two levels of `Type`, forming a total of eight strata. A sample of 15 customers was selected from each stratum by using the following PROC SURVEYSELECT statements:

```sas
title1 'Customer Satisfaction Survey';
title2 'Stratified Sampling';
proc surveyselect data=Customers method=srs n=15
                  seed=1953 out=SampleStrata;
   strata State Type;
run;
```

The STRATA statement names the stratification variables `State` and `Type`. In the PROC SURVEYSELECT statement, the N= option specifies a sample size of 15 customers in each stratum.

Instead of specifying the number of customers to select from each stratum, you can specify the total sample size and request allocation of the total sample size among the strata. The following PROC SURVEYSELECT statements request proportional allocation, which allocates the total sample size in proportion to the stratum sizes:

```sas
title1 'Customer Satisfaction Survey';
title2 'Proportional Allocation';
proc surveyselect data=Customers n=1000
                  out=SampleSizes;
   strata State Type / alloc=prop nosample;
run;
```

The STRATA statement names the stratification variables `State` and `Type`. In the STRATA statement, the ALLOC=PROP option requests proportional allocation. The NOSAMPLE option requests that no sample be selected after the procedure computes the sample size allocation. In the PROC SURVEYSELECT statement, the N= option specifies a total sample size of 1000 customers to be allocated among the strata.
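For comparison, the following sketch drops the NOSAMPLE option so that the allocation and the sample selection happen in one step. It assumes that `Customers` is already sorted by `State` and `Type`, as the STRATA statement requires, and that omitting NOSAMPLE makes the procedure select the sample from the allocated stratum sizes. The output data set name `SampleProp` and the reuse of the seed 1953 are illustrative choices, not part of the original example.

```sas
title1 'Customer Satisfaction Survey';
title2 'Proportional Allocation and Selection';

/* SampleProp is a hypothetical output name used only in this sketch. */
proc surveyselect data=Customers method=srs n=1000
                  seed=1953 out=SampleProp;
   /* NOSAMPLE is omitted, so the sample is selected after allocation. */
   strata State Type / alloc=prop;
run;
```

Keeping NOSAMPLE, as the example does, is useful when you want to inspect or adjust the allocation before you draw the sample.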

Output 115.4.1 displays the output from PROC SURVEYSELECT, which summarizes the sample allocation. The total sample size of 1000 is allocated among the eight strata by using proportional allocation. The allocated sample sizes are stored in the SAS data set `SampleSizes`.

Output 115.4.1: Proportional Allocation Summary

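Customer Satisfaction Survey

Proportional Allocation

The SURVEYSELECT Procedure

| Item | Value |
|---|---|
| Allocation | Proportional |
| Strata Variables | State, Type |
| Input Data Set | CUSTOMERS |
| Number of Strata | 8 |
| Total Sample Size | 1000 |
| Allocation Output Data Set | SAMPLESIZES |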

The following PROC PRINT statements display the allocation output data set `SampleSizes`, which is shown in Output 115.4.2:

```sas
title1 'Customer Satisfaction Survey';
title2 'Proportional Allocation';
proc print data=SampleSizes;
run;
```

Output 115.4.2: Stratum Sample Sizes

Customer Satisfaction Survey

Proportional Allocation

| Obs | State | Type | Total | AllocProportion | SampleSize | ActualProportion |
|---|---|---|---|---|---|---|
| 1 | AL | New | 1238 | 0.09190 | 92 | 0.092 |
| 2 | AL | Old | 706 | 0.05241 | 52 | 0.052 |
| 3 | FL | New | 2170 | 0.16109 | 161 | 0.161 |
| 4 | FL | Old | 1370 | 0.10170 | 102 | 0.102 |
| 5 | GA | New | 3488 | 0.25893 | 259 | 0.259 |
| 6 | GA | Old | 1940 | 0.14401 | 144 | 0.144 |
| 7 | SC | New | 1684 | 0.12501 | 125 | 0.125 |
| 8 | SC | Old | 875 | 0.06495 | 65 | 0.065 |

The output data set `SampleSizes` includes one observation for each of the eight strata, which are identified by the stratification variables `State` and `Type`. The variable `Total` contains the number of sampling units in the stratum, and the variable `AllocProportion` contains the proportion of the total sample size to allocate to the stratum. The variable `SampleSize` contains the allocated stratum sample size. For the first stratum (`State`='AL' and `Type`='New'), the total number of sampling units is 1238 customers, the allocation proportion is 0.09190, and the allocated sample size is 92 customers. The sum of the allocated sample sizes equals the requested total sample size of 1000 customers.
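The figures for the first stratum can be checked by hand. The symbols below are introduced here only for illustration: $N_h$ is the stratum size (`Total`), $N$ is the population size of 13,471 customers, $n$ is the requested total sample size, and $p_h$ is the allocation proportion. The check assumes that the allocation proportion is simply the stratum's share of the population.

$$
p_h = \frac{N_h}{N} = \frac{1238}{13471} \approx 0.09190,
\qquad
n \, p_h = 1000 \times 0.09190 \approx 91.90 \;\longrightarrow\; 92
$$

The unrounded target of about 91.90 customers becomes the integer stratum sample size of 92.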

The output data set also includes the variable `ActualProportion`, which contains actual stratum proportions of the total sample size. The actual proportion for a stratum is the stratum sample size divided by the total sample size. For the first stratum (`State`='AL' and `Type`='New'), the actual proportion is 0.092, whereas the allocation proportion is 0.09190. The target sample sizes computed from the allocation proportions are often not integers, and PROC SURVEYSELECT uses a rounding algorithm to obtain integer sample sizes and maintain the requested total sample size. Because of rounding and other restrictions, the actual proportions can differ from the target allocation proportions. For more information, see the section Sample Size Allocation.
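One way to see the effect of rounding is to compare each unrounded target with the allocated integer size. The following statements are a minimal sketch: the data set name `AllocCheck` and the variables `TargetSize` and `Difference` are hypothetical, and the constant 1000 is taken from the N= value in this example.

```sas
/* Compare unrounded targets with the allocated integer sample sizes. */
data AllocCheck;                           /* hypothetical data set name */
   set SampleSizes;
   TargetSize = 1000 * AllocProportion;    /* unrounded target size */
   Difference = SampleSize - TargetSize;   /* effect of rounding */
run;

proc print data=AllocCheck;
   var State Type TargetSize SampleSize Difference;
run;

/* Confirm that the stratum totals and allocated sizes add up as expected. */
proc means data=SampleSizes sum;
   var Total SampleSize;
run;
```

For the data shown in Output 115.4.2, the sums are 13,471 for `Total` and 1000 for `SampleSize`.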

To use the allocated sample sizes in a later invocation of PROC SURVEYSELECT, you can name the allocation data set in the N=SAS-data-set option, as shown in the following PROC SURVEYSELECT statements:

```title1 'Customer Satisfaction Survey';
title2 'Stratified Sampling';
proc surveyselect data=Customers method=srs n=SampleSizes
seed=1953 out=SampleStrata;
strata State Type;
run;
```
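After the sample is selected, a quick cross-tabulation can confirm that each stratum received its allocated number of customers. This is a sketch that assumes the output data set `SampleStrata` from the preceding step contains one observation per selected customer; the second title is an illustrative choice.

```sas
title1 'Customer Satisfaction Survey';
title2 'Selected Sample by Stratum';

/* Count the selected customers in each State-by-Type stratum. */
proc freq data=SampleStrata;
   tables State*Type / norow nocol nopercent;
run;
```

The cell counts should match the `SampleSize` values in the allocation data set `SampleSizes`.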