The next sample design for the customer satisfaction survey uses stratification by State and also control sorting by Type and Usage within State. After stratification and control sorting, customers are selected by systematic random sampling within strata. Selection
by systematic sampling, together with control sorting before selection, spreads the sample uniformly over the range of type
and usage values within each stratum (state). The following PROC SURVEYSELECT statements select a probability sample of customers from the Customers data set according to this design:
   title1 'Customer Satisfaction Survey';
   title2 'Stratified Sampling with Control Sorting';
   proc surveyselect data=Customers method=sys rate=.02
                     seed=1234 out=SampleControl;
      strata State;
      control Type Usage;
   run;
The STRATA statement names the stratification variable State. The CONTROL statement names the control variables Type and Usage. In the PROC SURVEYSELECT statement, the METHOD=SYS option requests systematic random sampling. The RATE= option specifies
a sampling rate of 2% for each stratum. The SEED= option specifies the initial seed for random number generation.
Figure 115.7 displays the output from PROC SURVEYSELECT, which summarizes the sample selection. A sample of 270 customers is selected
by using systematic random sampling within strata that are determined by State. The sampling frame Customers is sorted by the control variables Type and Usage within strata. The type of sorting is serpentine, which is the default when SORT=NEST is not specified. For information about
serpentine sorting, see the section Sorting by CONTROL Variables. The sorted data set replaces the input data set. (To leave the input data set unsorted and store the sorted input data in
another data set, use the OUTSORT= option.) The output data set SampleControl contains the sample of customers.
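As a minimal sketch of the OUTSORT= option mentioned above, the following statements repeat the same design but write the sorted sampling frame to a separate data set, so that Customers keeps its original order. The data set name CustomersSorted is a hypothetical name chosen for illustration; it does not appear elsewhere in this example.

```sas
/* Same stratified design with control sorting as before.              */
/* OUTSORT= stores the sorted frame in a separate data set, so the     */
/* input data set Customers is left in its original order.             */
/* CustomersSorted is a hypothetical name used only for illustration.  */
proc surveyselect data=Customers method=sys rate=.02
                  seed=1234 out=SampleControl
                  outsort=CustomersSorted;
   strata State;
   control Type Usage;
run;
```

Because the design options and the seed are the same as in the original statements, this variant is expected to change only where the sorted frame is stored. Adding SORT=NEST to the PROC SURVEYSELECT statement would request nested sorting in place of the default serpentine sorting.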
Figure 115.7: Sample Selection Summary