The SURVEYSELECT Procedure

Random Number Generation

The probability sampling methods provided by PROC SURVEYSELECT use random numbers in their selection algorithms, as described in the following sections and in the references cited. PROC SURVEYSELECT uses a uniform random number function to generate streams of pseudorandom numbers from an initial starting point, or seed. You can use the SEED= option to specify the initial seed. If you do not specify the SEED= option, PROC SURVEYSELECT uses the time of day from the computer’s clock to obtain the initial seed. For information about specifying initial seeds for strata, storing stratum seeds in the output data set, and reproducing samples, see the description of the SEED= option.

Beginning in SAS/STAT 12.1, PROC SURVEYSELECT uses the Mersenne twister uniform random number generator by default. The Mersenne twister generator (Matsumoto and Nishimura 1998) has a very long period ($2^{19937} - 1$) and very good statistical properties. The algorithm is a twisted generalized feedback shift register. This is the same random number generator that the RAND function provides for the uniform distribution. For more information, see SAS Functions and CALL Routines: Reference.
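The role of the seed can be illustrated outside SAS. The following is a minimal Python sketch, not PROC SURVEYSELECT itself: it relies on the fact that Python's `random.Random` class is also a Mersenne twister generator. The helper name `draw_srs`, the population of 1,000 unit IDs, and the seed value are assumptions made for illustration. Python initializes the generator from the seed in its own way, so the numbers it produces are not expected to match those from SAS.

```python
import random

def draw_srs(units, n, seed):
    """Draw a simple random sample without replacement from a seeded stream.

    Hypothetical helper for illustration; not part of SAS.
    """
    rng = random.Random(seed)  # Mersenne twister stream started from `seed`
    return rng.sample(units, n)

units = list(range(1, 1001))          # assumed population of 1,000 unit IDs

first = draw_srs(units, 10, seed=39647)
second = draw_srs(units, 10, seed=39647)

# The same seed, population, and sample size reproduce the same sample.
print(first == second)
```

Reusing the seed with the same input and selection parameters reproduces the sample, which mirrors what the SEED= option is described as providing.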

In releases before SAS/STAT 12.1, PROC SURVEYSELECT used the RANUNI random number generator, which you can now request by specifying the RANUNI option. This uniform random number generator is based on the method of Fishman and Moore (1982), which uses a prime modulus multiplicative generator with modulus $2^{31} - 1$ and multiplier 397,204,094. This is the same uniform random number generator that the RANUNI function provides. For more information about the RANUNI function, see SAS Functions and CALL Routines: Reference.
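A prime modulus multiplicative generator follows the recurrence $x_{k+1} = a \, x_k \bmod m$. The Python sketch below shows that textbook recurrence with the multiplier $a = 397{,}204{,}094$ and the prime modulus $m = 2^{31} - 1$. The function name `lehmer_stream`, the seed range check, and the scaling of each state by $m$ to obtain a uniform value are assumptions; this section does not describe how RANUNI handles seeds or scales its output, so the sketch is not claimed to reproduce RANUNI values.

```python
M = 2**31 - 1        # prime modulus
A = 397_204_094      # multiplier

def lehmer_stream(seed, count):
    """Return `count` pseudorandom values in (0, 1) from x <- A * x mod M.

    Hypothetical helper; seed handling and output scaling are assumptions.
    """
    if not 0 < seed < M:
        raise ValueError("seed must satisfy 0 < seed < 2**31 - 1")
    state = seed
    values = []
    for _ in range(count):
        state = (A * state) % M      # multiplicative congruential step
        values.append(state / M)     # scale the integer state into (0, 1)
    return values

stream = lehmer_stream(12345, 5)
```

Because the modulus is prime and the seed is nonzero, the state never becomes zero, so every value lies strictly between 0 and 1.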

To reproduce samples that PROC SURVEYSELECT selected in releases before SAS/STAT 12.1, specify the RANUNI option along with the same SEED= option value, and use the same input data set and selection parameters.

When you use the RANUNI random number generator for stratified sampling, PROC SURVEYSELECT generates a single pseudorandom number stream across all strata. You can store the stratum initial seeds in the output data set by specifying the OUTSEED option, and you can use the stratum seeds to reproduce stratum samples (separately, apart from the entire sample).

When you use the Mersenne twister random number generator for stratified sampling, PROC SURVEYSELECT generates separate, independent pseudorandom number streams for the strata by default. To use a single Mersenne twister pseudorandom number stream across all strata, you can specify the STRATUMSEED=NONE option. When you specify this option, stratum initial seeds are not available in the output data set.
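The difference between a single stream across strata and separate per-stratum streams can be sketched in Python. This is a conceptual illustration only. The function names, the two-stratum population, and especially the way stratum seeds are derived from the initial seed are assumptions, and they do not represent the initialization method that PROC SURVEYSELECT uses.

```python
import random

def sample_single_stream(strata, n, seed):
    """One generator is shared by all strata, consumed in stratum order."""
    rng = random.Random(seed)
    return {name: rng.sample(units, n) for name, units in strata.items()}

def sample_per_stratum_streams(strata, n, seed):
    """Each stratum gets its own generator, started from its own seed.

    The seed derivation here (drawing seeds from a master stream) is a
    hypothetical scheme chosen for illustration.
    """
    master = random.Random(seed)
    seeds = {name: master.randrange(1, 2**31 - 1) for name in strata}
    samples = {
        name: random.Random(seeds[name]).sample(units, n)
        for name, units in strata.items()
    }
    return samples, seeds

strata = {"A": list(range(100)), "B": list(range(100, 200))}

samples, seeds = sample_per_stratum_streams(strata, 5, seed=12345)

# With a stored stratum seed, one stratum's sample can be redrawn on its own.
b_again = random.Random(seeds["B"]).sample(strata["B"], 5)
print(b_again == samples["B"])
```

In the single-stream version, the numbers available to a later stratum depend on how many were consumed by the strata before it, and no separate starting point exists for each stratum. In the per-stratum version, each stratum has its own starting seed, so its sample can be redrawn independently of the others.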

Beginning in SAS/STAT 14.1, PROC SURVEYSELECT initializes the stratum (Mersenne twister) pseudorandom number streams by a different method than in earlier releases. To reproduce stratified samples that PROC SURVEYSELECT selected by using the Mersenne twister random number generator in releases before SAS/STAT 14.1, specify the STRATUMSEED=RESTORE option along with the same SEED= option value, input data set, and selection parameters.