Random Number Generation

The probability sampling methods provided by PROC SURVEYSELECT use random numbers in their selection algorithms, as described in the following sections and in the references cited. PROC SURVEYSELECT uses a uniform random number function to generate streams of pseudo-random numbers from an initial starting point, or seed. You can use the SEED= option to specify the initial seed. If you do not specify the SEED= option, PROC SURVEYSELECT uses the time of day from the computer’s clock to obtain the initial seed. For information about specifying initial seeds for strata, storing stratum seeds in the output data set, and reproducing samples, see the description of the SEED= option.

Beginning in SAS/STAT 12.1, PROC SURVEYSELECT uses the Mersenne-Twister random number generator by default. The Mersenne-Twister generator (Matsumoto and Nishimura, 1998) has a very long period ($2^{19937} - 1$) and very good statistical properties. The algorithm is a twisted generalized feedback shift register. This is the same random number generator that the RAND function provides for the uniform distribution. For more information, see SAS Functions and CALL Routines: Reference.
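The role of the seed can be illustrated outside SAS. The following is a minimal sketch in Python, whose standard `random` module also uses the Mersenne Twister as its core generator. The helper name `uniform_stream` and the seed values are hypothetical, and the numbers produced are not expected to match the stream that PROC SURVEYSELECT or the RAND function generates for the same seed.

```python
import random


def uniform_stream(seed, n):
    """Return the first n uniform [0, 1) draws from a generator seeded with `seed`.

    Hypothetical helper for illustration only; it is not part of SAS.
    """
    rng = random.Random(seed)  # independent Mersenne Twister instance
    return [rng.random() for _ in range(n)]


first = uniform_stream(12345, 5)
second = uniform_stream(12345, 5)
other = uniform_stream(54321, 5)

print(first == second)  # same seed: the two streams are compared for equality
print(first == other)   # different seed: the two streams are compared for equality
```

Because the stream is a deterministic function of the seed, any selection that consumes the draws in the same order can be repeated by reusing the seed.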

In previous releases, PROC SURVEYSELECT used the RANUNI random number generator, which you can now request by specifying the RANUNI option. This uniform random number generator is based on the method of Fishman and Moore (1982), which uses a prime modulus multiplicative generator with modulus $2^{31} - 1$ and multiplier 397,204,094. This is the same uniform random number generator that the RANUNI function provides. For more information about the RANUNI function, see SAS Functions and CALL Routines: Reference.
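A prime modulus multiplicative generator follows the recurrence $x_{k+1} = a \, x_k \bmod m$ and returns $x_{k+1} / m$ as the uniform variate. The following Python sketch applies that recurrence with the prime modulus $m = 2^{31} - 1$ and the multiplier $a = 397{,}204{,}094$. The class name `LehmerGenerator` and the seed-range check are assumptions made for illustration. The sketch shows the arithmetic only and is not claimed to reproduce the exact values returned by the RANUNI function, whose internal seed handling is not described here.

```python
MODULUS = 2**31 - 1        # 2147483647, a Mersenne prime
MULTIPLIER = 397_204_094   # multiplier attributed to Fishman and Moore (1982)


class LehmerGenerator:
    """Multiplicative congruential generator: x_{k+1} = a * x_k mod m.

    Hypothetical illustration class; it is not the SAS implementation.
    """

    def __init__(self, seed):
        # Assumed restriction: a state of 0 would stay 0 forever,
        # so the seed must lie strictly between 0 and the modulus.
        if not 0 < seed < MODULUS:
            raise ValueError("seed must satisfy 0 < seed < 2**31 - 1")
        self.state = seed

    def next_uniform(self):
        """Advance the state and return a uniform variate in (0, 1)."""
        self.state = (MULTIPLIER * self.state) % MODULUS
        return self.state / MODULUS


gen = LehmerGenerator(seed=1)
draws = [gen.next_uniform() for _ in range(3)]
print(draws)
```

Because the modulus is prime and the seed is nonzero, the state never reaches zero, so every variate lies strictly between 0 and 1.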

To reproduce samples that PROC SURVEYSELECT selected in releases before SAS/STAT 12.1, specify the RANUNI option together with the SEED= option, using the same input data set and selection parameters as in the original run.