RAND Function

Generates random numbers from a distribution that you specify.

Category: Random Number

Syntax

RAND (dist, parm-1,…,parm-k)

Required Arguments

dist

is a character constant, variable, or expression that identifies the distribution. Valid distributions are as follows:

Distribution         Argument
Bernoulli            BERNOULLI
Beta                 BETA
Binomial             BINOMIAL
Cauchy               CAUCHY
Chi-Square           CHISQUARE
Erlang               ERLANG
Exponential          EXPONENTIAL
F                    F
Gamma                GAMMA
Geometric            GEOMETRIC
Hypergeometric       HYPERGEOMETRIC
Lognormal            LOGNORMAL
Negative Binomial    NEGBINOMIAL
Normal               NORMAL|GAUSSIAN
Poisson              POISSON
T                    T
Tabled               TABLE
Triangular           TRIANGLE
Uniform              UNIFORM
Weibull              WEIBULL
Note: Except for T and F, you can minimally identify any distribution by its first four characters.

parm-1,…,parm-k

are shape, location, or scale parameters appropriate for the specific distribution.

See Details

Details

Generating Random Numbers

The RAND function generates random numbers from various continuous and discrete distributions. Wherever possible, the simplest form of the distribution is used.
The RAND function uses the Mersenne-Twister random number generator (RNG) that was developed by Matsumoto and Nishimura (1998). The generator has a very long period ($2^{19937} - 1$) and very good statistical properties. The period is a Mersenne prime, which accounts for the first part of the name. The algorithm is a twisted generalized feedback shift register (TGFSR), which accounts for the second part. The TGFSR gives the RNG a very high order of equidistribution (623-dimensional with 32-bit accuracy), which means that there is a very small correlation between successive vectors of 623 pseudo-random numbers.
The RAND function is started with a single seed. However, the state of the process cannot be captured by a single seed. You cannot stop and restart the generator from its stopping point.
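Because RAND returns the simplest form of several distributions, other members of the same family are usually obtained by transforming the result. The DATA step below is a minimal sketch of that idea. The data set name, the seed, the variable names, and the values of mu and sigma are illustrative assumptions, not part of the RAND interface.

```sas
data scaled;
   call streaminit(4321);   /* arbitrary seed, used only for repeatability */
   mu    = 1;               /* assumed location parameter */
   sigma = 2.5;             /* assumed scale parameter */
   do i = 1 to 5;
      expo = sigma * rand('EXPONENTIAL');       /* exponential with mean sigma */
      gam  = sigma * rand('GAMMA', 3);          /* gamma with shape 3 and scale sigma */
      cau  = mu + sigma * rand('CAUCHY');       /* Cauchy with location mu and scale sigma */
      logn = exp(mu + sigma * rand('NORMAL'));  /* lognormal with log-mean mu and log-sd sigma */
      output;
   end;
run;
```

Each transformation multiplies by a scale, adds a location, or both, so the standard form stays the only thing that RAND itself has to generate.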

Reproducing a Random Number Stream

If you want to create reproducible streams of random numbers, then use the CALL STREAMINIT routine to specify a seed value for random number generation. Use the CALL STREAMINIT routine once per DATA step, before any invocation of the RAND function. If you omit the call to the CALL STREAMINIT routine (or if you specify a non-positive seed value in the CALL STREAMINIT routine), then RAND uses a call to the system clock to seed itself. For more information, see “Creating a Reproducible Stream of Random Numbers” in the documentation for the CALL STREAMINIT routine.
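As a minimal sketch of this pattern, the DATA step below calls CALL STREAMINIT once, before the first use of RAND. The seed 12345, the data set name, and the sample size are arbitrary choices made for illustration.

```sas
data normal_sample;
   call streaminit(12345);   /* positive seed: the stream is reproducible */
   do i = 1 to 5;
      x = rand('NORMAL');    /* standard normal variate */
      output;
   end;
run;

proc print data=normal_sample noobs;
run;
```

Rerunning the step with the same seed should reproduce the same five values. Removing the CALL STREAMINIT statement, or passing a non-positive seed, leaves the seed to the system clock, so the values change from run to run.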

Duplicate Values in the Mersenne-Twister RNG Algorithm

The Mersenne-Twister RNG algorithm has an extremely long period, but this does not imply that large random samples are devoid of duplicate values. The RAND function returns at most $2^{32}$ distinct values. In a random uniform sample of size $10^5$, the chance of drawing at least one duplicate is greater than 50%. The expected number of duplicates in a random uniform sample of size $M$ is approximately $M^2 / 2^{33}$ when $M$ is much less than $2^{32}$. For example, you should expect about 115 duplicates in a random uniform sample of size $M = 10^6$. These results are consequences of the famous “birthday matching problem” in probability theory.
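The arithmetic behind these figures can be checked directly. The DATA step below is a sketch that evaluates the approximation quoted above for the two sample sizes mentioned. The formula for the probability of at least one duplicate is the standard birthday-problem approximation, which is an assumption added here and is not stated in this documentation. The variable names are illustrative.

```sas
data _null_;
   N = 2**32;                                   /* number of distinct values RAND can return */
   do M = 1e5, 1e6;
      expected_dups = M**2 / 2**33;             /* approximation quoted in the text */
      p_any_dup = 1 - exp(-M*(M-1) / (2*N));    /* assumed birthday-problem approximation */
      put M= expected_dups= p_any_dup=;
   end;
run;
```

For a sample of size 10^5 the estimated probability of at least one duplicate is roughly 0.69, which agrees with the statement that it exceeds 50%.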

Bernoulli Distribution

x = RAND('BERNOULLI',p)
Arguments

x

is an observation from the distribution with the following probability density function:

$$ f(x) = \begin{cases} 1 & p = 0,\ x = 0 \\ p^{x} (1 - p)^{1 - x} & 0 < p < 1,\ x = 0, 1 \\ 1 & p = 1,\ x = 1 \end{cases} $$
Range x = 0, 1

p

is a numeric probability of success.

Range 0 ≤ p ≤ 1

Beta Distribution

x = RAND('BETA',a,b)
Arguments

x

is an observation from the distribution with the following probability density function:

$$ f(x) = \frac{\Gamma(a + b)}{\Gamma(a)\,\Gamma(b)}\, x^{a - 1} (1 - x)^{b - 1} $$
Range 0 < x < 1

a

is a numeric shape parameter.

Range a > 0

b

is a numeric shape parameter.

Range b > 0

Binomial Distribution

x = RAND('BINOMIAL',p,n)
Arguments

x

is an integer observation from the distribution with the following probability density function:

$$ f(x) = \begin{cases} 1 & p = 0,\ x = 0 \\ \binom{n}{x} p^{x} (1 - p)^{n - x} & 0 < p < 1,\ x = 0, \ldots, n \\ 1 & p = 1,\ x = n \end{cases} $$
Range x = 0, 1, ..., n

p

is a numeric probability of success.

Range 0 ≤ p ≤ 1

n

is an integer parameter that counts the number of independent Bernoulli trials.

Range n = 1, 2, ...

Cauchy Distribution

x = RAND('CAUCHY')
Arguments

x

is an observation from the distribution with the following probability density function:

$$ f(x) = \frac{1}{\pi (1 + x^{2})} $$
Range –∞ < x < ∞

Chi-Square Distribution

x = RAND('CHISQUARE',df)
Arguments

x

is an observation from the distribution with the following probability density function:

$$ f(x) = \frac{2^{-df/2}}{\Gamma\!\left(\frac{df}{2}\right)}\, x^{df/2 - 1} e^{-x/2} $$
Range x > 0

df

is a numeric degrees of freedom parameter.

Range df > 0

Erlang Distribution

x = RAND('ERLANG',a)
Arguments

x

is an observation from the distribution with the following probability density function:

$$ f(x) = \frac{1}{\Gamma(a)}\, x^{a - 1} e^{-x} $$
Range x > 0

a

is an integer numeric shape parameter.

Range a = 1, 2, ...

Exponential Distribution

x = RAND('EXPONENTIAL')
Arguments

x

is an observation from the distribution with the following probability density function:

$$ f(x) = e^{-x} $$
Range x > 0

F Distribution

x = RAND('F',n, d)
Arguments

x

is an observation from the distribution with the following probability density function:

$$ f(x) = \frac{\Gamma\!\left(\frac{n + d}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right) \Gamma\!\left(\frac{d}{2}\right)} \cdot \frac{n^{n/2}\, d^{d/2}\, x^{n/2 - 1}}{(d + n x)^{(n + d)/2}} $$
Range x > 0

n

is a numeric numerator degrees of freedom parameter.

Range n > 0

d

is a numeric denominator degrees of freedom parameter.

Range d > 0

Gamma Distribution

x = RAND('GAMMA',a)
Arguments

x

is an observation from the distribution with the following probability density function:

$$ f(x) = \frac{1}{\Gamma(a)}\, x^{a - 1} e^{-x} $$
Range x > 0

a

is a numeric shape parameter.

Range a > 0

Geometric Distribution

x = RAND('GEOMETRIC',p)
Arguments

x

is an integer count that denotes the number of trials that are needed to obtain one success. x is an integer observation from the distribution with the following probability density function:

$$ f(x) = \begin{cases} (1 - p)^{x - 1}\, p & 0 < p < 1,\ x = 1, 2, \ldots \\ 1 & p = 1,\ x = 1 \end{cases} $$
Range x = 1, 2, …

p

is a numeric probability of success.

Range 0 < p ≤ 1

Hypergeometric Distribution

x = RAND('HYPER',N,R,n)
Arguments

x

is an integer observation from the distribution with the following probability density function:

$$ f(x) = \frac{\binom{R}{x} \binom{N - R}{n - x}}{\binom{N}{n}} $$
Range x = max(0, n – (N – R)), ..., min(n, R)

N

is an integer population size parameter.

Range N = 1, 2, ...

R

is an integer number of items in the category of interest.

Range R = 0, 1, ..., N

n

is an integer sample size parameter.

Range n = 1, 2, ..., N
The hypergeometric distribution is a mathematical formalization of an experiment in which you draw n balls from an urn that contains N balls, R of which are red. The hypergeometric distribution is the distribution of the number of red balls in the sample of n.

Lognormal Distribution

x = RAND('LOGNORMAL')
Arguments

x

is an observation from the distribution with the following probability density function:

$$ f(x) = \frac{e^{-\ln^{2}(x)/2}}{x \sqrt{2\pi}} $$
Range x > 0

Negative Binomial Distribution

x = RAND('NEGBINOMIAL',p,k)
Arguments

x

is an integer observation from the distribution with the following probability density function:

$$ f(x) = \begin{cases} \binom{x + k - 1}{k - 1} (1 - p)^{x}\, p^{k} & 0 < p < 1,\ x = 0, 1, \ldots \\ 1 & p = 1,\ x = 0 \end{cases} $$
Range x = 0, 1, ...

k

is an integer parameter that is the number of successes. However, non-integer k values are allowed as well.

Range k = 1, 2, ...

p

is a numeric probability of success.

Range 0 < p ≤ 1
The negative binomial distribution is the distribution of the number of failures before k successes occur in sequential independent trials, all with the same probability of success, p.

Normal Distribution

x = RAND('NORMAL'<,θ,λ>)
Arguments

x

is an observation from the normal distribution with a mean of θ and a standard deviation of λ that has the following probability density function:

$$ f(x) = \frac{1}{\lambda \sqrt{2\pi}} \exp\!\left( -\frac{(x - \theta)^{2}}{2 \lambda^{2}} \right) $$
Range –∞ < x < ∞

θ

is the mean parameter.

Default 0

λ

is the standard deviation parameter.

Default 1
Range λ > 0
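As an illustration of the optional parameters, the sketch below draws values with a mean of 100 and a standard deviation of 15 and then summarizes them. The data set name, variable name, seed, and parameter values are assumptions chosen for the example.

```sas
data scores;
   call streaminit(99);                 /* arbitrary seed */
   do i = 1 to 1000;
      score = rand('NORMAL', 100, 15);  /* mean 100, standard deviation 15 */
      output;
   end;
run;

proc means data=scores mean std;        /* sample mean and standard deviation */
   var score;
run;
```

The sample mean and standard deviation reported by PROC MEANS should be close to, but not exactly equal to, the parameter values.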

Poisson Distribution

x = RAND('POISSON',m)
Arguments

x

is an integer observation from the distribution with the following probability density function:

$$ f(x) = \frac{m^{x} e^{-m}}{x!} $$
Range x = 0, 1, ...

m

is a numeric mean parameter.

Range m > 0

T Distribution

x = RAND('T',df)
Arguments

x

is an observation from the distribution with the following probability density function:

$$ f(x) = \frac{\Gamma\!\left(\frac{df + 1}{2}\right)}{\sqrt{df\,\pi}\; \Gamma\!\left(\frac{df}{2}\right)} \left( 1 + \frac{x^{2}}{df} \right)^{-\frac{df + 1}{2}} $$
Range –∞ < x < ∞

df

is a numeric degrees of freedom parameter.

Range df > 0

Tabled Distribution

x = RAND('TABLE',p1,p2, ...)
Arguments

x

is an integer observation from one of the following distributions:

If $\sum_{i=1}^{n} p_i < 1$, then x is an observation from this probability density function:
$$ f(i) = p_i,\ i = 1, 2, \ldots, n \quad \text{and} \quad f(n + 1) = 1 - \sum_{i=1}^{n} p_i $$
If, for some index $n$, $\sum_{i=1}^{n} p_i \ge 1$, then x is an observation from this probability density function:
$$ f(i) = p_i,\ i = 1, 2, \ldots, n - 1 \quad \text{and} \quad f(n) = 1 - \sum_{i=1}^{n-1} p_i $$

p1, p2, ...

are numeric probability values.

Range 0 ≤ p1, p2, ... ≤ 1
Restriction The maximum number of probability parameters depends on your operating environment, but the maximum number of parameters is at least 32,767.
The tabled distribution takes on the values 1, 2, ..., n with specified probabilities.
Note: By using the FORMAT statement, you can map the set {1, 2, ..., n} to any set of n or fewer elements.
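The mapping described in the note can be sketched as follows. The format name colorfmt, the labels, the seed, and the data set name are hypothetical and chosen only for illustration. Because the two probabilities sum to 0.7, the value 3 is returned with the remaining probability, 0.3.

```sas
proc format;
   value colorfmt 1 = 'Red'      /* hypothetical labels for the values 1, 2, 3 */
                  2 = 'Green'
                  3 = 'Blue';
run;

data table_sample;
   call streaminit(2024);                /* arbitrary seed */
   do i = 1 to 10;
      color = rand('TABLE', 0.2, 0.5);   /* P(1)=0.2, P(2)=0.5, P(3)=0.3 */
      output;
   end;
   format color colorfmt.;               /* display 1, 2, 3 as labels */
run;
```

The variable color still stores the integers 1, 2, and 3; the format changes only how they are displayed.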

Triangular Distribution

x = RAND('TRIANGLE',h)
Arguments

x

is an observation from the distribution with the following probability density function:

$$ f(x) = \begin{cases} \dfrac{2x}{h} & 0 \le x \le h \\[1ex] \dfrac{2(1 - x)}{1 - h} & h < x \le 1 \end{cases} $$
In this equation, 0 ≤ h ≤ 1.
Range 0 ≤ x ≤ 1
Note The distribution can be easily shifted and scaled.

h

is the horizontal location of the peak of the triangle.

Range 0 ≤ h ≤ 1
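One way to shift and scale the distribution is sketched below: a triangular variate on an arbitrary interval is obtained by rescaling the peak location to [0, 1] and then transforming the result linearly. The interval endpoints, the peak, the seed, and all variable names are assumptions made for the example.

```sas
data tri;
   call streaminit(777);              /* arbitrary seed */
   lo = 10; hi = 30; peak = 15;       /* assumed interval [lo, hi] and mode */
   h = (peak - lo) / (hi - lo);       /* peak location rescaled to [0, 1] */
   do i = 1 to 5;
      x = lo + (hi - lo) * rand('TRIANGLE', h);   /* triangular on [lo, hi] */
      output;
   end;
run;
```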

Uniform Distribution

x = RAND('UNIFORM')
Arguments

x

is an observation from the distribution with the following probability density function:

f ( x ) = 1
Range 0 < x < 1
The uniform random number generator that the RAND function uses is the Mersenne-Twister (Matsumoto and Nishimura 1998). This generator has a period of $2^{19937} - 1$ and 623-dimensional equidistribution up to 32-bit accuracy. This algorithm underlies the generators for the other available distributions in the RAND function.
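Because the result lies strictly between 0 and 1, a linear transformation gives a uniform variate on another interval, and the FLOOR function gives a random integer. The sketch below shows both. The endpoints a and b, the integer range 1 to 6, the seed, and the names are illustrative assumptions.

```sas
data unif;
   call streaminit(2468);                   /* arbitrary seed */
   a = 5; b = 12;                           /* assumed interval endpoints */
   do i = 1 to 5;
      x = a + (b - a) * rand('UNIFORM');    /* continuous uniform on (a, b) */
      k = floor(1 + 6 * rand('UNIFORM'));   /* integer from 1 to 6 */
      output;
   end;
run;
```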

Weibull Distribution

x = RAND('WEIBULL',a,b)
Arguments

x

is an observation from the distribution with the following probability density function:

$$ f(x) = \frac{a}{b^{a}}\, x^{a - 1} e^{-\left( \frac{x}{b} \right)^{a}} $$
Range x ≥ 0

a

is a numeric shape parameter.

Range a > 0

b

is a numeric scale parameter.

Range b > 0

Example

The following SAS statements produce results like these. Because the values are random, the actual results depend on the seed.
SAS Statement               Result
x=rand('BERN',.75);         0
x=rand('BETA',3,0.1);       .99920
x=rand('BINOM',0.75,10);    10
x=rand('CAUCHY');           -1.41525
x=rand('CHISQ',22);         25.8526
x=rand('ERLANG', 7);        7.67039
x=rand('EXPO');             1.48847
x=rand('F',12,322);         1.99647
x=rand('GAMMA',7.25);       6.59588
x=rand('GEOM',0.02);        43
x=rand('HYPER',10,3,5);     1
x=rand('LOGN');             0.66522
x=rand('NEGB',0.8,5);       33
x=rand('NORMAL');           1.03507
x=rand('POISSON',6.1);      6
x=rand('T',4);              2.44646
x=rand('TABLE',.2,.5);      2
x=rand('TRIANGLE',0.7);     .63811
x=rand('UNIFORM');          .96234
x=rand('WEIB',0.25,2.1);    6.55778

See Also

CALL Routines: CALL STREAMINIT

References

Fishman, G. S. 1996. Monte Carlo: Concepts, Algorithms, and Applications. New York, USA: Springer-Verlag.
Fushimi, M., and S. Tezuka. 1983. “The k-Distribution of Generalized Feedback Shift Register Pseudorandom Numbers.” Communications of the ACM 26: 516–523.
Gentle, J. E. 1998. Random Number Generation and Monte Carlo Methods. New York, USA: Springer-Verlag.
Lewis, T. G., and W. H. Payne. 1973. “Generalized Feedback Shift Register Pseudorandom Number Algorithm.” Journal of the ACM 20: 456–468.
Matsumoto, M., and Y. Kurita. 1992. “Twisted GFSR Generators.” ACM Transactions on Modeling and Computer Simulation 2: 179–194.
Matsumoto, M., and Y. Kurita. 1994. “Twisted GFSR Generators II.” ACM Transactions on Modeling and Computer Simulation 4: 254–266.
Matsumoto, M., and T. Nishimura. 1998. “Mersenne Twister: A 623-Dimensionally Equidistributed Uniform Pseudo-Random Number Generator.” ACM Transactions on Modeling and Computer Simulation 8: 3–30.
Ripley, B. D. 1987. Stochastic Simulation. New York, USA: Wiley.
Robert, C. P., and G. Casella. 1999. Monte Carlo Statistical Methods. New York, USA: Springer-Verlag.
Ross, S. M. 1997. Simulation. San Diego, USA: Academic Press.