Language Reference


SAMPLE Function

SAMPLE (x <, n> <, method> <, prob> );

The SAMPLE function generates a random sample of the elements of x. Sampling can be with or without replacement, and the elements can have equal or unequal probability of being selected.

The arguments are as follows:

x

is a matrix that specifies the sample space. That is, the sample is drawn from the elements of x.

n

specifies the number of times to sample. The argument can be a scalar or a two-element vector.

  • If this argument is omitted, then the number of elements of x is used.

  • If n is a scalar, then it represents the sample size, which is the number of independent draws from the population. This value determines the number of columns in the output matrix.

  • If n is a two-element vector, the first element, n[1], represents the sample size. The second element, n[2], specifies the number of samples, which is the number of rows in the output matrix. If the sampling is without replacement, then the sample size must be less than or equal to the number of elements in x.
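As a minimal sketch of the two-element form of n, the following statements request four samples of size three. The seed and the values in n are arbitrary choices for illustration. If the output has the shape described above, dims is a 4 x 3 pair (rows, then columns).

```sas
proc iml;
call randseed(1);                        /* arbitrary seed, chosen for illustration */
x = 1:5;
s = sample(x, {3 4}, "NoReplace");       /* sample size 3, number of samples 4 */
dims = nrow(s) || ncol(s);               /* expected to be {4 3} per the description of n */
print s, dims;
quit;
```

Each row of s is one sample. Because the sampling is without replacement, no value should repeat within a row.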

method

is an optional argument that specifies how sampling is performed. The following are valid options:

"Replace"

specifies simple random sampling with replacement. This is the default value.

"NoReplace"

specifies simple random sampling without replacement. The elements in the samples might appear in the same order as in x.

"WOR"

specifies simple random sampling without replacement. After elements are randomly selected, their order is randomly permuted.
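The difference between the two without-replacement options is easiest to see when the sample size equals the number of elements of x. The following sketch uses an arbitrary seed and counts the distinct values in each result with the UNIQUE function. Both results should contain each of the integers 1–5 exactly once. According to the descriptions above, a might come back in the original order, whereas b is randomly permuted.

```sas
proc iml;
call randseed(1);                    /* arbitrary seed, chosen for illustration */
x = 1:5;
a = sample(x, 5, "NoReplace");       /* order might match the order in x */
b = sample(x, 5, "WOR");             /* order is randomly permuted */
/* without replacement, each result should contain 5 distinct values */
nDistinct = ncol(unique(a)) || ncol(unique(b));
print a, b, nDistinct;
quit;
```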

prob

is a vector with the same number of elements as x. The vector specifies the sampling probability for the elements of x. The SAMPLE function internally scales the elements of prob so that they sum to unity.
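Because prob is scaled internally, it should be possible to pass unnormalized weights. The following sketch assumes hypothetical weights {6 1 0 1 2}, which sum to 10, and compares the empirical proportions from a large sample with the scaled probabilities 0.6, 0.1, 0, 0.1, and 0.2. The seed, the weights, and the sample size are illustrative choices.

```sas
proc iml;
call randseed(1);                       /* arbitrary seed, chosen for illustration */
x = 1:5;
w = {6 1 0 1 2};                        /* hypothetical weights; they sum to 10, not 1 */
s = sample(x, 10000, "Replace", w);     /* 10000 draws with replacement */
call tabulate(levels, freq, s);         /* distinct sampled values and their counts */
prop = freq / sum(freq);                /* empirical proportion of each value */
print levels, prop;
quit;
```

If the scaling works as described, the proportions should be close to 0.6, 0.1, 0.1, and 0.2 for the values 1, 2, 4, and 5, and the value 3 (weight 0) should not appear in levels.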

The SAMPLE function uses the random seed that is set by the RANDSEED function.

The prob argument specifies the probabilities that are used when sampling from x. When method is "Replace", the probabilities do not change during the sampling. However, when method is "NoReplace", the probabilities are renormalized after each selection.

For example, suppose that the element $x_i$, $i = 1, \ldots, n$, has probability $p_i$ of being sampled, where $\sum_{i=1}^{n} p_i = 1$. If the element $x_1$ is selected in the first round of sampling, the remaining elements have the new probability $q_i$ of being sampled during the second round, where $q_i = p_i / \sum_{j=2}^{n} p_j$ for $i = 2, \ldots, n$.
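As a worked illustration with assumed numbers, take five elements with probabilities $(p_1, \ldots, p_5) = (0.6, 0.1, 0.0, 0.1, 0.2)$ and suppose that $x_1$ is selected first. The remaining probabilities sum to $0.1 + 0.0 + 0.1 + 0.2 = 0.4$, so the renormalized probabilities for the second round are

$$q_2 = \frac{0.1}{0.4} = 0.25, \quad q_3 = \frac{0.0}{0.4} = 0, \quad q_4 = \frac{0.1}{0.4} = 0.25, \quad q_5 = \frac{0.2}{0.4} = 0.5,$$

which again sum to 1.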

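The following statements draw three samples from the integers 1–5: one with replacement and equal probability (the default), one with replacement and unequal probability, and one without replacement: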

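x = 1:5;                          /* sample space: the integers 1-5 */
call randseed(12345);             /* set the seed that SAMPLE uses */
s1 = sample(x);                   /* 5 draws, with replacement, equal probability */
s2 = sample(x, 5, "Replace", {0.6 0.1 0.0 0.1 0.2});   /* unequal probability */
s3 = sample(x, 3, "NoReplace");   /* 3 draws, without replacement */
print s1, s2, s3;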

Figure 25.356: Random Samples

s1
3 5 3 5 5

s2
1 5 1 1 2

s3
1 2 5