The SURVEYSELECT Procedure

Sampford’s PPS Method

Sampford’s method (METHOD=PPS_SAMPFORD) is an extension of Brewer’s method that can select more than two units from each stratum, with probability proportional to size and without replacement. The selection probability for unit i in stratum h is $n_h M_{hi} / M_{h \cdot} = n_h Z_{hi}$, where $Z_{hi} = M_{hi} / M_{h \cdot}$ is the relative size of the unit. (Because selection probabilities cannot exceed 1, the relative size for each unit, $Z_{hi}$, must not exceed $1/n_h$.)
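As a small illustration of the size constraint, the following Python sketch computes the relative sizes and the selection probabilities $n_h Z_{hi}$ for a single stratum. It is not PROC SURVEYSELECT code; the function name `sampford_inclusion_probabilities` and the example size measures are assumptions made for this illustration.

```python
def sampford_inclusion_probabilities(sizes, n):
    """Return (Z, n*Z) for one stratum (hypothetical helper, illustration only).

    sizes: size measures M_hi for the units in the stratum
    n:     stratum sample size n_h
    """
    total = sum(sizes)                      # M_h. (stratum total size)
    z = [m / total for m in sizes]          # relative sizes Z_hi
    # A selection probability n*Z above 1 is impossible, so reject such input.
    too_large = [i for i, zi in enumerate(z) if n * zi > 1]
    if too_large:
        raise ValueError(f"relative size exceeds 1/n for units {too_large}")
    return z, [n * zi for zi in z]


# Example with made-up size measures: the selection probabilities sum to n.
z, pi = sampford_inclusion_probabilities([10, 20, 30, 40], n=2)
print(pi, sum(pi))
```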

Sampford’s method first selects a unit from stratum h with probability $Z_{hi}$. The remaining $n_h - 1$ units are then selected with probability proportional to

$\lambda_{hi} = Z_{hi} ~ / ~ (1 - n_h Z_{hi})$

and with replacement. If the same unit appears more than once in the sample of size $n_h$, then Sampford’s algorithm rejects that sample and selects a new sample. The sample is accepted if it contains $n_h$ distinct units.
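The rejective procedure described above can be sketched in Python as follows. This is a minimal illustration for one stratum, not the implementation used by PROC SURVEYSELECT; the function name `sampford_sample` and its `rng` parameter are hypothetical. The sketch assumes every relative size is strictly less than $1/n_h$ so that each $\lambda_{hi}$ is finite.

```python
import random


def sampford_sample(sizes, n, rng=None):
    """Draw one Sampford PPS sample of n distinct unit indices (sketch)."""
    rng = rng or random.Random()
    total = sum(sizes)
    z = [m / total for m in sizes]                # relative sizes Z_hi
    # Assumption: strict inequality, so that lambda = Z / (1 - n*Z) is finite.
    if any(n * zi >= 1 for zi in z):
        raise ValueError("each relative size must be less than 1/n")
    lam = [zi / (1 - n * zi) for zi in z]         # lambda_hi
    units = range(len(sizes))
    while True:
        # First draw: probability Z_hi.
        first = rng.choices(units, weights=z, k=1)
        # Remaining n-1 draws: probability proportional to lambda_hi,
        # with replacement.
        rest = rng.choices(units, weights=lam, k=n - 1)
        sample = first + rest
        # Accept only if all n units are distinct; otherwise start over.
        if len(set(sample)) == n:
            return sorted(sample)


# Example with made-up size measures.
print(sampford_sample([10, 20, 30, 40], n=2, rng=random.Random(1)))
```

Over many repetitions, the share of accepted samples that contain unit i should approach $n_h Z_{hi}$, which gives a simple empirical check on a sketch like this one.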

If you specify the JTPROBS option, PROC SURVEYSELECT computes the joint selection probabilities for all pairs of selected units in each stratum. The joint selection probability for units i and j in stratum h is

$P_{h(ij)} = K_h ~ \lambda_{hi} ~ \lambda_{hj} ~ \sum_{t=2}^{n_h} \frac{\left[ t - n_h (Z_{hi} + Z_{hj}) \right] ~ L_{h,(n_h - t)}(\bar{ij})}{n_h^{t-2}}$

where

$K_h = 1 ~ / ~ \sum_{t=1}^{n_h} \left( t ~ L_{h,(n_h - t)} ~ / ~ n_h^{t} \right)$

$L_{h,m} = \sum_{S_h(m)} \lambda_{h i_1} ~ \lambda_{h i_2} ~ \cdots ~ \lambda_{h i_m}$

and $S_h(m)$ denotes all possible samples of size m, for $m = 1, 2, \ldots, N_h$, with $L_{h,0} = 1$ by convention. The sum $L_{h,m}(\bar{ij})$ is defined similarly to $L_{h,m}$ but sums over all possible samples of size m that do not include units i and j. For more information, see Cochran (1977, pp. 262–263) and Sampford (1967).
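The joint-probability formula above can be evaluated directly, because $L_{h,m}$ is the elementary symmetric polynomial of order m in the $\lambda_{hi}$ values. The Python sketch below is an illustration under stated assumptions rather than the procedure's own code: the function names are hypothetical, $L_{h,0}$ is taken to be 1 (the empty product), every relative size is assumed to be strictly less than $1/n_h$, and the probabilities are computed for all pairs of units in the stratum rather than only the selected pairs.

```python
def elementary_symmetric(values, max_order):
    """Return [e_0, ..., e_max_order], the elementary symmetric sums of values."""
    e = [1.0] + [0.0] * max_order           # e_0 = 1 (empty product)
    for v in values:
        # Update high orders first so each value is used at most once per term.
        for m in range(max_order, 0, -1):
            e[m] += v * e[m - 1]
    return e


def sampford_joint_probabilities(sizes, n):
    """Joint selection probabilities P_(ij) for every pair i < j (sketch)."""
    if n < 2:
        raise ValueError("joint probabilities need n >= 2")
    total = sum(sizes)
    z = [m / total for m in sizes]
    if any(n * zi >= 1 for zi in z):
        raise ValueError("each relative size must be less than 1/n")
    lam = [zi / (1 - n * zi) for zi in z]
    big_l = elementary_symmetric(lam, n - 1)            # L_0 .. L_(n-1)
    k = 1.0 / sum(t * big_l[n - t] / n**t for t in range(1, n + 1))
    joint = {}
    count = len(sizes)
    for i in range(count):
        for j in range(i + 1, count):
            # L_m(ij-bar): same sums, but over units other than i and j.
            others = [lam[u] for u in range(count) if u not in (i, j)]
            l_bar = elementary_symmetric(others, n - 2)
            s = sum((t - n * (z[i] + z[j])) * l_bar[n - t] / n ** (t - 2)
                    for t in range(2, n + 1))
            joint[(i, j)] = k * lam[i] * lam[j] * s
    return joint


# Example with made-up size measures and n = 3.
probs = sampford_joint_probabilities([20, 25, 25, 30], n=3)
print(probs[(0, 1)], probs[(0, 3)])
```

One way to sanity-check such a computation is that, for each unit i, the joint probabilities summed over all other units j should equal $(n_h - 1) \, n_h Z_{hi}$.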