The SURVEYREG Procedure

Notation

For a stratified clustered sample design, observations are represented by an $n \times (p+2)$ matrix

$(\mb {w, y, X}) = (w_{hij}, y_{hij}, \mb {x}_{hij})$

where

$\mb {w}$ denotes the sampling weight vector
$\mb {y}$ denotes the dependent variable
$\mb {X}$ denotes the $n\times p$ design matrix. (When an effect contains only classification variables, the columns of $\mb {X}$ that correspond to this effect contain only 0s and 1s; no reparameterization is made.)
$h=1, 2, \ldots , H$ is the stratum index
$i=1, 2, \ldots , n_ h$ is the cluster index within stratum $h$
$j=1, 2, \ldots , m_{hi}$ is the unit index within cluster $i$ of stratum $h$
$p$ is the total number of parameters (including an intercept if the INTERCEPT effect is included in the MODEL statement)
$n=\sum _{h=1}^ H \sum _{i=1}^{n_ h} {m_{hi}}$ is the total number of observations in the sample
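As an informal illustration of this indexing, the following Python sketch flattens a small nested sample (strata, then clusters, then units) into an $n \times (p+2)$ array and recovers $n_h$, $m_{hi}$, and $n$. The toy data and all variable names are hypothetical and are not part of PROC SURVEYREG; the sketch only mirrors the notation above.

```python
import numpy as np

# Hypothetical toy sample: sample[h][i] lists the units of cluster i in
# stratum h. Each unit is (w, y, x), where x holds p = 2 design-matrix
# entries (an intercept column and one regressor).
sample = [
    [  # stratum 1 with n_1 = 2 clusters
        [(10.0, 3.1, (1.0, 0.5)), (10.0, 2.9, (1.0, 0.7))],
        [(12.0, 4.0, (1.0, 1.2))],
    ],
    [  # stratum 2 with n_2 = 1 cluster
        [(8.0, 1.5, (1.0, 0.1)), (8.0, 1.8, (1.0, 0.3)), (8.0, 2.2, (1.0, 0.4))],
    ],
]

# One row per unit: weight, response, then the p columns of X.
rows = [
    (w, y, *x)
    for stratum in sample
    for cluster in stratum
    for (w, y, x) in cluster
]
data = np.array(rows)

n_h = [len(stratum) for stratum in sample]                              # clusters per stratum
m_hi = [[len(cluster) for cluster in stratum] for stratum in sample]    # units per cluster
n = sum(sum(m) for m in m_hi)                                           # total observations
p = data.shape[1] - 2                                                   # columns of X
```

Here `data` has one row per observation, so its row count equals `n` and its column count equals `p + 2`.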

Also, $f_h$ denotes the sampling rate for stratum $h$. You can use the TOTAL= or RATE= option to input population totals or sampling rates. See the section Specification of Population Totals and Sampling Rates for details. If you input stratum totals, PROC SURVEYREG computes $f_h$ as the ratio of the stratum sample size to the stratum total. If you input stratum sampling rates, PROC SURVEYREG uses these values directly for $f_h$. If you do not specify the TOTAL= or RATE= option, then the procedure assumes that the stratum sampling rates are negligible, and a finite population correction is not used when computing variances.
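The three cases for the stratum sampling rate (written $f_h$ here) can be sketched in Python as follows. This is only an illustration of the rule described above, not the procedure's implementation: the function name and arguments are hypothetical, representing a negligible rate as 0 is an assumption of the sketch, and raising an error when both inputs are supplied is a choice made here rather than documented PROC SURVEYREG behavior.

```python
def stratum_sampling_rates(n_h, totals=None, rates=None):
    """Return a dict mapping each stratum to its sampling rate f_h.

    n_h    : dict of stratum -> stratum sample size
    totals : dict of stratum -> stratum population total (like TOTAL=)
    rates  : dict of stratum -> stratum sampling rate (like RATE=)
    """
    if totals is not None and rates is not None:
        # Choice made for this sketch only; it does not model the procedure.
        raise ValueError("supply either totals or rates, not both")
    if totals is not None:
        # f_h is the ratio of the stratum sample size to the stratum total.
        return {h: n_h[h] / totals[h] for h in n_h}
    if rates is not None:
        # Supplied sampling rates are used directly.
        return {h: rates[h] for h in n_h}
    # Neither input given: rates are treated as negligible (assumed 0 here),
    # so no finite population correction would be applied.
    return {h: 0.0 for h in n_h}


sizes = {"A": 5, "B": 10}
from_totals = stratum_sampling_rates(sizes, totals={"A": 50, "B": 40})
from_rates = stratum_sampling_rates(sizes, rates={"A": 0.2, "B": 0.5})
negligible = stratum_sampling_rates(sizes)
```

With these toy inputs, `from_totals` holds the ratios 5/50 and 10/40, `from_rates` returns the supplied values unchanged, and `negligible` is 0 for every stratum.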