The SURVEYFREQ Procedure

Row and Column Proportions

PROC SURVEYFREQ computes the estimate of the row proportion for table cell (r, c) as the ratio of the estimated total for the table cell to the estimated total for row r,

\[  \begin{aligned} \widehat{P}_{rc}^{~ r} & = \widehat{N}_{rc} ~  / ~  \widehat{N}_{r \cdot } \\ & = \left( \sum _{h=1}^ H ~  \sum _{i=1}^{n_ h} ~  \sum _{j=1}^{m_{hi}} ~  \delta _{hij} (r,c) ~  W_{hij} \right) ~  / ~  \left( \sum _{h=1}^ H ~  \sum _{i=1}^{n_ h} ~  \sum _{j=1}^{m_{hi}} ~  \delta _{hij} (r ~  \cdot ) ~  W_{hij} \right) \end{aligned}  \]

Similarly, PROC SURVEYFREQ estimates the column proportion for table cell (r, c) as the ratio of the estimated total for the table cell to the estimated total for column c,

\[  \begin{aligned} \widehat{P}_{rc}^{~ c} & = \widehat{N}_{rc} ~  / ~  \widehat{N}_{\cdot c} \\ & = \left( \sum _{h=1}^ H ~  \sum _{i=1}^{n_ h} ~  \sum _{j=1}^{m_{hi}} ~  \delta _{hij} (r,c) ~  W_{hij} \right) ~  / ~  \left( \sum _{h=1}^ H ~  \sum _{i=1}^{n_ h} ~  \sum _{j=1}^{m_{hi}} ~  \delta _{hij} (\cdot ~  c) ~  W_{hij} \right) \end{aligned}  \]
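
The two ratios above can be illustrated with a minimal Python sketch. This is not PROC SURVEYFREQ code; the function name `row_and_column_proportions` and the toy data are hypothetical. It assumes that the indicator $\delta _{hij}(r,c)$ equals 1 when an observation falls in table cell (r, c) and 0 otherwise, and that every row and column has a positive weight total. Because the triple sums run over every observation in every cluster and stratum, the point estimates need only the row level, column level, and weight of each observation.

```python
from collections import defaultdict


def row_and_column_proportions(observations):
    """Estimate row and column proportions from weighted observations.

    observations: iterable of (row_level, column_level, weight) tuples.
    Returns two dicts keyed by (row_level, column_level).
    """
    n_rc = defaultdict(float)  # estimated cell totals, N-hat_rc
    n_r = defaultdict(float)   # estimated row totals, N-hat_r.
    n_c = defaultdict(float)   # estimated column totals, N-hat_.c
    for r, c, w in observations:
        n_rc[(r, c)] += w
        n_r[r] += w
        n_c[c] += w
    row_prop = {(r, c): n / n_r[r] for (r, c), n in n_rc.items()}
    col_prop = {(r, c): n / n_c[c] for (r, c), n in n_rc.items()}
    return row_prop, col_prop


# Hypothetical toy data: (row level, column level, sampling weight)
obs = [("a", "x", 2.0), ("a", "y", 6.0), ("b", "x", 3.0), ("b", "y", 1.0)]
row_prop, col_prop = row_and_column_proportions(obs)
print(row_prop[("a", "x")])  # 2 / (2 + 6) = 0.25
print(col_prop[("a", "x")])  # 2 / (2 + 3) = 0.4
```

Within each row the row proportions sum to 1, and within each column the column proportions sum to 1, which is a quick sanity check on any implementation.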

If you request BRR variance estimation (VARMETHOD=BRR), PROC SURVEYFREQ estimates the variances of the row and column proportions as described in the section "Balanced Repeated Replication (BRR)". If you request jackknife variance estimation (VARMETHOD=JACKKNIFE), the procedure estimates the variances as described in the section "The Jackknife Method".

If you do not specify the VARMETHOD= option or a REPWEIGHTS statement, the default variance estimation method is Taylor series (VARMETHOD=TAYLOR). By using Taylor series linearization, the variance of the row proportion estimate can be expressed as

\[  \widehat{\mr {Var}}(\widehat{P}_{rc}^{~ r}) = \sum _{h=1}^ H \widehat{\mr {Var}}_ h(\widehat{P}_{rc}^{~ r})  \]

where if $n_ h > 1$,

\[  \begin{aligned} \widehat{\mr {Var}}_ h(\widehat{P}_{rc}^{~ r}) & = \frac{n_ h(1-f_ h)}{n_ h-1} ~  \sum _{i=1}^{n_ h} (g_{rc}^{~ hi} - \bar{g}_{rc}^{~ h})^2 \\ g_{rc}^{~ hi} & = \left( \sum _{j=1}^{m_{hi}} \left( \delta _{hij} (r,c) - \widehat{P}_{rc}^{~ r} ~  \delta _{hij} (r ~  \cdot ) \right) ~  W_{hij} \right) ~  / ~  \widehat{N}_{r \cdot } \\ \bar{g}_{rc}^{~ h} & = \left( \sum _{i=1}^{n_ h} ~  g_{rc}^{~ hi} \right) ~  / ~  n_ h \end{aligned}  \]

and if $n_ h = 1$,

\[  \widehat{\mr {Var}}_ h(\widehat{P}_{rc}^{~ r}) = \left\{  \begin{array}{ll} \mbox{missing} &  \mbox{ if } n_{h'}=1 \mbox{ for } h'=1, 2, \ldots , H \\ 0 &  \mbox{ if } n_{h'}>1 \mbox{ for some } 1 \leq h' \leq H \end{array} \right.  \]

The standard error of the row proportion is computed as

\[  \mr {StdErr}( \widehat{P}_{rc}^{~ r} ) = \sqrt { \widehat{\mr {Var}}( \widehat{P}_{rc}^{~ r} ) }  \]
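
The Taylor series computation for the row proportion can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure's implementation: the function name `taylor_row_proportion_variance`, its arguments, and the toy data are hypothetical. The sketch assumes that $n_ h$ is the number of clusters in stratum h, that $f_ h$ is a per-stratum sampling fraction taken as 0 when none is supplied, and that a stratum with a single cluster contributes 0 unless every stratum has a single cluster, in which case the variance is reported as missing (`None`).

```python
import math
from collections import defaultdict


def taylor_row_proportion_variance(observations, r, c, sampling_fraction=None):
    """Sketch of the Taylor series variance of the row proportion for cell (r, c).

    observations: iterable of (stratum, cluster, row_level, column_level, weight).
    sampling_fraction: optional dict mapping stratum -> f_h (assumed 0 if absent).
    Returns (proportion, variance); variance is None when it is missing.
    """
    data = list(observations)
    f = sampling_fraction or {}
    n_rc = sum(w for _, _, ri, ci, w in data if ri == r and ci == c)
    n_r = sum(w for _, _, ri, _, w in data if ri == r)
    p = n_rc / n_r

    # g[h][i]: linearized value for cluster i of stratum h.
    # Every cluster gets an entry, even if it has no observations in row r.
    g = defaultdict(lambda: defaultdict(float))
    for h, i, ri, ci, w in data:
        d_rc = 1.0 if (ri == r and ci == c) else 0.0
        d_r = 1.0 if ri == r else 0.0
        g[h][i] += (d_rc - p * d_r) * w / n_r

    # All strata have one cluster: the variance estimate is missing.
    if all(len(clusters) == 1 for clusters in g.values()):
        return p, None

    variance = 0.0
    for h, clusters in g.items():
        n_h = len(clusters)
        if n_h == 1:
            continue  # single-cluster stratum contributes 0
        g_bar = sum(clusters.values()) / n_h
        ss = sum((x - g_bar) ** 2 for x in clusters.values())
        variance += n_h * (1.0 - f.get(h, 0.0)) / (n_h - 1) * ss
    return p, variance


# Hypothetical toy data: (stratum, cluster, row level, column level, weight)
obs = [
    (1, 1, "a", "x", 1.0), (1, 1, "a", "y", 1.0),
    (1, 2, "a", "x", 2.0), (1, 2, "b", "x", 2.0),
    (2, 1, "a", "y", 1.0), (2, 1, "a", "x", 1.0),
    (2, 2, "a", "y", 2.0), (2, 2, "b", "y", 2.0),
]
p, var = taylor_row_proportion_variance(obs, "a", "x")
std_err = math.sqrt(var)
print(p, var, std_err)  # p = 0.5 and var = 0.03125 for this toy data
```

A useful check on the linearized values is that they sum to zero over all clusters and strata, because $\widehat{N}_{rc} - \widehat{P}_{rc}^{~ r} ~ \widehat{N}_{r \cdot } = 0$ by the definition of the row proportion.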

The Taylor series variance estimate for the column proportion is computed as described previously for the row proportion, but with

\[  g_{rc}^{~ hi} = \left( \sum _{j=1}^{m_{hi}} \left( \delta _{hij} (r,c) - \widehat{P}_{rc}^{~ c} ~  \delta _{hij} (\cdot ~  c) \right) ~  W_{hij} \right) ~  / ~  \widehat{N}_{\cdot c}  \]