Recently replication methods have gained popularity for estimating variances in complex survey data analysis. One reason for this popularity is the relative simplicity of replication-based estimates, especially for nonlinear estimators; another is that modern computational capacity has made replication methods feasible for practical survey analysis. For more information, see Lohr (2010); Wolter (2007); Rust (1985); Dippo, Fay, and Morganstein (1984); Rao and Shao (1999); Rao, Wu, and Yue (1992); Rao and Shao (1996).
Replication methods draw multiple replicates (also called subsamples) from a full sample according to a specific resampling scheme. The most commonly used resampling schemes are the balanced repeated replication (BRR) method and the jackknife method. For each replicate, the original weights are modified for the PSUs in the replicates to create replicate weights. The statistics of interest are estimated by using the replicate weights for each replicate. Then the variances of parameters of interest are estimated by the variability among the estimates derived from these replicates. You can use the REPWEIGHTS statement to provide your own replicate weights for variance estimation.
The balanced repeated replication (BRR) method requires that the full sample be drawn by using a stratified sample design
                  with two primary sampling units (PSUs) per stratum. Let H be the total number of strata. The total number of replicates R is the smallest multiple of 4 that is greater than H. However, if you prefer a larger number of replicates, you can specify the REPS=number
                   option. If a number  number Hadamard matrix
                   cannot be constructed, the number of replicates is increased until a Hadamard matrix becomes available.
 number Hadamard matrix
                   cannot be constructed, the number of replicates is increased until a Hadamard matrix becomes available. 
               
Each replicate is obtained by deleting one PSU per stratum according to the corresponding Hadamard matrix and adjusting the original weights for the remaining PSUs. The new weights are called replicate weights.
Replicates are constructed by using the first H columns of the  Hadamard matrix
                  . The rth (
 Hadamard matrix
                  . The rth ( ) replicate is drawn from the full sample according to the rth row of the Hadamard matrix as follows:
) replicate is drawn from the full sample according to the rth row of the Hadamard matrix as follows: 
               
If the  element of the Hadamard matrix is 1, then the first PSU of stratum h is included in the rth replicate and the second PSU of stratum h is excluded.
 element of the Hadamard matrix is 1, then the first PSU of stratum h is included in the rth replicate and the second PSU of stratum h is excluded. 
                        
If the  element of the Hadamard matrix is –1, then the second PSU of stratum h is included in the rth replicate and the first PSU of stratum h is excluded.
 element of the Hadamard matrix is –1, then the second PSU of stratum h is included in the rth replicate and the first PSU of stratum h is excluded. 
                        
Note that the "first" and "second" PSUs are determined by data order in the input data set. Thus, if you reorder the data set and perform the same analysis by using BRR method, you might get slightly different results, because the contents in each replicate sample might change.
The replicate weights of the remaining PSUs in each half-sample are then doubled to their original weights. For more details about the BRR method, see Wolter (2007) and Lohr (2010).
By default, an appropriate Hadamard matrix is generated automatically to create the replicates. You can request that the Hadamard matrix be displayed by specifying the VARMETHOD=BRR(PRINTH) method-option. If you provide a Hadamard matrix by specifying the VARMETHOD=BRR(HADAMARD=) method-option, then the replicates are generated according to the provided Hadamard matrix.
You can use the VARMETHOD=BRR(OUTWEIGHTS=) method-option to save the replicate weights into a SAS data set.
Suppose that  is a population parameter of interest. Let
 is a population parameter of interest. Let  be the estimate from the full sample for
 be the estimate from the full sample for  . Let
. Let  be the estimate from the rth replicate subsample by using replicate weights. PROC SURVEYMEANS estimates the variance of
 be the estimate from the rth replicate subsample by using replicate weights. PROC SURVEYMEANS estimates the variance of  by
 by 
               
![\[ \widehat{V}(\hat{\theta }) = \frac{1}{R} \sum _{r=1}^ R \left( \hat{\theta _ r} - \hat{\theta } \right)^2 \]](images/statug_surveymeans0240.png)
with H degrees of freedom, where H is the number of strata.
If a parameter cannot be computed from one or more replicates, then the variance estimate is computed by using those replicates from which the parameter can be estimated. For example, suppose the parameter is a ratio. If a replicate r contains observations such that the denominator of the ratio is zero, then the ratio cannot be computed from replicate r. In this case, the BRR variance estimate is computed as
![\[ \widehat{V}(\hat{\theta }) = \frac{1}{R'} \sum _{r=1}^{R'} \left( \hat{\theta }_ r - \hat{\theta } \right)^2 \]](images/statug_surveymeans0241.png)
 where the summation is over the replicates where the parameter  can be computed, and
 can be computed, and  is the number of those replicates.
 is the number of those replicates. 
               
Fay’s method is a modification of the BRR method, and it requires a stratified sample design with two primary sampling units (PSUs) per stratum. The total number of replicates R is the smallest multiple of 4 that is greater than the total number of strata H. However, if you prefer a larger number of replicates, you can specify the REPS= method-option.
For each replicate, Fay’s method uses a Fay coefficient  to impose a perturbation of the original weights in the full sample that is gentler than using only half-samples, as in the
                  traditional BRR method. The Fay coefficient
 to impose a perturbation of the original weights in the full sample that is gentler than using only half-samples, as in the
                  traditional BRR method. The Fay coefficient  can be set by specifying the FAY =
 can be set by specifying the FAY =  method-option. By default,
                   method-option. By default,  if the FAY
                   method-option is specified without providing a value for
 if the FAY
                   method-option is specified without providing a value for  (Judkins 1990; Rao and Shao 1999). When
 (Judkins 1990; Rao and Shao 1999). When  , Fay’s method becomes the traditional BRR
                   method. For more details, see Dippo, Fay, and Morganstein (1984); Fay (1984, 1989); Judkins (1990).
, Fay’s method becomes the traditional BRR
                   method. For more details, see Dippo, Fay, and Morganstein (1984); Fay (1984, 1989); Judkins (1990). 
               
Let H be the number of strata. Replicates are constructed by using the first H columns of the  Hadamard matrix
                  , where R is the number of replicates,
 Hadamard matrix
                  , where R is the number of replicates,  . The rth (
. The rth ( ) replicate is created from the full sample according to the rth row of the Hadamard matrix as follows:
) replicate is created from the full sample according to the rth row of the Hadamard matrix as follows: 
               
If the  element of the Hadamard matrix is 1, then the full sample weight of the first PSU in stratum h is multiplied by
 element of the Hadamard matrix is 1, then the full sample weight of the first PSU in stratum h is multiplied by  and the full sample weight of the second PSU is multiplied by
 and the full sample weight of the second PSU is multiplied by  to obtain the rth replicate weights.
 to obtain the rth replicate weights. 
                        
If the  element of the Hadamard matrix is –1, then the full sample weight of the first PSU in stratum h is multiplied by
 element of the Hadamard matrix is –1, then the full sample weight of the first PSU in stratum h is multiplied by  and the full sample weight of the second PSU is multiplied by
 and the full sample weight of the second PSU is multiplied by  to obtain the rth replicate weights.
 to obtain the rth replicate weights. 
                        
You can use the VARMETHOD=BRR(OUTWEIGHTS=) method-option to save the replicate weights into a SAS data set.
By default, an appropriate Hadamard matrix is generated automatically to create the replicates. You can request that the Hadamard matrix be displayed by specifying the VARMETHOD=BRR(PRINTH) method-option. If you provide a Hadamard matrix by specifying the VARMETHOD=BRR(HADAMARD=) method-option, then the replicates are generated according to the provided Hadamard matrix.
Suppose that  is a population parameter of interest. Let
 is a population parameter of interest. Let  be the estimate from the full sample for
 be the estimate from the full sample for  . Let
. Let  be the estimate from the rth replicate subsample by using replicate weights. PROC SURVEYMEANS estimates the variance of
 be the estimate from the rth replicate subsample by using replicate weights. PROC SURVEYMEANS estimates the variance of  by
 by 
               
![\[ \widehat{V}(\hat{\theta }) = \frac{1}{R{(1-\epsilon )}^2} \sum _{r=1}^ R \left( \hat{\theta _ r} - \hat{\theta } \right)^2 \]](images/statug_surveymeans0249.png)
with H degrees of freedom, where H is the number of strata.
 
                  
                  
                  
                  
                  
                  
                  
                  
                  
                  
                  
                  
                  
                  
                  The jackknife method of variance estimation deletes one PSU at a time from the full sample to create replicates. The total
                  number of replicates R is the same as the total number of PSUs. In each replicate, the sample weights of the remaining PSUs are modified by the
                  jackknife coefficient  . The modified weights are called replicate weights.
. The modified weights are called replicate weights. 
               
The jackknife coefficient and replicate weights are described as follows.
 If there is no stratification in the sample design (no STRATA
                      statement), the jackknife coefficients  are the same for all replicates:
 are the same for all replicates: 
                  
![\[ \alpha _ r=\frac{R-1}{R} \, \, \, \, \, \mbox{where } r=1, 2, ..., R \]](images/statug_surveymeans0250.png)
 Denote the original weight in the full sample for the jth member of the ith PSU as  . If the ith PSU is included in the rth replicate (
. If the ith PSU is included in the rth replicate ( ), then the corresponding replicate weight for the jth member of the ith PSU is defined as
), then the corresponding replicate weight for the jth member of the ith PSU is defined as 
                  
![\[ w^{(r)}_{ij}={w_{ij}}/{\alpha _ r} \]](images/statug_surveymeans0252.png)
If the sample design involves stratification, each stratum must have at least two PSUs to use the jackknife method.
Let stratum  be the stratum from which a PSU is deleted for the rth replicate. Stratum
 be the stratum from which a PSU is deleted for the rth replicate. Stratum  is called the donor stratum. Let
 is called the donor stratum. Let  be the total number of PSUs in the donor stratum
 be the total number of PSUs in the donor stratum  . The jackknife coefficients are defined as
. The jackknife coefficients are defined as 
                  
![\[ \alpha _ r=\frac{n_{{{\tilde h}_ r}}-1}{n_{{\tilde h}_ r}} \, \, \, \, \, \mbox{where } r=1, 2, ..., R \]](images/statug_surveymeans0256.png)
 Denote the original weight in the full sample for the jth member of the ith PSU as  . If the ith PSU is included in the rth replicate (
. If the ith PSU is included in the rth replicate ( ), then the corresponding replicate weight for the jth member of the ith PSU is defined as
), then the corresponding replicate weight for the jth member of the ith PSU is defined as 
                  
![\[ w^{(r)}_{ij}=\left\{ \begin{array}{ll} w_{ij} & \mbox{if }\mi{i}\mbox{th PSU is not in the donor stratum }\tilde h_ r \\ w_{ij}/\alpha _ r & \mbox{if }\mi{i}\mbox{th PSU is in the donor stratum }\tilde h_ r \end{array} \right. \]](images/statug_surveymeans0257.png)
You can use the VARMETHOD=JACKKNIFE(OUTJKCOEFS=) method-option to save the jackknife coefficients into a SAS data set and use the VARMETHOD=JACKKNIFE(OUTWEIGHTS=) method-option to save the replicate weights into a SAS data set.
If you provide your own replicate weights with a REPWEIGHTS
                      statement, then you can also provide corresponding jackknife coefficients with the JKCOEFS=
                      option. If you provide replicate weights but do not provide jackknife coefficients, PROC SURVEYMEANS uses  as the jackknife coefficient for all replicates.
 as the jackknife coefficient for all replicates. 
                  
Suppose that  is a population parameter of interest. Let
 is a population parameter of interest. Let  be the estimate from the full sample for
 be the estimate from the full sample for  . Let
. Let  be the estimate from the rth replicate subsample by using replicate weights. PROC SURVEYMEANS estimates the variance of
 be the estimate from the rth replicate subsample by using replicate weights. PROC SURVEYMEANS estimates the variance of  by
 by 
                  
![\[ \widehat{V}(\hat{\theta }) = \sum _{r=1}^ R \alpha _ r \left( \hat{\theta _ r} - \hat{\theta } \right)^2 \]](images/statug_surveymeans0259.png)
 with  degrees of freedom, where R is the number of replicates and H is the number of strata, or R – 1 when there is no stratification.
 degrees of freedom, where R is the number of replicates and H is the number of strata, or R – 1 when there is no stratification. 
                  
A Hadamard matrix  is a square matrix whose elements are either 1 or –1 such that
 is a square matrix whose elements are either 1 or –1 such that 
               
![\[ \mb{H}\mb{H}’=k\mb{I} \]](images/statug_surveymeans0262.png)
 where k is the dimension of  and
 and  is the identity matrix of order k. The order k is necessarily 1, 2, or a positive integer that is a multiple of 4.
 is the identity matrix of order k. The order k is necessarily 1, 2, or a positive integer that is a multiple of 4. 
               
For example, the following matrix is a Hadamard matrix of dimension k = 8:
![\[ \begin{array}{rrrrrrrr} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1\\ 1 & -1 & 1 & -1 & 1 & -1 & 1 & -1\\ 1 & 1 & -1 & -1 & 1 & 1 & -1 & -1\\ 1 & -1 & -1 & 1 & 1 & -1 & -1 & 1\\ 1 & 1 & 1 & 1 & -1 & -1 & -1 & -1\\ 1 & -1 & 1 & -1 & -1 & 1 & -1 & 1\\ 1 & 1 & -1 & -1 & -1 & -1 & 1 & 1\\ 1 & -1 & -1 & 1 & -1 & 1 & 1 & -1 \end{array} \]](images/statug_surveymeans0264.png)