
Introduction to Survey Sampling and Analysis Procedures

Overview: Survey Sampling and Analysis Procedures

This chapter introduces the SAS/STAT procedures for survey sampling and describes how you can use these procedures to analyze survey data.

Researchers often use sample survey methodology to obtain information about a large population by selecting and measuring a sample from that population. Because the items in a population vary, researchers apply scientific probability-based designs to select the sample. Probability-based selection reduces the risk of a distorted view of the population and enables statistically valid inferences to be made from the sample. See Lohr (1999), Kalton (1983), Cochran (1977), and Kish (1965) for more information about statistical sampling and analysis of complex survey data.

To select probability-based random samples from a study population, you can use the SURVEYSELECT procedure, which provides a variety of methods for probability sampling. To analyze sample survey data, you can use the SURVEYMEANS, SURVEYFREQ, SURVEYREG, and SURVEYLOGISTIC procedures, which incorporate the sample design into the analyses.
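To make the idea of probability-based selection concrete, the following is a minimal sketch in Python. It is not SAS syntax and is not meant to describe how the SURVEYSELECT procedure is implemented. It assumes the sampling frame is a plain list, and the function names `simple_random_sample` and `systematic_sample` are hypothetical. Each selected unit is returned with a sampling weight equal to the inverse of its inclusion probability.

```python
import random

def simple_random_sample(frame, n, seed):
    """Select n units without replacement; each unit has inclusion probability n/N."""
    rng = random.Random(seed)
    weight = len(frame) / n  # sampling weight = 1 / inclusion probability
    return [(unit, weight) for unit in rng.sample(frame, n)]

def systematic_sample(frame, n, seed):
    """Select units at a fixed interval N/n after a random start (assumes n <= N)."""
    rng = random.Random(seed)
    interval = len(frame) / n
    start = rng.random() * interval
    # Clamp the index to guard against floating-point rounding at the upper edge.
    indexes = [min(int(start + i * interval), len(frame) - 1) for i in range(n)]
    weight = len(frame) / n
    return [(frame[i], weight) for i in indexes]

# Hypothetical frame of 100 unit identifiers.
frame = list(range(100))
srs = simple_random_sample(frame, 10, seed=2024)
systematic = systematic_sample(frame, 10, seed=2024)
```

In both designs the weights sum to the frame size, which is what lets a weighted sample total estimate a population total.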

Many SAS/STAT procedures, such as the MEANS, FREQ, GLM, and LOGISTIC procedures, can compute sample means, produce crosstabulation tables, and estimate regression relationships. However, in most of these procedures, statistical inference is based on the assumption that the sample is drawn from an infinite population by simple random sampling. If the sample is in fact selected from a finite population by using a complex survey design, these procedures generally do not calculate the estimates and their variances according to the design actually used. Using analyses that are not appropriate for your sample design can lead to incorrect statistical inferences.
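The effect of ignoring the design can be seen in a small illustration. The sketch below is Python rather than SAS, and the data are hypothetical: a stratified sample in which a small stratum is heavily oversampled. The function names `naive_mean` and `design_weighted_mean` are assumptions made for this example only.

```python
# Hypothetical stratified sample: (stratum, observed value, sampling weight).
# Stratum A: 4 of 900 units sampled (weight 225); stratum B: 4 of 100 (weight 25).
sample = [
    ("A", 10.0, 225.0), ("A", 12.0, 225.0), ("A", 11.0, 225.0), ("A", 9.0, 225.0),
    ("B", 50.0, 25.0), ("B", 54.0, 25.0), ("B", 46.0, 25.0), ("B", 50.0, 25.0),
]

def naive_mean(rows):
    """Unweighted mean, as if the data came from simple random sampling."""
    return sum(y for _, y, _ in rows) / len(rows)

def design_weighted_mean(rows):
    """Weighted mean that uses the sampling weights from the design."""
    return sum(w * y for _, y, w in rows) / sum(w for _, _, w in rows)

print(naive_mean(sample))            # 30.25
print(design_weighted_mean(sample))  # 14.45
```

Because stratum B makes up half of the sample but only a tenth of the population, the unweighted mean is pulled far toward the stratum B values, while the weighted mean restores each stratum to its population share.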

The SURVEYMEANS, SURVEYFREQ, SURVEYREG, and SURVEYLOGISTIC procedures properly analyze complex survey data by taking into account the sample design. These procedures can be used for multistage or single-stage designs, with or without stratification, and with or without unequal weighting. The survey analysis procedures provide a choice of variance estimation methods, which include Taylor series linearization, balanced repeated replication (BRR), and the jackknife.
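As an illustration of replication-based variance estimation, the following Python sketch applies a delete-one jackknife to a weighted mean from a stratified sample. It is a simplified sketch under stated assumptions, not the procedures' implementation: every sampled unit is treated as its own primary sampling unit, each stratum has at least two units, and the finite population correction is ignored. The data and the function names are hypothetical. For comparison, `closed_form_variance` computes the textbook variance of a stratified mean, which is what linearization reduces to for this linear estimator.

```python
from statistics import variance

# Hypothetical stratified sample: (stratum, observed value, sampling weight).
rows = [
    ("A", 10.0, 225.0), ("A", 12.0, 225.0), ("A", 11.0, 225.0), ("A", 9.0, 225.0),
    ("B", 50.0, 25.0), ("B", 54.0, 25.0), ("B", 46.0, 25.0), ("B", 50.0, 25.0),
]

def weighted_mean(rows):
    return sum(w * y for _, y, w in rows) / sum(w for _, _, w in rows)

def jackknife_variance(rows):
    """Delete-one jackknife; requires at least two units in every stratum."""
    theta = weighted_mean(rows)
    counts = {}
    for h, _, _ in rows:
        counts[h] = counts.get(h, 0) + 1
    total = 0.0
    for drop, (h_drop, _, _) in enumerate(rows):
        n_h = counts[h_drop]
        scale = n_h / (n_h - 1)  # reweight the remaining units in the same stratum
        replicate = [
            (h, y, w * scale if h == h_drop else w)
            for i, (h, y, w) in enumerate(rows) if i != drop
        ]
        total += (n_h - 1) / n_h * (weighted_mean(replicate) - theta) ** 2
    return total

def closed_form_variance(rows):
    """Textbook variance of a stratified mean, without finite population correction."""
    total_weight = sum(w for _, _, w in rows)
    by_stratum = {}
    for h, y, w in rows:
        by_stratum.setdefault(h, []).append((y, w))
    result = 0.0
    for units in by_stratum.values():
        share = sum(w for _, w in units) / total_weight  # estimated stratum share
        result += share ** 2 * variance([y for y, _ in units]) / len(units)
    return result

jk = jackknife_variance(rows)
cf = closed_form_variance(rows)
```

In this example the two estimates agree up to rounding, because the weighted mean is linear in the observations when weights are constant within strata. For nonlinear statistics, such as ratios or regression coefficients, the methods can give somewhat different results.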

Table 14.1 briefly describes the sampling and analysis procedures in SAS/STAT software.

Table 14.1 Sampling and Analysis Procedures in SAS/STAT Software

PROC SURVEYSELECT

  Sampling Methods
    simple random sampling
    unrestricted random sampling (with replacement)
    systematic
    sequential
    probability proportional to size (PPS) sampling, with and without replacement
    PPS systematic
    PPS for two units per stratum
    PPS sequential with minimum replacement

  Allocation Methods
    proportional
    optimal
    Neyman

PROC SURVEYMEANS

  Statistics
    estimates of population means and totals
    estimates of population proportions
    estimates of population quantiles
    ratio estimates
    standard errors
    confidence limits
    hypothesis tests
    domain analysis

PROC SURVEYFREQ

  Tables
    one-way frequency tables
    two-way and multiway crosstabulation tables
    estimates of population totals and proportions
    standard errors
    confidence limits

  Analyses
    tests of goodness of fit
    tests of independence
    risks and risk differences
    odds ratios and relative risks

PROC SURVEYREG

  Analyses
    linear regression model fitting
    regression coefficients
    covariance matrices
    confidence limits
    hypothesis tests
    estimable functions
    contrasts
    predicted values and residuals
    domain analysis

PROC SURVEYLOGISTIC

  Analyses
    cumulative logit regression model fitting
    logit, probit, and complementary log-log link functions
    generalized logit regression model fitting
    regression coefficients
    covariance matrices
    confidence limits
    hypothesis tests
    odds ratios
    estimable functions
    contrasts
    model diagnostics
    domain analysis
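
Two of the allocation methods listed for PROC SURVEYSELECT, proportional and Neyman, can be sketched with their standard textbook formulas. The Python below is an illustration only, not SAS syntax. The stratum sizes and standard deviations are hypothetical, the function names are assumptions, and the results are left unrounded, whereas a real allocation has to produce whole sample sizes.

```python
def proportional_allocation(n, sizes):
    """Allocate n in proportion to stratum size: n_h = n * N_h / N."""
    total = sum(sizes.values())
    return {h: n * size / total for h, size in sizes.items()}

def neyman_allocation(n, sizes, std_devs):
    """Allocate n in proportion to N_h * S_h (stratum size times standard deviation)."""
    products = {h: sizes[h] * std_devs[h] for h in sizes}
    total = sum(products.values())
    return {h: n * p / total for h, p in products.items()}

# Hypothetical strata: population sizes and within-stratum standard deviations.
sizes = {"urban": 6000, "rural": 4000}
std_devs = {"urban": 10.0, "rural": 30.0}

proportional = proportional_allocation(100, sizes)
neyman = neyman_allocation(100, sizes, std_devs)
```

With these inputs, proportional allocation assigns 60 units to the urban stratum and 40 to the rural stratum, while Neyman allocation shifts about two thirds of the sample to the rural stratum because its values are more variable.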
