The SURVEYFREQ Procedure
Domain Analysis
PROC SURVEYFREQ provides domain analysis through its multiway table capability. Domain analysis refers to the computation of statistics for subpopulations, or domains, in addition to the computation of statistics for the entire study population. Formation of subpopulations can be unrelated to the sample design, and so the domain sample sizes can actually be random variables. Domain analysis takes this variability into account by using the entire sample in estimating the variance of domain estimates. Domain analysis is also known as subgroup analysis, subpopulation analysis, or subdomain analysis. For more information about domain analysis, see Lohr (1999), Cochran (1977), and Fuller et al. (1989).
To request domain analysis with PROC SURVEYFREQ, include the domain variables in your TABLES statement request. For example, specifying DOMAIN * A * B in a TABLES statement produces a separate two-way table of A by B for each level of DOMAIN. If your domains are formed by more than one variable, you can specify DomainVariable_1 * DomainVariable_2 * A * B, for example, to obtain a two-way table of A by B for each domain formed by a combination of levels of DomainVariable_1 and DomainVariable_2. See Example 83.2 for an example of domain analysis.
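As an illustration, the following statements sketch a domain request of the form DOMAIN * A * B. The data set name MySurvey and the variables Stratum, PSU, SamplingWeight, Region, Gender, and Response are hypothetical and do not come from this documentation; substitute the design and analysis variables from your own survey.

```sas
/* Hypothetical data set and variable names, for illustration only */
proc surveyfreq data=MySurvey;
   strata Stratum;          /* assumed stratification variable     */
   cluster PSU;             /* assumed primary sampling unit       */
   weight SamplingWeight;   /* assumed sampling weight             */
   /* Region is the domain variable: the procedure produces one   */
   /* Gender-by-Response table for each level of Region           */
   tables Region * Gender * Response;
run;
```

Because Region appears in the TABLES request rather than in a BY statement, the entire sample is used when the variances of the domain estimates are computed.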
If you specify DOMAIN * A in a TABLES statement, the values of the variable DOMAIN form the table rows. The two-way table lists levels of the variable A within each level of the row variable DOMAIN. Specify the ROW option in the TABLES statement to obtain the row percentages and their standard errors. This provides the one-way distribution of A for each domain, or level of the variable DOMAIN.
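The following statements are a minimal sketch of a DOMAIN * A request with the ROW option. The data set MySurvey and the variables Stratum, PSU, SamplingWeight, Region, and Response are assumed names chosen for illustration.

```sas
/* Hypothetical data set and variable names, for illustration only */
proc surveyfreq data=MySurvey;
   strata Stratum;          /* assumed stratification variable */
   cluster PSU;             /* assumed primary sampling unit   */
   weight SamplingWeight;   /* assumed sampling weight         */
   /* Region (the domain variable) forms the table rows;       */
   /* ROW requests row percentages and their standard errors   */
   tables Region * Response / row;
run;
```

In the resulting table, the row percentages within each level of Region give the estimated distribution of Response for that domain.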
Including the domain variables in a TABLES statement request gives a different analysis from that obtained by using a BY statement, which provides completely separate analyses of the BY groups. You can use the BY statement to analyze the data set by subgroups, but it is critical to note that this does not produce a valid domain analysis: each BY group is analyzed as if it were a separate sample, so the variability in the subgroup sample size is not taken into account. The BY statement is appropriate only when the number of units in each subgroup is known with certainty. When the subgroup sample size is a random variable, include the domain variables in your TABLES statement request.
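To make the distinction concrete, the following sketch places the two approaches side by side. All data set and variable names (MySurvey, Stratum, PSU, SamplingWeight, Region, Response) are hypothetical. The first step is shown only for contrast and is not a domain analysis.

```sas
/* Hypothetical data set and variable names, for illustration only */

/* Separate analyses of the BY groups: appropriate only when the   */
/* number of units in each Region is known with certainty.         */
/* BY-group processing expects the input sorted by the BY variable. */
proc sort data=MySurvey;
   by Region;
run;

proc surveyfreq data=MySurvey;
   by Region;
   strata Stratum;
   cluster PSU;
   weight SamplingWeight;
   tables Response;
run;

/* Domain analysis: Region is part of the TABLES request, so the   */
/* entire sample is used in estimating the variances.              */
proc surveyfreq data=MySurvey;
   strata Stratum;
   cluster PSU;
   weight SamplingWeight;
   tables Region * Response / row;
run;
```

The two steps analyze the same subgroups, but only the second accounts for the randomness of the subgroup sample sizes when it estimates variances.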
Copyright © 2009 by SAS Institute Inc., Cary, NC, USA. All rights reserved.