The SURVEYREG Procedure |
Recall that in the section Getting Started: SURVEYREG Procedure, you collected a stratified simple random sample from a junior high school to examine how household income and the number of children in a household affect students’ average weekly spending for ice cream. You can also use the same sample to estimate the average weekly spending among male and female students, respectively. This is often called domain analysis (subgroup analysis). You can use PROC SURVEYREG to perform domain analysis as in this example. The data set follows:
    data IceCreamDataDomain;
       input Grade Spending Income Gender$ @@;
       datalines;
    7   7  39  M   7   7  38  F   8  12  47  F   9  10  47  M
    7   1  34  M   7  10  43  M   7   3  44  M   8  20  60  F
    8  19  57  M   7   2  35  M   7   2  36  F   9  15  51  F
    8  16  53  F   7   6  37  F   7   6  41  M   7   6  39  M
    9  15  50  M   8  17  57  F   8  14  46  M   9   8  41  M
    9   8  41  F   9   7  47  F   7   3  39  F   7  12  50  M
    7   4  43  M   9  14  46  F   8  18  58  M   9   9  44  F
    7   2  37  F   7   1  37  M   7   4  44  M   7  11  42  M
    9   8  41  M   8  10  42  M   8  13  46  F   7   2  40  F
    9   6  45  F   9  11  45  M   7   2  36  F   7   9  46  F
    ;

In the data set IceCreamDataDomain, the variable Grade indicates a student’s grade, which is the stratification variable. The variable Spending contains the dollar amount of each student’s average weekly spending for ice cream. The variable Income specifies the household income, in thousands of dollars. The variable Gender indicates a student’s gender. The sampling weights are created by using the reciprocals of the probabilities of selection, as follows:

    data IceCreamDataDomain;
       set IceCreamDataDomain;
       if Grade=7 then Prob=20/1824;
       if Grade=8 then Prob=9/1025;
       if Grade=9 then Prob=11/1151;
       Weight=1/Prob;

The following data set contains the stratum population totals:

    data StudentTotals;
       input Grade _TOTAL_;
       datalines;
    7 1824
    8 1025
    9 1151
    ;
In the data set StudentTotals, the variable Grade is the stratification variable, and the variable _TOTAL_ contains the total numbers of students in the strata in the survey population.
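As a quick sanity check on the weights, the sketch below sums the variable Weight within each grade. It is an illustrative addition, not part of the original example. Because each weight is the stratum total divided by the stratum sample size, the sums should reproduce the _TOTAL_ values in StudentTotals (1824, 1025, and 1151), and the N column should show the stratum sample sizes of 20, 9, and 11.

```sas
/* Illustrative check: within each stratum, the sampling weights */
/* should add up to the stratum population total in StudentTotals */
proc means data=IceCreamDataDomain n sum maxdec=0;
   class Grade;    /* one row of output per stratum */
   var Weight;     /* Weight = 1/Prob from the DATA step above */
run;
```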
The following statements demonstrate how you can fit the regression of spending on household income separately within each gender domain (female students and male students):
    title1 'Ice Cream Spending Analysis';
    title2 'Domain Analysis by Gender';
    proc surveyreg data=IceCreamDataDomain total=StudentTotals;
       strata Grade;
       model Spending = Income;
       domain Gender;
    run;
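Note that the statements above contain no WEIGHT statement, so PROC SURVEYREG gives every observation a weight of 1 and the variable Weight is not used. The sums of Spending reported in the domain summaries (170 and 180) are accordingly the unweighted sums of the sample values. If you want the analysis to use the sampling weights, one possible variant is sketched below. The added WEIGHT statement and the modified second title are assumptions made for illustration, and the results of this variant would generally differ from the output shown in this example.

```sas
/* Sketch of a weighted variant (not the analysis that produced */
/* the output shown in this example)                            */
title1 'Ice Cream Spending Analysis';
title2 'Domain Analysis by Gender (Weighted)';
proc surveyreg data=IceCreamDataDomain total=StudentTotals;
   strata Grade;
   model Spending = Income;
   domain Gender;
   weight Weight;   /* assumption: apply the weights created as 1/Prob */
run;
```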
Output 86.7.1 gives a summary of the domains.
Ice Cream Spending Analysis |
Domain Analysis by Gender |

Gender=F

Domain Summary | |
---|---|
Number of Observations | 40 |
Number of Observations in Domain | 19 |
Number of Observations Not in Domain | 21 |
Mean of Spending | 8.94737 |
Sum of Spending | 170.00000 |

Gender=M

Domain Summary | |
---|---|
Number of Observations | 40 |
Number of Observations in Domain | 21 |
Number of Observations Not in Domain | 19 |
Mean of Spending | 8.57143 |
Sum of Spending | 180.00000 |
Output 86.7.2 shows the parameter estimates for the model within each domain.
Gender=F

Estimated Regression Coefficients

Parameter | Estimate | Standard Error | t Value | Pr > \|t\| |
---|---|---|---|---|
Intercept | -23.897418 | 2.38307272 | -10.03 | <.0001 |
Income | 0.737649 | 0.04973471 | 14.83 | <.0001 |

Note: The denominator degrees of freedom for the t tests is 37.

Gender=M

Estimated Regression Coefficients

Parameter | Estimate | Standard Error | t Value | Pr > \|t\| |
---|---|---|---|---|
Intercept | -23.342282 | 2.11458083 | -11.04 | <.0001 |
Income | 0.730052 | 0.04587826 | 15.91 | <.0001 |

Note: The denominator degrees of freedom for the t tests is 37.
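To read these estimates as fitted lines, it can help to write them out. The data contain 19 female and 21 male students, so the first set of estimates corresponds to Gender=F and the second to Gender=M. The income value of 45 (thousand dollars) used below is an arbitrary illustration, not a value taken from the original example.

$$
\begin{aligned}
\widehat{\text{Spending}}_{F} &= -23.897418 + 0.737649 \times \text{Income}, &
\widehat{\text{Spending}}_{F}(45) &= -23.897418 + 33.194205 \approx 9.30 \\
\widehat{\text{Spending}}_{M} &= -23.342282 + 0.730052 \times \text{Income}, &
\widehat{\text{Spending}}_{M}(45) &= -23.342282 + 32.852340 \approx 9.51
\end{aligned}
$$

Each t value is the estimate divided by its standard error, for example $0.737649 / 0.04973471 \approx 14.83$ for Income in the first domain. The 37 denominator degrees of freedom are consistent with the 40 sampled students minus the 3 strata.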
Copyright © 2009 by SAS Institute Inc., Cary, NC, USA. All rights reserved.