### Example 94.7 Domain Analysis

Recall the example in the section Getting Started: SURVEYREG Procedure, which analyzed a stratified simple random sample from a junior high school to examine how household income and the number of children in a household affect students’ average weekly spending for ice cream. You can use the same sample to analyze the average weekly spending among male and female students. Because student gender is unrelated to the design of the sample, this kind of analysis is called domain analysis (subgroup analysis).

This example shows how you can use PROC SURVEYREG to perform domain analysis. The data set follows:

```
data IceCreamDataDomain;
   input Grade Spending Income Gender$ @@;
   datalines;
7   7  39  M   7   7  38  F   8  12  47  F
9  10  47  M   7   1  34  M   7  10  43  M
7   3  44  M   8  20  60  F   8  19  57  M
7   2  35  M   7   2  36  F   9  15  51  F
8  16  53  F   7   6  37  F   7   6  41  M
7   6  39  M   9  15  50  M   8  17  57  F
8  14  46  M   9   8  41  M   9   8  41  F
9   7  47  F   7   3  39  F   7  12  50  M
7   4  43  M   9  14  46  F   8  18  58  M
9   9  44  F   7   2  37  F   7   1  37  M
7   4  44  M   7  11  42  M   9   8  41  M
8  10  42  M   8  13  46  F   7   2  40  F
9   6  45  F   9  11  45  M   7   2  36  F
7   9  46  F
;

/* Selection probability = stratum sample size / stratum population size */
data IceCreamDataDomain;
   set IceCreamDataDomain;
   if Grade=7 then Prob=20/1824;
   if Grade=8 then Prob=9/1025;
   if Grade=9 then Prob=11/1151;
   Weight=1/Prob;
run;
```

In the data set `IceCreamDataDomain`, the variable `Grade` indicates a student’s grade, which is the stratification variable. The variable `Spending` contains the dollar amount of each student’s average weekly spending for ice cream. The variable `Income` specifies the household income, in thousands of dollars. The variable `Gender` indicates a student’s gender. The variable `Weight` contains the sampling weights, which are the reciprocals of the probabilities of selection (`Weight=1/Prob`). The population totals for the strata are stored in the following data set:

```
data StudentTotals;
   input Grade _TOTAL_;
   datalines;
7 1824
8 1025
9 1151
;
```

In the data set `StudentTotals`, the variable `Grade` is the stratification variable, and the variable `_TOTAL_` contains the total numbers of students in the strata in the survey population.
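
As a quick sanity check on the weights (a sketch that is not part of the original example), the weights within each stratum should sum to the corresponding `_TOTAL_` value, because each weight is the reciprocal of a selection probability in a stratified simple random sample. Assuming the `Weight` variable has been created in `IceCreamDataDomain`, PROC MEANS can display those sums:

```
/* Illustrative check, not part of the original example:            */
/* count the sampled students and sum the weights in each stratum.  */
proc means data=IceCreamDataDomain n sum maxdec=1;
   class Grade;
   var Weight;
run;
```

If the weights are correct, the sums for grades 7, 8, and 9 should match the `_TOTAL_` values 1824, 1025, and 1151.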

The following statements demonstrate how you can analyze the relationship between spending and income among male and female students:

```
title1 'Ice Cream Spending Analysis';
title2 'Domain Analysis by Gender';
proc surveyreg data=IceCreamDataDomain total=StudentTotals;
   strata Grade;
   model Spending = Income;
   domain Gender;
   weight Weight;
run;
```

Output 94.7.1 gives a summary of the domains.

Output 94.7.1: Domain Analysis Summary

 Ice Cream Spending Analysis Domain Analysis by Gender

The SURVEYREG Procedure

Gender=F

Domain Regression Analysis for Variable Spending

| Domain Summary | |
|---|---|
| Number of Observations | 40 |
| Number of Observations in Domain | 19 |
| Number of Observations Not in Domain | 21 |
| Sum of Weights in Domain | 1926.9 |
| Weighted Mean of Spending | 9.37611 |
| Weighted Sum of Spending | 18066.5 |

 Ice Cream Spending Analysis Domain Analysis by Gender

The SURVEYREG Procedure

Gender=M

Domain Regression Analysis for Variable Spending

| Domain Summary | |
|---|---|
| Number of Observations | 40 |
| Number of Observations in Domain | 21 |
| Number of Observations Not in Domain | 19 |
| Sum of Weights in Domain | 2073.1 |
| Weighted Mean of Spending | 8.92305 |
| Weighted Sum of Spending | 18498.7 |
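
The weighted means and sums in the domain summary can be cross-checked with PROC SURVEYMEANS, which also supports a DOMAIN statement. The following step is an illustrative sketch rather than part of the original example. It assumes the data sets `IceCreamDataDomain` and `StudentTotals` described earlier and requests only the `MEAN` and `SUM` statistics:

```
/* Illustrative cross-check, not part of the original example:      */
/* estimate the mean and total of Spending within each gender       */
/* domain under the same stratified design and weights.             */
proc surveymeans data=IceCreamDataDomain total=StudentTotals mean sum;
   strata Grade;
   var Spending;
   domain Gender;
   weight Weight;
run;
```

The point estimates of the mean and total for each gender should agree with the Weighted Mean of Spending and Weighted Sum of Spending values in Output 94.7.1.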

Output 94.7.2 shows the parameter estimates for the model within each domain.

Output 94.7.2: Parameter Estimates within Domain

 Ice Cream Spending Analysis Domain Analysis by Gender

The SURVEYREG Procedure

Gender=F

Domain Regression Analysis for Variable Spending

Estimated Regression Coefficients

| Parameter | Estimate | Standard Error | t Value | Pr > \|t\| |
|---|---|---|---|---|
| Intercept | -23.751681 | 2.30795437 | -10.29 | <.0001 |
| Income | 0.735366 | 0.04757001 | 15.46 | <.0001 |

 Note: The denominator degrees of freedom for the t tests is 37.

 Ice Cream Spending Analysis Domain Analysis by Gender

The SURVEYREG Procedure

Gender=M

Domain Regression Analysis for Variable Spending

Estimated Regression Coefficients

| Parameter | Estimate | Standard Error | t Value | Pr > \|t\| |
|---|---|---|---|---|
| Intercept | -23.213291 | 2.13361241 | -10.88 | <.0001 |
| Income | 0.729419 | 0.04589801 | 15.89 | <.0001 |

 Note: The denominator degrees of freedom for the t tests is 37.

In this example, the effect `Income` is significant (p < .0001) in both the female and the male domain models, and the two fitted models are quite similar: the `Income` coefficients are 0.735 and 0.729, and the intercepts are -23.75 and -23.21. In many other cases, regression models vary from subgroup to subgroup.
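
The t values and p-values in Output 94.7.2 follow directly from the printed estimates, standard errors, and denominator degrees of freedom. The following DATA step is a minimal sketch, not part of the original example, that recomputes them for `Income` in the `Gender=F` domain. The value 37 is taken from the note in the output, and it is consistent with 40 observations minus 3 strata. The variable names are chosen here for illustration.

```
/* Illustrative check, not part of the original example:            */
/* recompute the t value and two-sided p-value for Income           */
/* in the Gender=F domain from the values in Output 94.7.2.         */
data _null_;
   df       = 40 - 3;          /* observations minus strata; the note reports 37 */
   estimate = 0.735366;        /* Income estimate, Gender=F                      */
   stderr   = 0.04757001;      /* standard error of the estimate                 */
   t        = estimate / stderr;
   p        = 2 * (1 - probt(abs(t), df));   /* two-sided p-value from the t distribution */
   put df= t= 8.2 p= pvalue6.4;
run;
```

The values written to the log should agree with the t Value and Pr > |t| columns for `Income` in the `Gender=F` table. Substituting the `Gender=M` estimate and standard error gives the corresponding check for the male domain.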