## Example 20.5 Creating an Output Data Set: Subsetting the Data

This example demonstrates how you can create an output data set with the ODS OUTPUT statement and also use data set selection keywords to limit the output that ODS writes to a SAS data set. The data set, called Color, contains the eye color and hair color of children from two different regions of Europe. The data are recorded as cell counts, where the variable Count contains the number of children who exhibit each of the 15 combinations of eye and hair color. The following statements create the SAS data set:

```
title 'Hair Color of European Children';

data Color;
   input Region Eyes $ Hair $ Count @@;
   label Eyes  ='Eye Color'
         Hair  ='Hair Color'
         Region='Geographic Region';
   datalines;
1 blue  fair   23  1 blue  red     7  1 blue  medium 24
1 blue  dark   11  1 green fair   19  1 green red     7
1 green medium 18  1 green dark   14  1 brown fair   34
1 brown red     5  1 brown medium 41  1 brown dark   40
1 brown black   3  2 blue  fair   46  2 blue  red    21
2 blue  medium 44  2 blue  dark   40  2 blue  black   6
2 green fair   50  2 green red    31  2 green medium 37
2 green dark   23  2 brown fair   56  2 brown red    42
2 brown medium 53  2 brown dark   54  2 brown black  13
;
```
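As a quick check that the DATA step read the cell counts as intended, you might summarize Count within each region. The following is a sketch rather than part of the original example; the choice of PROC MEANS and its options is an assumption about how you might verify the input. The trailing @@ in the INPUT statement holds each data line so that several observations can be read from it.

```sas
/* Sketch only: summarize the cells that were read for each region. */
proc means data=Color n sum maxdec=0;
   class Region;      /* one summary row per region            */
   var Count;         /* N = cells read, Sum = total children  */
run;
```

Run this check before the ODS SELECT NONE statement is submitted; after that statement, the results are not displayed.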

The following statements exclude all output and sort the observations in the Color data set by the Region variable:

```
ods select none;

proc sort data=Color;
   by Region;
run;
```

The following ODS OUTPUT statement creates the ChiSq table as a SAS data set named myStats:

```
ods output ChiSq=myStats(drop=Table
                         where=(Statistic =: 'Chi' or
                                Statistic =: 'Like'));
```

You specify the table name in the ODS OUTPUT statement. The DROP= data set option excludes the Table variable from the new data set. The WHERE= data set option selects the observations that are written to the new data set myStats: specifically, those whose Statistic value begins with 'Chi' or 'Like'. The =: operator compares only the leading characters of the value with the specified prefix.
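To see how the =: comparison selects rows, consider the following stand-alone sketch. The data set name Demo and the variable Keep are hypothetical, and the four values are typed in by hand to resemble statistic labels from a chi-square table; they are not read from PROC FREQ output. Run the sketch separately from the statements in this example, because an intervening step might clear the pending ODS OUTPUT request.

```sas
/* Sketch only: Demo and Keep are illustrative names.                   */
data Demo;
   length Statistic $ 30;
   input Statistic $ 1-30;
   /* =: compares only as many characters as the shorter operand has,  */
   /* so each test checks whether Statistic starts with the prefix.    */
   Keep = (Statistic =: 'Chi' or Statistic =: 'Like');
   put Statistic= Keep=;      /* write each value and its flag to the log */
   datalines;
Chi-Square
Likelihood Ratio Chi-Square
Mantel-Haenszel Chi-Square
Phi Coefficient
;
```

Keep is 1 for the first two values and 0 for the last two. 'Mantel-Haenszel Chi-Square' is not selected because it contains 'Chi' but does not begin with it.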

The following statements create Output 20.5.1:

```
proc freq data=Color order=data;
   weight Count;
   tables Eyes*Hair / testp=(30 12 30 25 3);
   by Region;
run;

ods select all;
proc print data=myStats noobs;
run;
```

PROC FREQ creates and analyzes a crosstabulation table of the two categorical variables Eyes and Hair for each value of the variable Region. The ODS SELECT ALL statement then restores the display of output so that PROC PRINT can display the myStats data set.

Output 20.5.1 Output Data Set from PROC FREQ and ODS

Hair Color of European Children

| Region | Statistic | DF | Value | Prob |
| --- | --- | --- | --- | --- |
| 1 | Chi-Square | 8 | 12.6331 | 0.1251 |
| 1 | Likelihood Ratio Chi-Square | 8 | 14.1503 | 0.0779 |
| 2 | Chi-Square | 8 | 18.2839 | 0.0192 |
| 2 | Likelihood Ratio Chi-Square | 8 | 23.3021 | 0.0030 |
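
Because myStats is an ordinary SAS data set, you can process it further with any DATA step or procedure. The following sketch keeps only the tests whose p-value falls below a cutoff; the data set name SigStats and the 0.05 cutoff are assumptions chosen for illustration and are not part of the original example.

```sas
/* Sketch only: SigStats and the 0.05 cutoff are illustrative choices. */
data SigStats;
   set myStats;
   where Prob < 0.05;     /* keep rows whose p-value is below the cutoff */
run;

proc print data=SigStats noobs;
run;
```

With the values shown in Output 20.5.1, only the two rows for Region 2 satisfy this condition.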