The TESTP= and TESTF= options in PROC FREQ specify the null hypothesis (expected) proportions or frequencies for a chi-square goodness-of-fit test on a one-way table. Prior to SAS 9.3 TS1M2, these options could not read expected values from a data set. Beginning in SAS 9.3 TS1M2, expected values stored in a data set can be read directly by specifying the data set name in either option. This note further discusses and illustrates this capability. In earlier releases, the method below can be used to read proportions stored in a SAS data set: it stores the expected values in a macro variable, which is then specified in the TESTP= or TESTF= option.
Suppose you want to test whether children's hair color follows a specified multinomial distribution. These statements create the data set containing the results from a survey of children.
data Color;
input Hair $ Count;
label Hair='Hair Color';
datalines;
fair 76
dark 19
medium 83
red 65
black 3
;
You hypothesize that the distribution of hair color is 30% fair, 12% dark, 30% medium, 25% red, and 3% black. These hypothesized proportions are stored in the data set below. Note that the ordering of the levels for HAIR is identical to that in the COLOR data set above and the proportions sum to one.
data true_vals;
input Hair $ p;
datalines;
fair .30
dark .12
medium .30
red .25
black .03
;
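Because the hypothesized proportions must sum to one, it can help to verify this before they are passed to PROC FREQ. The following DATA step is a minimal sketch of such a check; the accumulator name `total` and the tolerance of 1e-8 are illustrative assumptions, not part of the original method.

```sas
/* Sketch: confirm that the hypothesized proportions sum to one.    */
/* The name "total" and the 1e-8 tolerance are assumptions chosen   */
/* for illustration only.                                           */
data _null_;
   set true_vals end=last;
   total + p;                      /* sum statement: retained running total */
   if last then do;
      if abs(total - 1) > 1e-8 then
         put 'WARNING: proportions sum to ' total 'rather than 1.';
      else
         put 'NOTE: proportions sum to 1.';
   end;
run;
```

The tolerance allows for floating-point rounding when the decimal proportions are added.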
To use these proportions in the TESTP= option, create a single macro variable (&VALUES) whose value is the list of the proportions separated by spaces. The CALL SYMPUT routine below builds the macro variable by appending a space and each proportion to the list as the DATA step reads through the data set.
%let values=;
data _null_;
set true_vals;
call symput('values',symget('values')||' '||trim(left(p)));
run;
This statement displays the contents of the macro variable in the SAS log.
%put &values;
0.3 0.12 0.3 0.25 0.03
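As an alternative to the CALL SYMPUT loop, the same space-separated list can be built with the INTO clause of PROC SQL. This is a sketch of a different technique, not the method used in this note. PROC SQL does not formally guarantee row order without an ORDER BY clause, so the resulting list should be checked against the order of the levels in the COLOR data set.

```sas
/* Sketch: build &VALUES with PROC SQL instead of CALL SYMPUT.      */
/* SEPARATED BY concatenates the values of p from all rows into     */
/* a single macro variable, with a space between values.            */
proc sql noprint;
   select p into :values separated by ' '
   from true_vals;
quit;

/* Display the list in the SAS log to confirm its contents. */
%put &values;
```

Numeric values are converted to text when they are stored in the macro variable, so proportions with many significant digits may be rounded. Applying the PUT function with an explicit format in the SELECT clause is one way to control this.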
The following PROC FREQ step tests the hypothesis that hair color has the specified proportions. The option ORDER=DATA orders the frequency table values (hair color) by their order in the data set. The TABLES statement requests a frequency table for hair color, and the option NOCUM suppresses the display of the cumulative frequencies and percentages. The hypothesized proportions for the chi-square test are included in the TESTP= option by specifying the macro variable.
proc freq data=Color order=data;
weight Count;
tables Hair / nocum testp=(&values);
title 'Hair Color of European Children';
run;
The displayed frequency table lists the hair colors in the order in which they appeared in the data set. The "Test Percent" column lists the hypothesized proportions for the chi-square test. It is good practice to always check that the hypothesized proportions are associated with the intended variable levels.
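The check described above can also be done programmatically before PROC FREQ runs. The sketch below pairs the observations of the two data sets by position with a one-to-one merge (a MERGE statement with no BY statement) and writes a message to the log for any pair whose hair color differs. The variable name `Hair_p` and the message text are illustrative assumptions.

```sas
/* Sketch: verify that COLOR and TRUE_VALS list the levels of Hair  */
/* in the same order. Hair_p is a hypothetical name that holds the  */
/* level read from TRUE_VALS.                                       */
data _null_;
   merge Color true_vals(rename=(Hair=Hair_p));   /* no BY: pair by position */
   if Hair ne Hair_p then
      put 'WARNING: level mismatch at observation ' _n_ ': ' Hair= Hair_p=;
run;
```

If no message appears in the log, the levels line up, and with ORDER=DATA each proportion in the TESTP= list is matched to the intended hair color.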
PROC FREQ computes the chi-square goodness-of-fit statistic. The statistic is not significant at the 0.05 level (p=0.1008), meaning that the observed frequencies do not depart significantly from the hypothesized values.
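To see where the test result comes from, the Pearson chi-square statistic can be reproduced by hand from the two data sets. The following is a minimal sketch, assuming the levels are in the same order in both data sets. The names `n`, `Expected`, `ChiSq`, `df`, `pvalue`, and `Hair_p` are illustrative and do not appear in the original note.

```sas
/* Total sample size: the sum of the survey counts (246). */
proc sql noprint;
   select sum(Count) into :n from Color;
quit;

/* Sketch: accumulate sum((observed - expected)**2 / expected). */
data _null_;
   merge Color true_vals(rename=(Hair=Hair_p)) end=last;  /* pair by position */
   Expected = p * &n;                          /* frequency expected under the null */
   ChiSq + (Count - Expected)**2 / Expected;   /* running Pearson chi-square */
   df + 1;                                     /* count the levels */
   if last then do;
      df = df - 1;                             /* degrees of freedom: levels minus one */
      pvalue = 1 - probchi(ChiSq, df);         /* upper-tail probability */
      put ChiSq= 8.4 df= pvalue= 8.4;
   end;
run;
```

The p-value written to the log should agree with the value of 0.1008 reported by PROC FREQ. The `Expected` values computed here are the null hypothesis frequencies, which is the form that the TESTF= option takes in place of proportions.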
| Product Family | Product | System | SAS Release Reported | SAS Release Fixed* |
| SAS System | SAS/STAT | z/OS | | |
| | | OpenVMS VAX | | |
| | | Microsoft® Windows® for 64-Bit Itanium-based Systems | | |
| | | Microsoft Windows Server 2003 Datacenter 64-bit Edition | | |
| | | Microsoft Windows Server 2003 Enterprise 64-bit Edition | | |
| | | Microsoft Windows XP 64-bit Edition | | |
| | | Microsoft® Windows® for x64 | | |
| | | OS/2 | | |
| | | Microsoft Windows 7 | | |
| | | Microsoft Windows 95/98 | | |
| | | Microsoft Windows 2000 Advanced Server | | |
| | | Microsoft Windows 2000 Datacenter Server | | |
| | | Microsoft Windows 2000 Server | | |
| | | Microsoft Windows 2000 Professional | | |
| | | Microsoft Windows NT Workstation | | |
| | | Microsoft Windows Server 2003 Datacenter Edition | | |
| | | Microsoft Windows Server 2003 Enterprise Edition | | |
| | | Microsoft Windows Server 2003 Standard Edition | | |
| | | Microsoft Windows Server 2008 | | |
| | | Microsoft Windows XP Professional | | |
| | | Windows Millennium Edition (Me) | | |
| | | Windows Vista | | |
| | | 64-bit Enabled AIX | | |
| | | 64-bit Enabled HP-UX | | |
| | | 64-bit Enabled Solaris | | |
| | | ABI+ for Intel Architecture | | |
| | | AIX | | |
| | | HP-UX | | |
| | | HP-UX IPF | | |
| | | IRIX | | |
| | | Linux | | |
| | | Linux for x64 | | |
| | | Linux on Itanium | | |
| | | OpenVMS Alpha | | |
| | | OpenVMS on HP Integrity | | |
| | | Solaris | | |
| | | Solaris for x64 | | |
| | | Tru64 UNIX | | |
| Type: | Usage Note |
| Priority: | |
| Topic: | Analytics ==> Categorical Data Analysis; SAS Reference ==> Procedures ==> FREQ |
| Date Modified: | 2009-12-07 10:08:55 |
| Date Created: | 2009-12-01 13:07:55 |


