PROC UNIVARIATE provides three tests for location: Student’s $t$ test, the sign test, and the Wilcoxon signed rank test. All three tests produce a test statistic for the null hypothesis that the mean or median is equal to a given value $\mu_0$ against the two-sided alternative that the mean or median is not equal to $\mu_0$. By default, PROC UNIVARIATE sets the value of $\mu_0$ to zero. You can use the MU0= option in the PROC UNIVARIATE statement to specify the value of $\mu_0$. Student’s $t$ test is appropriate when the data are from an approximately normal population; otherwise, use nonparametric tests such as the sign test or the signed rank test. For large sample situations, the $t$ test is asymptotically equivalent to a $z$ test. If you use the WEIGHT statement, PROC UNIVARIATE computes only one weighted test for location, the $t$ test. You must use the default value for the VARDEF= option in the PROC statement (VARDEF=DF). See Example 4.12.
You can also use these tests to compare means or medians of paired data. Data are said to be paired when subjects or units are matched in pairs according to one or more variables, such as pairs of subjects with the same age and gender. Paired data also occur when each subject or unit is measured at two times or under two conditions. To compare the means or medians of the two times, create an analysis variable that is the difference between the two measures. The test that the mean or the median difference of the variables equals zero is equivalent to the test that the means or medians of the two original variables are equal. Note that you can also carry out these tests by using the PAIRED statement in the TTEST procedure; see Chapter 99: The TTEST Procedure in SAS/STAT 12.1 User's Guide. Also see Example 4.13.
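As a small illustration of the difference variable described above, the following Python sketch builds the paired differences and checks that their mean equals the difference of the two means. The data values and the names `before`, `after`, and `diffs` are hypothetical, chosen only for this example; it is not SAS code and does not reproduce PROC UNIVARIATE output.

```python
import math
from statistics import mean

# Hypothetical paired measurements on the same four subjects.
before = [120.0, 135.0, 128.0, 142.0]
after = [115.0, 130.0, 129.0, 136.0]

# Analysis variable: the difference between the two measures for each subject.
diffs = [a - b for b, a in zip(before, after)]

# The mean of the differences equals the difference of the means, so a test
# that the mean difference is zero compares the two original means.
mean_diff = mean(diffs)
diff_of_means = mean(after) - mean(before)
```

The list `diffs` is what you would pass to a one-sample location test with $\mu_0 = 0$.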
PROC UNIVARIATE calculates the $t$ statistic as
$$ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$
where $\bar{x}$ is the sample mean, $n$ is the number of nonmissing values for a variable, and $s$ is the sample standard deviation. The null hypothesis is that the population mean equals $\mu_0$. When the data values are approximately normally distributed, the probability under the null hypothesis of a $t$ statistic that is as extreme, or more extreme, than the observed value (the $p$-value) is obtained from the $t$ distribution with $n - 1$ degrees of freedom. For large $n$, the $t$ statistic is asymptotically equivalent to a $z$ test. When you use the WEIGHT statement and the default value of VARDEF=, which is DF, the $t$ statistic is calculated as
$$ t_w = \frac{\bar{x}_w - \mu_0}{s_w / \sqrt{\sum_{i=1}^{n} w_i}} $$
where $\bar{x}_w$ is the weighted mean, $s_w$ is the weighted standard deviation, and $w_i$ is the weight for the $i$th observation. The $t_w$ statistic is treated as having a Student’s $t$ distribution with $n - 1$ degrees of freedom. If you specify the EXCLNPWGT option in the PROC statement, $n$ is the number of nonmissing observations when the value of the WEIGHT variable is positive. By default, $n$ is the number of nonmissing observations for the WEIGHT variable.
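The two $t$ statistics can be sketched in Python as follows. This is a minimal illustration, not the procedure's implementation: the function names are hypothetical, and the weighted variance is assumed to be $\sum_i w_i (x_i - \bar{x}_w)^2 / (n - 1)$, which is the VARDEF=DF convention. The $p$-value lookup from the $t$ distribution is omitted.

```python
import math

def t_statistic(values, mu0=0.0):
    """Unweighted statistic: (mean - mu0) / (s / sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance with divisor n - 1.
    var = sum((x - mean) ** 2 for x in values) / (n - 1)
    return (mean - mu0) / math.sqrt(var / n)

def weighted_t_statistic(values, weights, mu0=0.0):
    """Weighted statistic: (mean_w - mu0) / (s_w / sqrt(sum of weights))."""
    n = len(values)
    sum_w = sum(weights)
    mean_w = sum(w * x for x, w in zip(values, weights)) / sum_w
    # Assumed weighted variance (VARDEF=DF): weighted sum of squares / (n - 1).
    var_w = sum(w * (x - mean_w) ** 2 for x, w in zip(values, weights)) / (n - 1)
    return (mean_w - mu0) / math.sqrt(var_w / sum_w)

t = t_statistic([1, 2, 3, 4, 5], mu0=2)
t_w = weighted_t_statistic([1, 2, 3, 4, 5], [1, 1, 1, 1, 1], mu0=2)
```

With all weights equal to one, the weighted statistic reduces to the unweighted one, which is a quick sanity check on the formulas.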
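PROC UNIVARIATE calculates the sign test statistic as
$$ M = \frac{n^+ - n^-}{2} $$
where $n^+$ is the number of values that are greater than $\mu_0$, and $n^-$ is the number of values that are less than $\mu_0$. Values equal to $\mu_0$ are discarded. Under the null hypothesis that the population median is equal to $\mu_0$, the $p$-value for the observed statistic $M_{\mathrm{obs}}$ is
$$ \Pr\left(|M| \ge |M_{\mathrm{obs}}|\right) = 0.5^{\,n_t - 1} \sum_{j=0}^{\min(n^+,\, n^-)} \binom{n_t}{j} $$
where $n_t = n^+ + n^-$ is the number of $x_i$ values not equal to $\mu_0$.
Note: If $n^+$ and $n^-$ are equal, the $p$-value is equal to one.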
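The sign test computation can be sketched in Python as shown below. The function name `sign_test` is hypothetical, and the cap at one reflects the note about equal counts; treat this as an illustration of the formula rather than the procedure's code.

```python
import math

def sign_test(values, mu0=0.0):
    """Return (M, p) for the two-sided sign test of median == mu0."""
    n_plus = sum(1 for x in values if x > mu0)
    n_minus = sum(1 for x in values if x < mu0)
    n_t = n_plus + n_minus            # values equal to mu0 are discarded
    m = (n_plus - n_minus) / 2
    if n_t == 0:
        return m, 1.0                 # no informative observations
    # Binomial tail sum for j = 0 .. min(n_plus, n_minus).
    tail = sum(math.comb(n_t, j) for j in range(min(n_plus, n_minus) + 1))
    # The raw expression can exceed one when the counts are equal; cap it.
    p = min(1.0, tail * 0.5 ** (n_t - 1))
    return m, p

m, p = sign_test([1, 2, 3, 4, -1])
```

For this input, four values lie above zero and one below, so $M = 1.5$ and the $p$-value is $6 \times 0.5^4 = 0.375$.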
The signed rank statistic $S$ is computed as
$$ S = \sum_{i:\, x_i > \mu_0} r_i^+ - \frac{n_t (n_t + 1)}{4} $$
where $r_i^+$ is the rank of $|x_i - \mu_0|$ after discarding values of $x_i = \mu_0$, and $n_t$ is the number of $x_i$ values not equal to $\mu_0$. Average ranks are used for tied values.
If $n_t \le 20$, the significance of $S$ is computed from the exact distribution of $S$, where the distribution is a convolution of scaled binomial distributions. When $n_t > 20$, the significance of $S$ is computed by treating
$$ S \sqrt{\frac{n_t - 1}{n_t V - S^2}} $$
as a Student’s $t$ variate with $n_t - 1$ degrees of freedom. $V$ is computed as
$$ V = \frac{1}{24}\, n_t (n_t + 1)(2 n_t + 1) - \frac{1}{48} \sum_i t_i (t_i + 1)(t_i - 1) $$
where the sum is over groups tied in absolute value and where $t_i$ is the number of values in the $i$th group (Iman, 1974; Conover, 1999). The null hypothesis tested is that the mean (or median) is $\mu_0$, assuming that the distribution is symmetric. Refer to Lehmann (1998).
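The following Python sketch computes $S$, $V$, and the $t$ approximation from the formulas above. The function names are hypothetical, and the code does not implement the exact small-sample distribution or the $p$-value lookup; it only illustrates the arithmetic, including average ranks for ties.

```python
import math
from collections import Counter

def signed_rank(values, mu0=0.0):
    """Return (S, V, n_t) for the signed rank test of location mu0."""
    diffs = [x - mu0 for x in values if x != mu0]   # discard values equal to mu0
    n_t = len(diffs)
    abs_sorted = sorted(abs(d) for d in diffs)
    # Assign average ranks to groups of tied absolute values.
    rank_of = {}
    i = 0
    while i < n_t:
        j = i
        while j < n_t and abs_sorted[j] == abs_sorted[i]:
            j += 1
        rank_of[abs_sorted[i]] = (i + 1 + j) / 2    # mean of ranks i+1 .. j
        i = j
    s = sum(rank_of[abs(d)] for d in diffs if d > 0) - n_t * (n_t + 1) / 4
    # Tie correction: t is the size of each group tied in absolute value.
    ties = Counter(abs(d) for d in diffs)
    v = (n_t * (n_t + 1) * (2 * n_t + 1) / 24
         - sum(t * (t + 1) * (t - 1) for t in ties.values()) / 48)
    return s, v, n_t

def t_approximation(s, v, n_t):
    """Large-sample variate, referred to a t distribution with n_t - 1 df."""
    return s * math.sqrt((n_t - 1) / (n_t * v - s * s))

s, v, n_t = signed_rank([1, -2, 3, 4, 5])
s_tied, v_tied, n_tied = signed_rank([1, -1, 2])
```

Groups of size one contribute zero to the tie correction, so $V$ reduces to $n_t (n_t + 1)(2 n_t + 1) / 24$ when there are no ties.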