PROC UNIVARIATE provides three tests for location: Student’s t test, the sign test, and the Wilcoxon signed rank test. All three tests produce a test statistic for the null hypothesis that the mean or median is equal to a given value $\mu_0$ against the two-sided alternative that the mean or median is not equal to $\mu_0$. By default, PROC UNIVARIATE sets the value of $\mu_0$ to zero. You can use the MU0= option in the PROC UNIVARIATE statement to specify the value of $\mu_0$. Student’s t test is appropriate when the data are from an approximately normal population; otherwise, use nonparametric tests such as the sign test or the signed rank test. For large sample situations, the t test is asymptotically equivalent to a z test. If you use the WEIGHT statement, PROC UNIVARIATE computes only one weighted test for location, the t test. You must use the default value for the VARDEF= option in the PROC statement (VARDEF=DF). See Example 4.12.
You can also use these tests to compare means or medians of paired data. Data are said to be paired when subjects or units are matched in pairs according to one or more variables, such as pairs of subjects with the same age and gender. Paired data also occur when each subject or unit is measured at two times or under two conditions. To compare the means or medians of the two times, create an analysis variable that is the difference between the two measures. The test that the mean or the median difference of the variables equals zero is equivalent to the test that the means or medians of the two original variables are equal. Note that you can also carry out these tests by using the PAIRED statement in the TTEST procedure; see Chapter 119: The TTEST Procedure in SAS/STAT 14.1 User's Guide. Also see Example 4.13.
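As a rough illustration of the paired-data approach, the following Python sketch builds the difference variable and computes a one-sample t statistic on it against zero. The data values and the helper name `t_statistic` are hypothetical, and the code only mirrors the arithmetic; it is not PROC UNIVARIATE output.

```python
import math
import statistics

# Hypothetical paired measurements: the same five subjects at two times.
before = [12.0, 15.0, 11.0, 14.0, 13.0]
after = [14.0, 15.0, 14.0, 17.0, 15.0]

# Analysis variable: the within-pair difference.
diff = [a - b for a, b in zip(after, before)]


def t_statistic(values, mu0=0.0):
    """One-sample t statistic: (mean - mu0) / (s / sqrt(n))."""
    n = len(values)
    s = statistics.stdev(values)  # sample standard deviation, divisor n - 1
    return (statistics.mean(values) - mu0) / (s / math.sqrt(n))


# The mean of the differences equals the difference of the two means,
# which is why testing "mean difference = 0" tests "means are equal".
mean_diff = statistics.mean(diff)
mean_gap = statistics.mean(after) - statistics.mean(before)

t_paired = t_statistic(diff, mu0=0.0)
print(mean_diff, mean_gap, round(t_paired, 4))
```

The same difference variable can be passed to the sign test or the signed rank test when normality of the differences is doubtful.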
PROC UNIVARIATE calculates the t statistic as
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
where $\bar{x}$ is the sample mean, $n$ is the number of nonmissing values for a variable, and $s$ is the sample standard deviation. The null hypothesis is that the population mean equals $\mu_0$. When the data values are approximately normally distributed, the probability under the null hypothesis of a t statistic at least as extreme as the observed value (the p-value) is obtained from the t distribution with $n - 1$ degrees of freedom. For large $n$, the t test is asymptotically equivalent to a z test. When you use the WEIGHT statement and the default value of VARDEF=, which is DF, the t statistic is calculated as
$$t_w = \frac{\bar{x}_w - \mu_0}{s_w / \sqrt{\sum_{i=1}^{n} w_i}}$$
where $\bar{x}_w$ is the weighted mean, $s_w$ is the weighted standard deviation, and $w_i$ is the weight for the ith observation. The $t_w$ statistic is treated as having a Student’s t distribution with $n - 1$ degrees of freedom. If you specify the EXCLNPWGT option in the PROC statement, $n$ is the number of nonmissing observations when the value of the WEIGHT variable is positive. By default, $n$ is the number of nonmissing observations for the WEIGHT variable.
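The weighted t statistic can be sketched in Python as follows. The sketch divides the weighted mean minus the hypothesized value by the weighted standard deviation over the square root of the sum of weights. It assumes that all weights are positive and that, under VARDEF=DF, the weighted variance is the weighted sum of squared deviations divided by n − 1; the function name `weighted_t` and the data are hypothetical.

```python
import math


def weighted_t(x, w, mu0=0.0):
    """Weighted one-sample t statistic (assumes VARDEF=DF, positive weights)."""
    n = len(x)
    sum_w = sum(w)
    # Weighted mean.
    xbar_w = sum(wi * xi for wi, xi in zip(w, x)) / sum_w
    # Assumed weighted variance: weighted sum of squares over n - 1.
    var_w = sum(wi * (xi - xbar_w) ** 2 for wi, xi in zip(w, x)) / (n - 1)
    s_w = math.sqrt(var_w)
    return (xbar_w - mu0) / (s_w / math.sqrt(sum_w))


x = [2.0, 0.0, 3.0, 3.0, 2.0]

# With every weight equal to 1 the statistic reduces to the ordinary t.
t_unit = weighted_t(x, [1.0] * len(x))

# Doubling the weight of the second observation pulls the mean toward 0.
t_w = weighted_t(x, [1.0, 2.0, 1.0, 1.0, 1.0])
print(round(t_unit, 4), round(t_w, 4))
```

Checking the unit-weight case against the unweighted formula is a quick way to confirm that the two expressions agree.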
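PROC UNIVARIATE calculates the sign test statistic as
$$M = \frac{n^+ - n^-}{2}$$
where $n^+$ is the number of values that are greater than $\mu_0$, and $n^-$ is the number of values that are less than $\mu_0$. Values equal to $\mu_0$ are discarded. Under the null hypothesis that the population median is equal to $\mu_0$, the p-value for the observed statistic $M_{obs}$ is
$$\Pr(|M| \ge |M_{obs}|) = 0.5^{(n_t - 1)} \sum_{j=0}^{\min(n^+, n^-)} \binom{n_t}{j}$$
where $n_t = n^+ + n^-$ is the number of $x_i$ values not equal to $\mu_0$.
Note: If $n^+$ and $n^-$ are equal, the p-value is equal to one.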
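A minimal Python sketch of the sign test arithmetic follows: count the values above and below the hypothesized median, halve the difference of the counts, and sum binomial terms up to the smaller count for the two-sided p-value. The function name `sign_test` and the sample values are hypothetical. Capping the p-value at one is this sketch's way of handling the case where the two counts are equal; it is not taken from the SAS implementation.

```python
import math


def sign_test(x, mu0=0.0):
    """Return (M, p_value) for the two-sided sign test of median = mu0."""
    n_plus = sum(1 for v in x if v > mu0)
    n_minus = sum(1 for v in x if v < mu0)
    n_t = n_plus + n_minus  # values equal to mu0 are discarded
    m = (n_plus - n_minus) / 2
    # Two-sided binomial tail: 0.5^(n_t - 1) * sum_{j <= min} C(n_t, j).
    tail = sum(math.comb(n_t, j) for j in range(min(n_plus, n_minus) + 1))
    # The raw expression exceeds 1 when n_plus == n_minus, so cap it at 1.
    p_value = min(1.0, 0.5 ** (n_t - 1) * tail)
    return m, p_value


# Four values above zero, one below, one tie with mu0 (discarded).
m, p = sign_test([2.0, 0.0, 3.0, 3.0, 2.0, -1.0], mu0=0.0)
print(m, p)

# Balanced counts give a p-value of one.
m_bal, p_bal = sign_test([1.0, -1.0], mu0=0.0)
print(m_bal, p_bal)
```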
The signed rank statistic $S$ is computed as
$$S = \sum_{i:\, x_i > \mu_0} r_i^+ - \frac{n_t (n_t + 1)}{4}$$
where $r_i^+$ is the rank of $|x_i - \mu_0|$ after discarding values of $x_i = \mu_0$, and $n_t$ is the number of $x_i$ values not equal to $\mu_0$. Average ranks are used for tied values.
If $n_t \le 20$, the significance of $S$ is computed from the exact distribution of $S$, where the distribution is a convolution of scaled binomial distributions. When $n_t > 20$, the significance of $S$ is computed by treating
$$S \sqrt{\frac{n_t - 1}{n_t V - S^2}}$$
as a Student’s t variate with $n_t - 1$ degrees of freedom. $V$ is computed as
$$V = \frac{1}{24} n_t (n_t + 1)(2 n_t + 1) - \frac{1}{48} \sum_i t_i (t_i + 1)(t_i - 1)$$
where the sum is over groups tied in absolute value and where $t_i$ is the number of values in the ith group (Iman 1974; Conover 1980). The null hypothesis tested is that the mean (or median) is $\mu_0$, assuming that the distribution is symmetric. See Lehmann and D’Abrera (1975).
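The signed rank computation can be sketched in Python as follows. The sketch ranks the absolute deviations with average ranks for ties, sums the ranks of the positive deviations, centers by n_t(n_t + 1)/4, and computes the tie-corrected V together with the t-approximation statistic. The function names and data are hypothetical. The example uses a small sample only to keep the arithmetic checkable; the t approximation is meant for larger samples, and the exact distribution used for small samples is not implemented here.

```python
import math


def signed_rank(x, mu0=0.0):
    """Return (S, V, n_t) for the Wilcoxon signed rank test of location mu0."""
    d = [v - mu0 for v in x if v != mu0]  # discard values equal to mu0
    n_t = len(d)
    order = sorted(range(n_t), key=lambda i: abs(d[i]))
    ranks = [0.0] * n_t
    tie_sizes = []
    i = 0
    while i < n_t:
        # Extend j over the run of equal absolute deviations.
        j = i
        while j + 1 < n_t and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    s = sum(r for r, v in zip(ranks, d) if v > 0) - n_t * (n_t + 1) / 4
    v = (n_t * (n_t + 1) * (2 * n_t + 1) / 24
         - sum(t * (t + 1) * (t - 1) for t in tie_sizes) / 48)
    return s, v, n_t


def t_approximation(s, v, n_t):
    """Statistic referred to a t distribution with n_t - 1 degrees of freedom."""
    return s * math.sqrt((n_t - 1) / (n_t * v - s ** 2))


s, v, n_t = signed_rank([2.0, 0.0, 3.0, 3.0, 2.0, -1.0], mu0=0.0)
t_s = t_approximation(s, v, n_t)
print(s, v, n_t, round(t_s, 4))
```

Groups of size one contribute zero to the tie correction, so V reduces to n_t(n_t + 1)(2n_t + 1)/24 when there are no ties.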