The CORR Procedure

Kendall’s Tau-b Correlation Coefficient


Kendall’s tau-b is a nonparametric measure of association based on the number of concordances and discordances in paired observations. Concordance occurs when paired observations vary together, and discordance occurs when paired observations vary differently. The formula for Kendall’s tau-b is

$\tau = \frac{\sum_{i<j} \, (\mathrm{sgn}(x_i-x_j) \, \mathrm{sgn}(y_i-y_j))}{\sqrt{(T_0-T_1)(T_0-T_2)}}$

where $T_0 = n(n-1)/2$, $T_1 = \sum_k \, t_k(t_k-1)/2$, and $T_2 = \sum_l \, u_l(u_l-1)/2$. Here $t_k$ is the number of tied $x$ values in the $k$th group of tied $x$ values, $u_l$ is the number of tied $y$ values in the $l$th group of tied $y$ values, $n$ is the number of observations, and $\mathrm{sgn}(z)$ is defined as

$\mathrm{sgn}(z) = \left\{ \begin{array}{ll} 1 & \mathrm{if} \, \, z > 0 \\ 0 & \mathrm{if} \, \, z = 0 \\ -1 & \mathrm{if} \, \, z < 0 \end{array} \right.$
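The definition can be checked on small data with a direct pairwise computation. The following Python sketch is illustrative only and is not how PROC CORR is implemented; the function names `sgn` and `kendall_tau_b` are hypothetical, and the code assumes that neither variable is constant (otherwise the denominator is zero).

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def sgn(z):
    """Return 1 if z > 0, 0 if z == 0, and -1 if z < 0."""
    return (z > 0) - (z < 0)


def kendall_tau_b(x, y):
    """Kendall's tau-b from the pairwise definition (O(n^2) pairs)."""
    n = len(x)
    # Numerator: sum over all pairs i < j of sgn(x_i - x_j) * sgn(y_i - y_j).
    s = sum(sgn(x[i] - x[j]) * sgn(y[i] - y[j])
            for i, j in combinations(range(n), 2))
    # T0 counts all pairs; T1 and T2 count pairs tied in x and in y.
    t0 = n * (n - 1) / 2
    t1 = sum(t * (t - 1) / 2 for t in Counter(x).values())
    t2 = sum(u * (u - 1) / 2 for u in Counter(y).values())
    # Assumption: x and y are not constant, so the product is positive.
    return s / sqrt((t0 - t1) * (t0 - t2))
```

For example, with `x = [1, 2, 2, 3]` and `y = [1, 3, 2, 3]` the numerator is 4, $T_0 = 6$, and $T_1 = T_2 = 1$, so the sketch returns 4/5 = 0.8.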

PROC CORR computes Kendall’s tau-b by ranking the data and using a method similar to Knight (1966). The data are double sorted by ranking observations according to values of the first variable and reranking the observations according to values of the second variable. PROC CORR computes Kendall’s tau-b from the number of interchanges of the first variable and corrects for tied pairs (pairs of observations with equal values of X or equal values of Y).
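The double-sort idea can be sketched in Python as follows. This is an illustration in the spirit of a Knight-style algorithm, not the PROC CORR source: the helper names (`count_interchanges`, `tied_pairs`, `kendall_tau_b_sorted`) are hypothetical, the interchange count is obtained with a merge sort, and the tie correction shown is one standard way to recover the numerator from that count.

```python
from collections import Counter
from math import sqrt


def count_interchanges(seq):
    """Merge sort seq; return (sorted list, number of strict inversions)."""
    if len(seq) <= 1:
        return list(seq), 0
    mid = len(seq) // 2
    left, a = count_interchanges(seq[:mid])
    right, b = count_interchanges(seq[mid:])
    merged, swaps = [], a + b
    i = j = 0
    while i < len(left) and j < len(right):
        if right[j] < left[i]:
            # right[j] moves ahead of every element still waiting in left.
            swaps += len(left) - i
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, swaps


def tied_pairs(values):
    """Number of pairs of equal values: sum of c(c-1)/2 over tie groups."""
    return sum(c * (c - 1) // 2 for c in Counter(values).values())


def kendall_tau_b_sorted(x, y):
    """Tau-b via sorting on x, then counting interchanges needed to sort y."""
    n = len(x)
    pairs = sorted(zip(x, y))            # order by x, break x ties by y
    y_in_x_order = [p[1] for p in pairs]
    _, discordant = count_interchanges(y_in_x_order)
    t0 = n * (n - 1) // 2
    t1 = tied_pairs(x)                   # pairs tied in x
    t2 = tied_pairs(y)                   # pairs tied in y
    t12 = tied_pairs(pairs)              # pairs tied in both x and y
    # concordant - discordant, with concordant = t0 - t1 - t2 + t12 - discordant
    s = t0 - t1 - t2 + t12 - 2 * discordant
    # Assumption: x and y are not constant, so the product is positive.
    return s / sqrt((t0 - t1) * (t0 - t2))
```

Because x ties are broken by y before counting, pairs tied in x contribute no interchanges, so the count equals the number of discordant pairs. The sort-based route takes O(n log n) time, compared with O(n^2) for enumerating all pairs.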

Probability Values

Probability values for Kendall’s tau-b are computed by treating

$\frac{s}{\sqrt {V(s)}}$

as coming from a standard normal distribution where

$s=\sum_{i<j} \, (\mathrm{sgn}(x_i-x_j) \, \mathrm{sgn}(y_i-y_j))$

and $V(s)$, the variance of $s$, is computed as

$V(s)=\frac{v_0-v_ t-v_ u}{18}+\frac{v_1}{2n(n-1)}+\frac{v_2}{9n(n-1)(n-2)}$

where

$v_0 = n(n-1)(2n+5)$

$v_t = \sum_k \, t_k(t_k-1)(2t_k+5)$

$v_u = \sum_l \, u_l(u_l-1)(2u_l+5)$

$v_1 = (\sum_k \, t_k(t_k-1)) \, (\sum_l \, u_l(u_l-1))$

$v_2 = (\sum_k \, t_k(t_k-1)(t_k-2)) \, (\sum_l \, u_l(u_l-1)(u_l-2))$

The sums are over tied groups of values, where $t_k$ is the number of tied $x$ values in the $k$th group and $u_l$ is the number of tied $y$ values in the $l$th group (Noether, 1967). The sampling distribution of Kendall’s partial tau-b is unknown; therefore, probability values for it are not available.
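The variance formula and the normal approximation can be sketched in Python as follows. This is a hedged illustration, not the PROC CORR implementation: the function names are hypothetical, the code assumes at least three observations (the last term divides by $n-2$) and a nonzero variance, and it assumes a two-sided probability value.

```python
from collections import Counter
from itertools import combinations
from math import erfc, sqrt


def sgn(z):
    """Return 1 if z > 0, 0 if z == 0, and -1 if z < 0."""
    return (z > 0) - (z < 0)


def kendall_s_variance(x, y):
    """V(s) with tie corrections; t and u hold the tie-group sizes."""
    n = len(x)
    t = list(Counter(x).values())   # sizes of groups of tied x values
    u = list(Counter(y).values())   # sizes of groups of tied y values
    v0 = n * (n - 1) * (2 * n + 5)
    vt = sum(tk * (tk - 1) * (2 * tk + 5) for tk in t)
    vu = sum(ul * (ul - 1) * (2 * ul + 5) for ul in u)
    v1 = sum(tk * (tk - 1) for tk in t) * sum(ul * (ul - 1) for ul in u)
    v2 = (sum(tk * (tk - 1) * (tk - 2) for tk in t)
          * sum(ul * (ul - 1) * (ul - 2) for ul in u))
    # Assumption: n >= 3, so none of the denominators is zero.
    return ((v0 - vt - vu) / 18
            + v1 / (2 * n * (n - 1))
            + v2 / (9 * n * (n - 1) * (n - 2)))


def kendall_p_value(x, y):
    """Return (z, p) where z = s / sqrt(V(s)) and p is two-sided (assumed)."""
    s = sum(sgn(x[i] - x[j]) * sgn(y[i] - y[j])
            for i, j in combinations(range(len(x)), 2))
    z = s / sqrt(kendall_s_variance(x, y))
    # Two-sided tail area of the standard normal: P(|Z| > |z|).
    p = erfc(abs(z) / sqrt(2))
    return z, p
```

With no ties, groups of size one contribute zero to every sum, so $V(s)$ reduces to $n(n-1)(2n+5)/18$. For five untied, perfectly concordant observations this gives $s = 10$ and $V(s) = 300/18$.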