The CLUSTER Procedure

Ultrametrics

A dissimilarity measure $d(x,y)$ is called an ultrametric if it satisfies the following conditions:

  • $d(x,x) = 0$ for all x

  • $d(x,y) \geq 0$ for all x, y

  • $d(x,y) = d(y,x)$ for all x, y

  • $d(x,y) \leq \max \left( d(x,z) , d(y,z) \right)$ for all x, y, and z

Any hierarchical clustering method induces a dissimilarity measure on the observations—say, $h(x_ i,x_ j)$. Let $C_ M$ be the cluster with the fewest members that contains both $x_ i$ and $x_ j$. Assume $C_ M$ was formed by joining $C_ K$ and $C_ L$. Then define $h(x_ i,x_ j) = D_{KL}$.

If the fusion of $C_ K$ and $C_ L$ reduces the number of clusters from g to $g - 1$, then define $D_{(g)} = D_{KL}$. Johnson (1967) shows that if

\[ 0 \leq D_{(n)} \leq D_{(n-1)} \leq \cdots \leq D_{(2)} \]

then $h(\cdot ,\cdot )$ is an ultrametric. A method that always satisfies this condition is said to be a monotonic or ultrametric clustering method. All methods implemented in PROC CLUSTER except CENTROID, EML, and MEDIAN are ultrametric (Milligan 1979; Batagelj 1981).