
Hierarchical clustering

Linkage distances

Consider the following two sets of points: $S_1 = \{1,2,3\}$ and $S_2 = \{4,5,6\}$. What is the min linkage (aka single linkage) distance between $S_1$ and $S_2$? What is the max linkage (aka complete linkage) distance between $S_1$ and $S_2$?
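As a reminder of the two definitions, here is a minimal sketch. It assumes the points lie on the real line and that the distance between two points is their absolute difference, which the exercise does not state explicitly. The function names and the sets `A` and `B` are hypothetical and deliberately differ from $S_1$ and $S_2$.

```python
def single_linkage(a, b):
    """Min linkage: the smallest distance between a point of a and a point of b."""
    # Assumption: 1-D points, with |x - y| as the distance.
    return min(abs(x - y) for x in a for y in b)


def complete_linkage(a, b):
    """Max linkage: the largest distance between a point of a and a point of b."""
    return max(abs(x - y) for x in a for y in b)


# Hypothetical sets, not the ones from the exercise.
A = {0, 2}
B = {5, 9}

print(single_linkage(A, B))    # the closest pair is (2, 5)
print(complete_linkage(A, B))  # the farthest pair is (0, 9)
```

Both functions scan every cross-set pair, which is fine for small sets like these.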

Leaf order

When we make a dendrogram, there is a leaf node for each of the original data points. These leaf nodes are placed along the horizontal axis in a special order. True or false: given a particular choice of linkage distance, there is typically only one way the leaves can be ordered.

By hand

Consider this dissimilarity matrix.

$$\Delta=\begin{pmatrix} 0 & 1 & 4 & 2 \\ 1 & 0 & 3 & 5 \\ 4 & 3 & 0 & 10 \\ 2 & 5 & 10 & 0 \end{pmatrix}$$

Perform agglomerative clustering by hand using min linkage. Sketch the dendrogram (you can upload a picture of something sketched on paper). Be sure to clearly indicate the height of each merge.
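If you want to check a hand computation afterwards, the procedure can be sketched in a few lines of plain Python. This is an illustrative sketch, not a reference implementation: the function name `single_linkage_merges` and the 3-point matrix `example` are hypothetical, and ties between equally close pairs are broken by whichever pair is found first.

```python
from itertools import combinations


def single_linkage_merges(dissim):
    """Return the merges made by min-linkage agglomerative clustering.

    dissim is a symmetric square matrix (list of lists) of dissimilarities.
    Each merge is (cluster_a, cluster_b, height), where clusters are tuples
    of 0-based point indices.
    """
    def link(pair):
        # Min linkage: the smallest dissimilarity across the two clusters.
        a, b = pair
        return min(dissim[i][j] for i in a for j in b)

    clusters = [(i,) for i in range(len(dissim))]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of current clusters and merge it.
        a, b = min(combinations(clusters, 2), key=link)
        merges.append((a, b, link((a, b))))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return merges


# Hypothetical 3-point matrix, not the one from the exercise.
example = [
    [0, 2, 6],
    [2, 0, 5],
    [6, 5, 0],
]

merges = single_linkage_merges(example)
for a, b, height in merges:
    print(a, b, height)
```

Each recorded height is the value at which the corresponding merge would be drawn in the dendrogram.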