Skip to main content
  1. Posts/

Spectral Clustering: How Eigenvectors Reveal Complex Cluster Structures

··255 words·2 mins·

🔵 K-means fails with curved clusters. Spectral Clustering solves them using eigenvectors of the similarity graph.

The problem with K-means:

On moon-shaped data (or any non-linear structure), K-means incorrectly groups points because it assumes spherical clusters.

The Spectral Clustering workflow:

Dataset → Similarity Graph → Laplacian → Eigenvectors → Clusters

The 7 steps:

  1. Get data
  2. Similarity matrix (W) → measures similarity between each pair of points (Gaussian kernel with gamma parameter)
  3. Degree matrix (D) → diagonal: sum of similarities of each point with all others
  4. Laplacian (L = D − W) → captures the structure of the graph
  5. Eigendecomposition of Leigenvalues, eigenvectors = np.linalg.eigh(L)
  6. Select the k smallest eigenvectors → correspond to k clusters (eigengap heuristic: pick where the largest jump occurs between successive eigenvalues)
  7. Apply K-means on the spectral embedding space → now clusters are linearly separable

Why it works:

The eigenvectors of the Laplacian transform data into a new space where strongly-connected points end up close together. K-means works there, even when it fails in the original space.

💡 Explanation in a nutshell
#

Spectral Clustering doesn’t group by Euclidean distance but by connectivity in a similarity graph. The elegant trick: the Laplacian’s eigenvectors act as new features where complex structures become linearly separable. The combination of linear algebra + K-means allows detecting clusters that no distance-based method could find.

More information at the link 👇

Also published on LinkedIn.
Juan Pedro Bretti Mandarano
Author
Juan Pedro Bretti Mandarano