Asymptotic Theory of Geometric and Adaptive $k$-Means Clustering
Top Authors
Abstract
We revisit Pollard's classical result on consistency for $k$-means clustering in Euclidean space, with a focus on extensions in two directions: first, to problems where the data may come from interesting geometric settings (e.g., Riemannian manifolds, reflexive Banach spaces, or the Wasserstein space); second, to problems where some parameters are chosen adaptively from the data (e.g., $k$-medoids or elbow-method $k$-means). Towards this end, we provide a general theory which shows that all clustering procedures described above are strongly consistent. In fact, our method of proof allows us to derive many asymptotic limit theorems beyond strong consistency. We also remove all assumptions about uniqueness of the set of optimal cluster centers.