I Am Big, You Are Little; I Am Right, You Are Wrong

3 citations

arXiv:2507.23509

#875 in ICCV 2025 of 2701 papers

Top Authors

David A Kelly, Akchunya Chanchal, Nathan Blake

Topics

image classification, model interpretability, minimal sufficient pixels, decision-making process, vision model analysis, pixel set concentration, model architecture comparison, misclassification analysis

Abstract

Machine learning for image classification is an active and rapidly developing field. With the proliferation of classifiers of different sizes and architectures, choosing the right model becomes increasingly important. While we can assess a model's classification accuracy statistically, our understanding of how these models work is unfortunately limited. To gain insight into the decision-making process of different vision models, we propose using minimal sufficient pixel sets to gauge a model's 'concentration': the pixels that capture the essence of an image through the lens of the model. By comparing the position, overlap, and size of these pixel sets, we find that different architectures have statistically different concentration, in both size and position. In particular, ConvNext and EVA models differ markedly from the others. We also find that misclassified images are associated with larger pixel sets than correctly classified ones.
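To make the three comparison axes named in the abstract (size, position, overlap) concrete, here is a minimal sketch that assumes each pixel set is already available as a Python set of (row, column) coordinates. The function names, the use of the centroid as a position summary, and the use of intersection-over-union as the overlap measure are illustrative assumptions, not the authors' method. How the minimal sufficient pixel sets themselves are computed is not covered here.

```python
# Hypothetical helpers: compare two pixel sets by size, position, and overlap.
# Each pixel set is a set of (row, col) tuples; this representation is an assumption.

def size_fraction(pixels, height, width):
    """Fraction of the image area covered by the pixel set."""
    return len(pixels) / (height * width)

def centroid(pixels):
    """Mean (row, col) of the pixel set, or None if the set is empty."""
    n = len(pixels)
    if n == 0:
        return None
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)

def iou(a, b):
    """Intersection-over-union of two pixel sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Toy 4x4 image: two 2x2 blocks that share one column.
set_a = {(r, c) for r in range(2) for c in range(0, 2)}
set_b = {(r, c) for r in range(2) for c in range(1, 3)}

print(size_fraction(set_a, 4, 4))      # share of the image covered by set_a
print(centroid(set_a), centroid(set_b))  # position summaries
print(iou(set_a, set_b))               # overlap between the two sets
```

In this toy case the two sets have equal size, centroids one column apart, and share two of six distinct pixels. A statistical comparison across architectures would aggregate such per-image values over a dataset.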

Citation History

[Chart: citation count over time, Jan 24, 2026 to Feb 13, 2026]