Poster "neural network interpretability" Papers

11 papers found

FACE: Faithful Automatic Concept Extraction

Dipkamal Bhusal, Michael Clifford, Sara Rampazzi et al.

NEURIPS 2025, arXiv:2510.11675
3 citations

From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit

Valérie Costa, Thomas Fel, Ekdeep S Lubana et al.

NEURIPS 2025, arXiv:2506.03093
15 citations

Inner Information Analysis Algorithm for Deep Neural Network based on Community

Guipeng Lan, Shuai Xiao, Meng Xi et al.

ICLR 2025
2 citations

Interpreting Emergent Features in Deep Learning-based Side-channel Analysis

Sengim Karayalcin, Marina Krček, Stjepan Picek

NEURIPS 2025, arXiv:2502.00384

The Computational Complexity of Circuit Discovery for Inner Interpretability

Federico Adolfi, Martina G. Vilas, Todd Wareham

ICLR 2025, arXiv:2410.08025
11 citations

VITAL: More Understandable Feature Visualization through Distribution Alignment and Relevant Information Flow

Ada Görgün, Bernt Schiele, Jonas Fischer

ICCV 2025, arXiv:2503.22399
1 citation

Exploring the Low-Pass Filtering Behavior in Image Super-Resolution

Haoyu Deng, Zijing Xu, Yule Duan et al.

ICML 2024, arXiv:2405.07919
10 citations

From Neurons to Neutrons: A Case Study in Interpretability

Ouail Kitouni, Niklas Nolte, Víctor Samuel Pérez-Díaz et al.

ICML 2024, arXiv:2405.17425
4 citations

Grokking Group Multiplication with Cosets

Dashiell Stander, Qinan Yu, Honglu Fan et al.

ICML 2024, arXiv:2312.06581
17 citations

Layerwise Change of Knowledge in Neural Networks

Xu Cheng, Lei Cheng, Zhaoran Peng et al.

ICML 2024, arXiv:2409.08712
7 citations

Layer-Wise Relevance Propagation with Conservation Property for ResNet

Seitaro Otsuki, Tsumugi Iida, Félix Doublet et al.

ECCV 2024, arXiv:2407.09115
10 citations