Poster "neural network interpretability" Papers
11 papers found
FACE: Faithful Automatic Concept Extraction
Dipkamal Bhusal, Michael Clifford, Sara Rampazzi et al.
NEURIPS 2025, arXiv:2510.11675
3 citations
From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit
Valérie Costa, Thomas Fel, Ekdeep S Lubana et al.
NEURIPS 2025, arXiv:2506.03093
15 citations
Inner Information Analysis Algorithm for Deep Neural Network based on Community
Guipeng Lan, Shuai Xiao, Meng Xi et al.
ICLR 2025
2 citations
Interpreting Emergent Features in Deep Learning-based Side-channel Analysis
Sengim Karayalcin, Marina Krček, Stjepan Picek
NEURIPS 2025, arXiv:2502.00384
The Computational Complexity of Circuit Discovery for Inner Interpretability
Federico Adolfi, Martina G. Vilas, Todd Wareham
ICLR 2025, arXiv:2410.08025
11 citations
VITAL: More Understandable Feature Visualization through Distribution Alignment and Relevant Information Flow
Ada Görgün, Bernt Schiele, Jonas Fischer
ICCV 2025, arXiv:2503.22399
1 citation
Exploring the Low-Pass Filtering Behavior in Image Super-Resolution
Haoyu Deng, Zijing Xu, Yule Duan et al.
ICML 2024, arXiv:2405.07919
10 citations
From Neurons to Neutrons: A Case Study in Interpretability
Ouail Kitouni, Niklas Nolte, Víctor Samuel Pérez-Díaz et al.
ICML 2024, arXiv:2405.17425
4 citations
Grokking Group Multiplication with Cosets
Dashiell Stander, Qinan Yu, Honglu Fan et al.
ICML 2024, arXiv:2312.06581
17 citations
Layerwise Change of Knowledge in Neural Networks
Xu Cheng, Lei Cheng, Zhaoran Peng et al.
ICML 2024, arXiv:2409.08712
7 citations
Layer-Wise Relevance Propagation with Conservation Property for ResNet
Seitaro Otsuki, Tsumugi Iida, Félix Doublet et al.
ECCV 2024, arXiv:2407.09115
10 citations