Poster "interpretability methods" Papers
7 papers found
$\text{I}^2\text{AM}$: Interpreting Image-to-Image Latent Diffusion Models via Bi-Attribution Maps
Junseo Park, Hyeryung Jang
ICLR 2025
Concept-Guided Interpretability via Neural Chunking
Shuchen Wu, Stephan Alaniz, Shyamgopal Karthik et al.
NeurIPS 2025, arXiv:2505.11576
GCAV: A Global Concept Activation Vector Framework for Cross-Layer Consistency in Interpretability
Zhenghao He, Sanchit Sinha, Guangzhi Xiong et al.
ICCV 2025, arXiv:2508.21197
Layer-Wise Modality Decomposition for Interpretable Multimodal Sensor Fusion
Jaehyun Park, Konyul Park, Daehun Kim et al.
NeurIPS 2025, arXiv:2511.00859
Residual Stream Analysis with Multi-Layer SAEs
Tim Lawson, Lucy Farnik, Conor Houghton et al.
ICLR 2025, arXiv:2409.04185
11 citations
Listenable Maps for Audio Classifiers
Francesco Paissan, Mirco Ravanelli, Cem Subakan
ICML 2024, arXiv:2403.13086
13 citations
SPADE: Sparsity-Guided Debugging for Deep Neural Networks
Arshia Soltani Moakhar, Eugenia Iofinova, Elias Frantar et al.
ICML 2024, arXiv:2310.04519
2 citations