"neuron interpretability" Papers

3 papers found

Filters:neuron interpretability Clear all

Conference

AAAI 2025 (3,028)COLM 2025 (418)CVPR 2025 (2,873)ICCV 2025 (2,701)ICLR 2025 (3,827)ICML 2025 (3,340)ISMAR 2025 (229)NEURIPS 2025 (5,858)AAAI 2024 (2,289)CVPR 2024 (2,716)ECCV 2024 (2,387)ICLR 2024 (2,297)ICML 2024 (2,635)

Paper Type

poster (24,624)paper (8,558)oral (1,594)spotlight (1,421)highlight (975)

Monet: Mixture of Monosemantic Experts for Transformers

Jungwoo Park, Young Jin Ahn, Kee-Eung Kim et al.

ICLR 2025arXiv:2412.04139

What should a neuron aim for? Designing local objective functions based on information theory

Andreas C. Schneider, Valentin Neuhaus, David Ehrlich et al.

ICLR 2025arXiv:2412.02482

Linear Explanations for Individual Neurons

Tuomas Oikarinen, Lily Weng

ICML 2024arXiv:2405.06855