Arthur Conmy
4 papers, 681 total citations
Papers (4)

Towards Automated Circuit Discovery for Mechanistic Interpretability
NeurIPS 2023, arXiv, 473 citations

Stealing part of a production language model
ICML 2024, arXiv, 145 citations

SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability
ICML 2025, arXiv, 58 citations

Scaling Sparse Feature Circuits For Studying In-Context Learning
ICML 2025, 5 citations