Jacob Andreas
Affiliations (2): Microsoft, MIT
16 papers
1,438 total citations
papers (16)
Pre-Trained Language Models for Interactive Decision-Making
NEURIPS 2022
arXiv
308
citations
Compositional Explanations of Neurons
NEURIPS 2020
arXiv
204
citations
A Benchmark for Systematic Generalization in Grounded Language Understanding
NEURIPS 2020
arXiv
152
citations
Linearity of Relation Decoding in Transformer Language Models
ICLR 2024
arXiv
143
citations
The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks
NEURIPS 2023
arXiv
142
citations
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling
ICML 2024
arXiv
101
citations
In-Context Language Learning: Architectures and Algorithms
ICML 2024
arXiv
83
citations
Eliciting Human Preferences with Language Models
ICLR 2025
arXiv
79
citations
Learning How Hard to Think: Input-Adaptive Allocation of LM Computation
ICLR 2025
arXiv
47
citations
A Multimodal Automated Interpretability Agent
ICML 2024
arXiv
45
citations
The Surprising Effectiveness of Test-Time Training for Few-Shot Learning
ICML 2025
arXiv
45
citations
The Consensus Game: Language Model Generation via Equilibrium Search
ICLR 2024
arXiv
35
citations
FIND: A Function Description Benchmark for Evaluating Interpretability Methods
NEURIPS 2023
arXiv
31
citations
Toward a Visual Concept Vocabulary for GAN Latent Space
ICCV 2021
arXiv
16
citations
Learning Linear Attention in Polynomial Time
NEURIPS 2025
arXiv
4
citations
Teachable Reinforcement Learning via Advice Distillation
NEURIPS 2021
arXiv
3
citations