Stuart J Russell
4 papers, 120 total citations
Papers (4)
Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game
ICLR 2024, arXiv, 106 citations
Extractive Structures Learned in Pretraining Enable Generalization on Finetuned Facts
ICML 2025, arXiv, 9 citations
BAMDP Shaping: a Unified Framework for Intrinsic Motivation and Reward Shaping
ICLR 2025, arXiv, 4 citations
Observation Interference in Partially Observable Assistance Games
ICML 2025, arXiv, 1 citation