Javier Rando
Affiliation: ETH Zurich
6 papers, 958 total citations
Papers (6)
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback (ICLR 2025, arXiv), 750 citations
Universal Jailbreak Backdoors from Poisoned Human Feedback (ICLR 2024, arXiv), 114 citations
Persistent Pre-training Poisoning of LLMs (ICLR 2025, arXiv), 38 citations
Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI (ICLR 2025, arXiv), 35 citations
Measuring Non-Adversarial Reproduction of Training Data in Large Language Models (ICLR 2025, arXiv), 13 citations
AutoAdvExBench: Benchmarking Autonomous Exploitation of Adversarial Example Defenses (ICML 2025, arXiv), 8 citations