Poster "adversarial prompting" Papers
4 papers found
AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs
Anselm Paulus, Arman Zharmagambetov, Chuan Guo et al.
ICML 2025, arXiv:2404.16873
132 citations
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks
Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion
ICLR 2025, arXiv:2404.02151
401 citations
ProAdvPrompter: A Two-Stage Journey to Effective Adversarial Prompting for LLMs
Hao Di, Tong He, Haishan Ye et al.
ICLR 2025
2 citations
The Right to Red-Team: Adversarial AI Literacy as a Civic Imperative in K-12 Education
Devan Walton, Haesol Bae
NeurIPS 2025