Poster "model vulnerabilities" Papers
2 papers found
Conference
Efficient Jailbreak Attack Sequences on Large Language Models via Multi-Armed Bandit-Based Context Switching
Aditya Ramesh, Shivam Bhardwaj, Aditya Saibewar et al.
ICLR 2025
3 citations
VERA: Variational Inference Framework for Jailbreaking Large Language Models
Anamika Lochab, Lu Yan, Patrick Pynadath et al.
NeurIPS 2025, arXiv:2506.22666
1 citation