Poster "transferable attacks" Papers
2 papers found
LARGO: Latent Adversarial Reflection through Gradient Optimization for Jailbreaking LLMs
Ran Li, Hao Wang, Chengzhi Mao
NEURIPS 2025 arXiv:2505.10838
4 citations
The Good, the Bad and the Ugly: Meta-Analysis of Watermarks, Transferable Attacks and Adversarial Defenses
Greg Gluch, Berkant Turan, Sai Ganesh Nagarajan et al.
NEURIPS 2025 arXiv:2410.08864