"prompt injection attacks" Papers
3 papers found
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents
Hao Li, Xiaogeng Liu, CHIU Chun et al.
NeurIPS 2025 | arXiv:2506.12104
15 citations
OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents
Thomas Kuntz, Agatha Duzan, Hao Zhao et al.
NeurIPS 2025 (spotlight) | arXiv:2506.14866
25 citations
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks
Ivan Evtimov, Arman Zharmagambetov, Aaron Grattafiori et al.
NeurIPS 2025 | arXiv:2504.18575
54 citations