"llm security" Papers
4 papers found
AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs
Xiaogeng Liu, Peiran Li, G. Edward Suh et al.
ICLR 2025, arXiv:2410.05295
115 citations
Can Watermarked LLMs be Identified by Users via Crafted Prompts?
Aiwei Liu, Sheng Guan, Yiming Liu et al.
ICLR 2025, arXiv:2410.03168
12 citations
Persistent Pre-training Poisoning of LLMs
Yiming Zhang, Javier Rando, Ivan Evtimov et al.
ICLR 2025, arXiv:2410.13722
38 citations
Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation
Danny Halawi, Alexander Wei, Eric Wallace et al.
ICML 2024, arXiv:2406.20053
65 citations