"trojan detection" Papers
2 papers found
Conference
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks
Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion
ICLR 2025, arXiv:2404.02151
401 citations
TrojanDec: Data-free Detection of Trojan Inputs in Self-supervised Learning
Yupei Liu, Yanting Wang, Jinyuan Jia
AAAI 2025, arXiv:2501.04108