"trigger detection" Papers
3 papers found
DeDe: Detecting Backdoor Samples for SSL Encoders via Decoders
Sizai Hou, Songze Li, Duanyi Yao
CVPR 2025, arXiv:2411.16154
2 citations
Probe before You Talk: Towards Black-box Defense against Backdoor Unalignment for Large Language Models
Biao Yi, Tiansheng Huang, Sishuo Chen et al.
ICLR 2025, arXiv:2506.16447
23 citations
Causality Based Front-door Defense Against Backdoor Attack on Language Models
Yiran Liu, Xiaoang Xu, Zhiyi Hou et al.
ICML 2024