Tinghao Xie
4 papers, 1,156 total citations
Papers (4)
Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
ICLR 2024, arXiv, 966 citations
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
ICLR 2025, arXiv, 151 citations
Fantastic Copyrighted Beasts and How (Not) to Generate Them
ICLR 2025, arXiv, 24 citations
BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection
ICLR 2024, arXiv, 15 citations