Papers by Yuxin Xiao
2 papers found
Conference
KScope: A Framework for Characterizing the Knowledge Status of Language Models
Yuxin Xiao, Shan Chen, Jack Gallifant et al.
NeurIPS 2025, arXiv:2506.07458
Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions
Yik Siu Chan, Narutatsu Ri, Yuxin Xiao et al.
ICML 2025, arXiv:2502.04322
12 citations