Xuwang Yin
3 papers, 849 total citations
Papers (3)
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
ICML 2024, arXiv, 802 citations
Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs
NeurIPS 2025, arXiv, 36 citations
Learning Energy-Based Models with Adversarial Training
ECCV 2022, arXiv, 11 citations