Yangsibo Huang
12 papers, 1,449 total citations

Papers (12)
Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation
ICLR 2024, arXiv, 430 citations

Evaluating Gradient Inversion Attacks and Defenses in Federated Learning
NeurIPS 2021, arXiv, 357 citations

Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
ICML 2024, arXiv, 184 citations

MUSE: Machine Unlearning Six-Way Evaluation for Language Models
ICLR 2025, arXiv, 168 citations

SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
ICLR 2025, arXiv, 151 citations

Recovering Private Text in Federated Learning of Language Models
NeurIPS 2022, arXiv, 98 citations

Fantastic Copyrighted Beasts and How (Not) to Generate Them
ICLR 2025, arXiv, 24 citations

Scaling Laws for Differentially Private Language Models
ICML 2025, arXiv, 12 citations

Scaling Embedding Layers in Language Models
NeurIPS 2025, arXiv, 11 citations

Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models
COLM 2025, arXiv, 9 citations

Sparsity-Preserving Differentially Private Training of Large Embedding Models
NeurIPS 2023, arXiv, 5 citations

Position: A Safe Harbor for AI Evaluation and Red Teaming
ICML 2024, 0 citations