Dawn Song
27
papers
7,353
total citations
papers (27)
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization
ICCV 2021
arXiv
2,156
citations
Natural Adversarial Examples
CVPR 2021
arXiv
1,783
citations
Model-Contrastive Federated Learning
CVPR 2021
arXiv
1,491
citations
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
NEURIPS 2023
arXiv
571
citations
The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks
CVPR 2020
arXiv
488
citations
PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures
CVPR 2022
arXiv
174
citations
Compositional Generalization via Neural-Symbolic Stack Machines
NEURIPS 2020
arXiv
106
citations
TrojDiff: Trojan Attacks on Diffusion Models With Diverse Targets
CVPR 2023
arXiv
104
citations
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content
ICML 2024
arXiv
67
citations
Synthesize, Execute and Debug: Learning to Repair for Neural Program Synthesis
NEURIPS 2020
arXiv
60
citations
DiffAttack: Evasion Attacks Against Diffusion-Based Adversarial Purification
NEURIPS 2023
arXiv
50
citations
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
ICML 2024
arXiv
49
citations
Data Shapley in One Training Run
ICLR 2025
arXiv
48
citations
Agent Instructs Large Language Models to be General Zero-Shot Reasoners
ICML 2024
arXiv
40
citations
Forecasting Future World Events With Neural Networks
NEURIPS 2022
arXiv
39
citations
CodeHalu: Investigating Code Hallucinations in LLMs via Execution-based Verification
AAAI 2025
arXiv
31
citations
C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models
ICML 2024
arXiv
31
citations
Scalability vs. Utility: Do We Have To Sacrifice One for the Other in Data Importance Quantification?
CVPR 2021
arXiv
21
citations
Towards practical differentially private causal graph discovery
NEURIPS 2020
arXiv
16
citations
How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios
NEURIPS 2022
arXiv
11
citations
Adversarial Examples for k-Nearest Neighbor Classifiers Based on Higher-Order Voronoi Diagrams
NEURIPS 2021
arXiv
10
citations
GRATH: Gradual Self-Truthifying for Large Language Models
ICML 2024
arXiv
7
citations
Position: Evolving AI Collectives Enhance Human Diversity and Enable Self-Regulation
ICML 2024
0
citations
Latent Execution for Neural Program Synthesis Beyond Domain-Specific Languages
NEURIPS 2021
0
citations
BIRD: Generalizable Backdoor Detection and Removal for Deep Reinforcement Learning
NEURIPS 2023
0
citations
SHINE: Shielding Backdoors in Deep Reinforcement Learning
ICML 2024
0
citations
Position: On the Societal Impact of Open Foundation Models
ICML 2024
0
citations