Rishub Tamirisa
4 papers · 519 total citations
Papers (4)
The WMDP Benchmark: Measuring and Reducing Malicious Use with Unlearning
ICML 2024 · arXiv · 333 citations
Tamper-Resistant Safeguards for Open-Weight LLMs
ICLR 2025 · arXiv · 113 citations
FedSelect: Personalized Federated Learning with Customized Selection of Parameters for Fine-Tuning
CVPR 2024 · arXiv · 37 citations
Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs
NeurIPS 2025 · arXiv · 36 citations