Huazheng Wang
Affiliations (1)
Beijing University of Posts and Telecommunications
10 papers
78 total citations
papers (10)
PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback
ICLR 2024 | arXiv | 38 citations
Communication Efficient Distributed Learning for Kernelized Contextual Bandits
NeurIPS 2022 | arXiv | 17 citations
Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective
NeurIPS 2023 | arXiv | 7 citations
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement
ICLR 2025 | arXiv | 5 citations
Tree Search-Based Evolutionary Bandits for Protein Sequence Optimization
AAAI 2024 | arXiv | 3 citations
Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization
NeurIPS 2022 | arXiv | 3 citations
Adversarial Attacks on Combinatorial Multi-Armed Bandits
ICML 2024 | arXiv | 3 citations
Stealthy Adversarial Attacks on Stochastic Multi-Armed Bandits
AAAI 2024 | arXiv | 1 citation
FCOM: A Federated Collaborative Online Monitoring Framework via Representation Learning
AAAI 2025 | arXiv | 1 citation
Provably Efficient Algorithm for Best Scoring Rule Identification in Online Principal-Agent Information Acquisition
ICML 2025 | arXiv | 0 citations