ResearchAlpha Leak

Han Zhong

Topic trends: 32,543 papers · similarity ≥ 0.4 · year ≥ 2024 · Data sourced from Semantic Scholar

34,598 papers | Abstracts: 31,650 (91.5%) | Citations: 34,598 (100.0%) | arXiv: 26,074 (75.4%)

Built: Feb 14, 2026, 11:22 PM AMS

14 papers · 650 total citations

Papers (14)

Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint

Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment

Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage

NEURIPS 2023 · arXiv

A Theoretical Analysis of Optimistic Proximal Policy Optimization in Linear Markov Decision Processes

NEURIPS 2023 · arXiv

Why Robust Generalization in Deep Learning is Difficult: Perspective of Expressive Power

NEURIPS 2022 · arXiv

Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration

NEURIPS 2023 · arXiv

Tackling Heavy-Tailed Rewards in Reinforcement Learning with Function Approximation: Minimax Optimal and Instance-Dependent Regret Bounds

NEURIPS 2023 · arXiv

Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret

Breaking the Moments Condition Barrier: No-Regret Algorithm for Bandits with Super Heavy-Tailed Payoffs

NEURIPS 2021 · arXiv

A Reduction-based Framework for Sequential Decision Making with Delayed Feedback

NEURIPS 2023 · arXiv

BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning

Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond

A3S: A General Active Clustering Method with Pairwise Constraints

Posterior Sampling for Competitive RL: Function Approximation and Partial Observation

NEURIPS 2023 · arXiv