Weinan Zhang
23 papers, 1,464 total citations
papers (23)
Vision-Language Foundation Models as Effective Robot Imitators
ICLR 2024
arXiv
320
citations
AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training
ICML 2024
arXiv
304
citations
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
NEURIPS 2022
arXiv
286
citations
Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning
NEURIPS 2023
arXiv
138
citations
NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning
NEURIPS 2022
arXiv
86
citations
Bootstrapped Transformer for Offline Reinforcement Learning
NEURIPS 2022
arXiv
48
citations
ReMA: Learning to Meta-Think for LLMs with Multi-agent Reinforcement Learning
NEURIPS 2025
arXiv
40
citations
Honor of Kings Arena: an Environment for Generalization in Competitive Reinforcement Learning
NEURIPS 2022
arXiv
40
citations
PerfectDou: Dominating DouDizhu with Perfect Information Distillation
NEURIPS 2022
arXiv
37
citations
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching
ICML 2024
arXiv
36
citations
Model-based Policy Optimization with Unsupervised Model Adaptation
NEURIPS 2020
arXiv
32
citations
On Effective Scheduling of Model-based Reinforcement Learning
NEURIPS 2021
arXiv
22
citations
Reinforcement Learning with Automated Auxiliary Loss Search
NEURIPS 2022
arXiv
18
citations
Lending Interaction Wings to Recommender Systems with Conversational Agents
NEURIPS 2023
arXiv
13
citations
Efficient Projection-free Algorithms for Saddle Point Problems
NEURIPS 2020
arXiv
13
citations
Learning Enhanced Representation for Tabular Data via Neighborhood Propagation
NEURIPS 2022
arXiv
9
citations
GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning
NEURIPS 2025
arXiv
6
citations
Autonomous Goal Detection and Cessation in Reinforcement Learning: A Case Study on Source Term Estimation
AAAI 2025
arXiv
5
citations
Score-Based Diffusion Policy Compatible with Reinforcement Learning via Optimal Transport
ICML 2025
arXiv
4
citations
Beyond Graph Convolution: Multimodal Recommendation with Topology-aware MLPs
AAAI 2025
arXiv
4
citations
Information-Theoretic Reward Decomposition for Generalizable RLHF
NEURIPS 2025
arXiv
3
citations
ContraDiff: Planning Towards High Return States via Contrastive Learning
ICLR 2025
0
citations
Curriculum Offline Imitating Learning
NEURIPS 2021
0
citations