"online reinforcement learning" Papers

13 papers found

A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning

Yuzheng Hu, Fan Wu, Haotian Ye et al.

NEURIPS 2025 (oral), arXiv:2505.19281
3 citations

Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data

Zhiyuan Zhou, Andy Peng, Qiyang Li et al.

ICLR 2025, arXiv:2412.07762
30 citations

Flow-Based Policy for Online Reinforcement Learning

Lei Lv, Yunfei Li, Yu Luo et al.

NEURIPS 2025, arXiv:2506.12811
12 citations

Learning Preferences without Interaction for Cooperative AI: A Hybrid Offline-Online Approach

Haitong Ma, Haoran Yu, Haobo Fu et al.

NEURIPS 2025

Outcome-Based Online Reinforcement Learning: Algorithms and Fundamental Limits

Fan Chen, Zeyu Jia, Alexander Rakhlin et al.

NEURIPS 2025, arXiv:2505.20268
4 citations

Prioritized Generative Replay

Ren Wang, Kevin Frans, Pieter Abbeel et al.

ICLR 2025, arXiv:2410.18082
9 citations

Training Language Models to Self-Correct via Reinforcement Learning

Aviral Kumar, Vincent Zhuang, Rishabh Agarwal et al.

ICLR 2025, arXiv:2409.12917
324 citations

Value-Guided Decision Transformer: A Unified Reinforcement Learning Framework for Online and Offline Settings

Hongling Zheng, Li Shen, Yong Luo et al.

NEURIPS 2025

Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces

Brahma Pavse, Matthew Zurek, Yudong Chen et al.

ICML 2024, arXiv:2306.01896
3 citations

Model-Free Robust $\phi$-Divergence Reinforcement Learning Using Both Offline and Online Data

Kishan Panaganti, Adam Wierman, Eric Mazumdar

ICML 2024

Pausing Policy Learning in Non-stationary Reinforcement Learning

Hyunin Lee, Ming Jin, Javad Lavaei et al.

ICML 2024 (oral), arXiv:2405.16053
2 citations

Scalable Online Exploration via Coverability

Philip Amortila, Dylan Foster, Akshay Krishnamurthy

ICML 2024, arXiv:2403.06571
9 citations

Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption

Chenlu Ye, Jiafan He, Quanquan Gu et al.

ICML 2024, arXiv:2402.08991
10 citations