Poster "offline reinforcement learning" Papers

75 papers found • Page 1 of 2

$q$-exponential family for policy optimization

Lingwei Zhu, Haseeb Shah, Han Wang et al.

ICLR 2025 • arXiv:2408.07245
2 citations

Adaptable Safe Policy Learning from Multi-task Data with Constraint Prioritized Decision Transformer

Ruiqi Xue, Ziqian Zhang, Lihe Li et al.

NEURIPS 2025

ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning

Zeyuan Liu, Zhihe Yang, Jiawei Xu et al.

NEURIPS 2025 • arXiv:2505.23871
2 citations

Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning

Hyungkyu Kang, Min-hwan Oh

ICLR 2025 • arXiv:2503.05306
3 citations

Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning

Jifeng Hu, Sili Huang, Zhejian Yang et al.

NEURIPS 2025 • arXiv:2505.01822

ContraDiff: Planning Towards High Return States via Contrastive Learning

Yixiang Shan, Zhengbang Zhu, Ting Long et al.

ICLR 2025

DoF: A Diffusion Factorization Framework for Offline Multi-Agent Reinforcement Learning

Chao Li, Ziwei Deng, Chenxing Lin et al.

ICLR 2025
7 citations

Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data

Zhiyuan Zhou, Andy Peng, Qiyang Li et al.

ICLR 2025 • arXiv:2412.07762
30 citations

Energy-Weighted Flow Matching for Offline Reinforcement Learning

Shiyuan Zhang, Weitong Zhang, Quanquan Gu

ICLR 2025 • arXiv:2503.04975
29 citations

Fat-to-Thin Policy Optimization: Offline Reinforcement Learning with Sparse Policies

Lingwei Zhu, Han Wang, Yukie Nagai

ICLR 2025

Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset

Yiqin Yang, Quanwei Wang, Chenghao Li et al.

ICLR 2025 • arXiv:2502.18955

Finite-Time Bounds for Average-Reward Fitted Q-Iteration

Jongmin Lee, Ernest Ryu

NEURIPS 2025 • arXiv:2510.17391

FOSP: Fine-tuning Offline Safe Policy through World Models

Chenyang Cao, Yucheng Xin, Silang Wu et al.

ICLR 2025 • arXiv:2407.04942
3 citations

GOPlan: Goal-conditioned Offline Reinforcement Learning by Planning with Learned Models

Mianchu Wang, Rui Yang, Xi Chen et al.

ICLR 2025 • arXiv:2310.20025
16 citations

Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models

Uladzislau Sobal, Wancong Zhang, Kyunghyun Cho et al.

NEURIPS 2025 • arXiv:2502.14819
20 citations

Learning on One Mode: Addressing Multi-modality in Offline Reinforcement Learning

Mianchu Wang, Yue Jin, Giovanni Montana

ICLR 2025 • arXiv:2412.03258
2 citations

Learning Preferences without Interaction for Cooperative AI: A Hybrid Offline-Online Approach

Haitong Ma, Haoran Yu, Haobo Fu et al.

NEURIPS 2025

Local Manifold Approximation and Projection for Manifold-Aware Diffusion Planning

Kyowoon Lee, Jaesik Choi

ICML 2025 • arXiv:2506.00867
4 citations

Mitigating Reward Over-optimization in Direct Alignment Algorithms with Importance Sampling

Nguyen Phuc, Ngoc-Hieu Nguyen, Duy M. H. Nguyen et al.

NEURIPS 2025 • arXiv:2506.08681

Model-based Offline Reinforcement Learning with Lower Expectile Q-Learning

Kwanyoung Park, Youngwoon Lee

ICLR 2025 • arXiv:2407.00699
5 citations

Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds

Zhiyong Wang, Dongruo Zhou, John C.S. Lui et al.

ICLR 2025 • arXiv:2408.08994
10 citations

Model-Free Offline Reinforcement Learning with Enhanced Robustness

Chi Zhang, Zain Ulabedeen Farhat, George Atia et al.

ICLR 2025
5 citations

Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol

Pai Liu, Lingfeng Zhao, Shivangi Agarwal et al.

NEURIPS 2025 • arXiv:2502.08021
4 citations

MOSDT: Self-Distillation-Based Decision Transformer for Multi-Agent Offline Safe Reinforcement Learning

Yuchen Xia, Yunjian Xu

NEURIPS 2025

Neural Stochastic Differential Equations for Uncertainty-Aware Offline RL

Cevahir Koprulu, Franck Djeumou, Ufuk Topcu

ICLR 2025

Offline RL by Reward-Weighted Fine-Tuning for Conversation Optimization

Subhojyoti Mukherjee, Viet Lai, Raghavendra Addanki et al.

NEURIPS 2025 • arXiv:2506.06964
3 citations

Offline RL in Regular Decision Processes: Sample Efficiency via Language Metrics

Ahana Deb, Roberto Cipollone, Anders Jonsson et al.

ICLR 2025

OGBench: Benchmarking Offline Goal-Conditioned RL

Seohong Park, Kevin Frans, Benjamin Eysenbach et al.

ICLR 2025 • arXiv:2410.20092
90 citations

Online Optimization for Offline Safe Reinforcement Learning

Yassine Chemingui, Aryan Deshwal, Alan Fern et al.

NEURIPS 2025 • arXiv:2510.22027

Preference Elicitation for Offline Reinforcement Learning

Alizée Pace, Bernhard Schölkopf, Gunnar Rätsch et al.

ICLR 2025 • arXiv:2406.18450
2 citations

Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning

Jongchan Park, Mingyu Park, Donghwan Lee

NEURIPS 2025 • arXiv:2505.05701
1 citation

Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning

Joey Hong, Anca Dragan, Sergey Levine

ICLR 2025 • arXiv:2411.05193
8 citations

Rebalancing Return Coverage for Conditional Sequence Modeling in Offline Reinforcement Learning

Wensong Bai, Chufan Chen, Yichao Fu et al.

NEURIPS 2025

Reinforcement Learning for Individual Optimal Policy from Heterogeneous Data

Rui Miao, Babak Shahbaba, Annie Qu

NEURIPS 2025 • arXiv:2505.09496
1 citation

RLZero: Direct Policy Inference from Language Without In-Domain Supervision

Harshit Sushil Sikchi, Siddhant Agarwal, Pranaya Jajoo et al.

NEURIPS 2025 • arXiv:2412.05718
3 citations

RTDiff: Reverse Trajectory Synthesis via Diffusion for Offline Reinforcement Learning

Qianlan Yang, Yu-Xiong Wang

ICLR 2025

Scrutinize What We Ignore: Reining In Task Representation Shift Of Context-Based Offline Meta Reinforcement Learning

Hai Zhang, Boyuan Zheng, Tianying Ji et al.

ICLR 2025 • arXiv:2405.12001

Value-aligned Behavior Cloning for Offline Reinforcement Learning via Bi-level Optimization

Xingyu Jiang, Ning Gao, Xiuhui Zhang et al.

ICLR 2025

Value-Guided Decision Transformer: A Unified Reinforcement Learning Framework for Online and Offline Settings

Hongling Zheng, Li Shen, Yong Luo et al.

NEURIPS 2025

What Makes a Good Diffusion Planner for Decision Making?

Haofei Lu, Dongqi Han, Yifei Shen et al.

ICLR 2025 • arXiv:2503.00535
27 citations

Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning

Tenglong Liu, Yang Li, Yixing Lan et al.

ICML 2024 • arXiv:2405.19909
13 citations

A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Linear MDPs

Kihyuk Hong, Ambuj Tewari

ICML 2024 • arXiv:2402.04493
4 citations

ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories

Qianlan Yang, Yu-Xiong Wang

ICML 2024 • arXiv:2406.04323
1 citation

Bayesian Design Principles for Offline-to-Online Reinforcement Learning

Hao Hu, Yiqin Yang, Jianing Ye et al.

ICML 2024 • arXiv:2405.20984
8 citations

Causal Action Influence Aware Counterfactual Data Augmentation

Núria Armengol Urpí, Marco Bagatella, Marin Vlastelica et al.

ICML 2024 • arXiv:2405.18917
11 citations

Confidence Aware Inverse Constrained Reinforcement Learning

Sriram Ganapathi Subramanian, Guiliang Liu, Mohammed Elmahgiubi et al.

ICML 2024 • arXiv:2406.16782
5 citations

Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning

Xiaoyu Wen, Chenjia Bai, Kang Xu et al.

ICML 2024 • arXiv:2405.06192
10 citations

Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics

Xinyu Zhang, Wenjie Qiu, Yi-Chen Li et al.

ICML 2024 • arXiv:2402.11317
1 citation

DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching

Guanghe Li, Yixiang Shan, Zhengbang Zhu et al.

ICML 2024 • arXiv:2402.02439
36 citations

Discovering Multiple Solutions from a Single Task in Offline Reinforcement Learning

Takayuki Osa, Tatsuya Harada

ICML 2024 • arXiv:2406.05993
3 citations