Poster Papers: "offline reinforcement learning"
75 papers found • Page 1 of 2
$q$-exponential family for policy optimization
Lingwei Zhu, Haseeb Shah, Han Wang et al.
Adaptable Safe Policy Learning from Multi-task Data with Constraint Prioritized Decision Transformer
Ruiqi Xue, Ziqian Zhang, Lihe Li et al.
ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning
Zeyuan Liu, Zhihe Yang, Jiawei Xu et al.
Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning
Hyungkyu Kang, Min-hwan Oh
Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning
Jifeng Hu, Sili Huang, Zhejian Yang et al.
ContraDiff: Planning Towards High Return States via Contrastive Learning
Yixiang Shan, Zhengbang Zhu, Ting Long et al.
DoF: A Diffusion Factorization Framework for Offline Multi-Agent Reinforcement Learning
Chao Li, Ziwei Deng, Chenxing Lin et al.
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
Zhiyuan Zhou, Andy Peng, Qiyang Li et al.
Energy-Weighted Flow Matching for Offline Reinforcement Learning
Shiyuan Zhang, Weitong Zhang, Quanquan Gu
Fat-to-Thin Policy Optimization: Offline Reinforcement Learning with Sparse Policies
Lingwei Zhu, Han Wang, Yukie Nagai
Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset
Yiqin Yang, Quanwei Wang, Chenghao Li et al.
Finite-Time Bounds for Average-Reward Fitted Q-Iteration
Jongmin Lee, Ernest Ryu
FOSP: Fine-tuning Offline Safe Policy through World Models
Chenyang Cao, Yucheng Xin, Silang Wu et al.
GOPlan: Goal-conditioned Offline Reinforcement Learning by Planning with Learned Models
Mianchu Wang, Rui Yang, Xi Chen et al.
Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models
Uladzislau Sobal, Wancong Zhang, Kyunghyun Cho et al.
Learning on One Mode: Addressing Multi-modality in Offline Reinforcement Learning
Mianchu Wang, Yue Jin, Giovanni Montana
Learning Preferences without Interaction for Cooperative AI: A Hybrid Offline-Online Approach
Haitong Ma, Haoran Yu, Haobo Fu et al.
Local Manifold Approximation and Projection for Manifold-Aware Diffusion Planning
Kyowoon Lee, Jaesik Choi
Mitigating Reward Over-optimization in Direct Alignment Algorithms with Importance Sampling
Nguyen Phuc, Ngoc-Hieu Nguyen, Duy M. H. Nguyen et al.
Model-based Offline Reinforcement Learning with Lower Expectile Q-Learning
Kwanyoung Park, Youngwoon Lee
Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds
Zhiyong Wang, Dongruo Zhou, John C.S. Lui et al.
Model-Free Offline Reinforcement Learning with Enhanced Robustness
Chi Zhang, Zain Ulabedeen Farhat, George Atia et al.
Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol
Pai Liu, Lingfeng Zhao, Shivangi Agarwal et al.
MOSDT: Self-Distillation-Based Decision Transformer for Multi-Agent Offline Safe Reinforcement Learning
Yuchen Xia, Yunjian Xu
Neural Stochastic Differential Equations for Uncertainty-Aware Offline RL
Cevahir Koprulu, Franck Djeumou, Ufuk Topcu
Offline RL by Reward-Weighted Fine-Tuning for Conversation Optimization
Subhojyoti Mukherjee, Viet Lai, Raghavendra Addanki et al.
Offline RL in Regular Decision Processes: Sample Efficiency via Language Metrics
Ahana Deb, Roberto Cipollone, Anders Jonsson et al.
OGBench: Benchmarking Offline Goal-Conditioned RL
Seohong Park, Kevin Frans, Benjamin Eysenbach et al.
Online Optimization for Offline Safe Reinforcement Learning
Yassine Chemingui, Aryan Deshwal, Alan Fern et al.
Preference Elicitation for Offline Reinforcement Learning
Alizée Pace, Bernhard Schölkopf, Gunnar Ratsch et al.
Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning
Jongchan Park, Mingyu Park, Donghwan Lee
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning
Joey Hong, Anca Dragan, Sergey Levine
Rebalancing Return Coverage for Conditional Sequence Modeling in Offline Reinforcement Learning
Wensong Bai, Chufan Chen, Yichao Fu et al.
Reinforcement Learning for Individual Optimal Policy from Heterogeneous Data
Rui Miao, Babak Shahbaba, Annie Qu
RLZero: Direct Policy Inference from Language Without In-Domain Supervision
Harshit Sushil Sikchi, Siddhant Agarwal, Pranaya Jajoo et al.
RTDiff: Reverse Trajectory Synthesis via Diffusion for Offline Reinforcement Learning
Qianlan Yang, Yu-Xiong Wang
Scrutinize What We Ignore: Reining In Task Representation Shift Of Context-Based Offline Meta Reinforcement Learning
Hai Zhang, Boyuan Zheng, Tianying Ji et al.
Value-aligned Behavior Cloning for Offline Reinforcement Learning via Bi-level Optimization
Xingyu Jiang, Ning Gao, Xiuhui Zhang et al.
Value-Guided Decision Transformer: A Unified Reinforcement Learning Framework for Online and Offline Settings
Hongling Zheng, Li Shen, Yong Luo et al.
What Makes a Good Diffusion Planner for Decision Making?
Haofei Lu, Dongqi Han, Yifei Shen et al.
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
Tenglong Liu, Yang Li, Yixing Lan et al.
A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Linear MDPs
Kihyuk Hong, Ambuj Tewari
ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories
Qianlan Yang, Yu-Xiong Wang
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
Hao Hu, Yiqin Yang, Jianing Ye et al.
Causal Action Influence Aware Counterfactual Data Augmentation
Núria Armengol Urpí, Marco Bagatella, Marin Vlastelica et al.
Confidence Aware Inverse Constrained Reinforcement Learning
Sriram Ganapathi Subramanian, Guiliang Liu, Mohammed Elmahgiubi et al.
Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning
Xiaoyu Wen, Chenjia Bai, Kang Xu et al.
Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics
Xinyu Zhang, Wenjie Qiu, Yi-Chen Li et al.
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching
Guanghe Li, Yixiang Shan, Zhengbang Zhu et al.
Discovering Multiple Solutions from a Single Task in Offline Reinforcement Learning
Takayuki Osa, Tatsuya Harada