"safe reinforcement learning" Papers
18 papers found
Adaptable Safe Policy Learning from Multi-task Data with Constraint Prioritized Decision Transformer
Ruiqi Xue, Ziqian Zhang, Lihe Li et al.
Alignment of Large Language Models with Constrained Learning
Botong Zhang, Shuo Li, Ignacio Hounie et al.
Explainably Safe Reinforcement Learning
Sabine Rieder, Stefan Pranger, Debraj Chakraborty et al.
Fuz-RL: A Fuzzy-Guided Robust Framework for Safe Reinforcement Learning under Uncertainty
Xu Wan, Chao Yang, Cheng Yang et al.
HASARD: A Benchmark for Vision-Based Safe Reinforcement Learning in Embodied Agents
Tristan Tomilin, Meng Fang, Mykola Pechenizkiy
MOSDT: Self-Distillation-Based Decision Transformer for Multi-Agent Offline Safe Reinforcement Learning
Yuchen Xia, Yunjian Xu
Online Optimization for Offline Safe Reinforcement Learning
Yassine Chemingui, Aryan Deshwal, Alan Fern et al.
Provably Efficient RL under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation
Toshinori Kitamura, Arnob Ghosh, Tadashi Kozuno et al.
Safe RLHF-V: Safe Reinforcement Learning from Multi-modal Human Feedback
Jiaming Ji, Xinyu Chen, Rui Pan et al.
SonoGym: High Performance Simulation for Challenging Surgical Tasks with Robotic Ultrasound
Yunke Ao, Masoud Moghani, Mayank Mittal et al.
Tilted Quantile Gradient Updates for Quantile-Constrained Reinforcement Learning
Chenglin Li, Guangchun Ruan, Hua Geng
Feasibility Consistent Representation Learning for Safe Reinforcement Learning
Zhepeng Cen, Yihang Yao, Zuxin Liu et al.
Feasible Reachable Policy Iteration
Shentao Qin, Yujie Yang, Yao Mu et al.
Imitate the Good and Avoid the Bad: An Incremental Approach to Safe Reinforcement Learning
Huy Hoang, Tien Mai, Pradeep Varakantham
Langevin Policy for Safe Reinforcement Learning
Fenghao Lei, Long Yang, Shiting Wen et al.
SafeDreamer: Safe Reinforcement Learning with World Models
Weidong Huang, Jiaming Ji, Chunhe Xia et al.
Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation
Juntao Dai, Yaodong Yang, Qian Zheng et al.
Temporal Logic Specification-Conditioned Decision Transformer for Offline Safe Reinforcement Learning
Zijian Guo, Weichao Zhou, Wenchao Li