"reward shaping" Papers

14 papers found

Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards

Charles Arnal, Gaëtan Narozniak, Vivien Cabannes et al.

NEURIPS 2025 arXiv:2506.20520
17 citations

BAMDP Shaping: a Unified Framework for Intrinsic Motivation and Reward Shaping

Aly Lidayan, Michael Dennis, Stuart Russell

ICLR 2025 arXiv:2409.05358
4 citations

GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning

Kelin Yu, Sheng Zhang, Harshit Soora et al.

ICCV 2025 arXiv:2508.11049
4 citations

GoalLadder: Incremental Goal Discovery with Vision-Language Models

Alexey Zakharov, Shimon Whiteson

NEURIPS 2025 arXiv:2506.16396
1 citation

HYPRL: Reinforcement Learning of Control Policies for Hyperproperties

Tzu-Han Hsu, Arshia Rafieioskouei, Borzoo Bonakdarpour

NEURIPS 2025 arXiv:2504.04675
2 citations

LOPT: Learning Optimal Pigovian Tax in Sequential Social Dilemmas

Yun Hua, Shang Gao, Wenhao Li et al.

NEURIPS 2025

Preference Distillation via Value based Reinforcement Learning

Minchan Kwon, Junwon Ko, Kangil Kim et al.

NEURIPS 2025 arXiv:2509.16965

Progress Reward Model for Reinforcement Learning via Large Language Models

Xiuhui Zhang, Ning Gao, Xingyu Jiang et al.

NEURIPS 2025

S-GRPO: Early Exit via Reinforcement Learning in Reasoning Models

Muzhi Dai, Chenxu Yang, Qingyi Si

NEURIPS 2025 oral arXiv:2505.07686
52 citations

TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs

Yunheng Li, Jing Cheng, Shaoyong Jia et al.

NEURIPS 2025 oral arXiv:2509.18056
7 citations

Time Reversal Symmetry for Efficient Robotic Manipulations in Deep Reinforcement Learning

Yunpeng Jiang, Jianshu Hu, Paul Weng et al.

NEURIPS 2025 oral arXiv:2505.13925

Vision-Based Generic Potential Function for Policy Alignment in Multi-Agent Reinforcement Learning

Hao Ma, Shijie Wang, Zhiqiang Pu et al.

AAAI 2025 paper arXiv:2502.13430

EvIL: Evolution Strategies for Generalisable Imitation Learning

Silvia Sapora, Gokul Swamy, Christopher Lu et al.

ICML 2024 arXiv:2406.11905
8 citations

Reward Shaping for Reinforcement Learning with An Assistant Reward Agent

Haozhe Ma, Kuankuan Sima, Thanh Vinh Vo et al.

ICML 2024