"inverse reinforcement learning" Papers
17 papers found
Approximated Variational Bayesian Inverse Reinforcement Learning for Large Language Model Alignment
Yuang Cai, Yuyu Yuan, Jinsheng Shi et al.
Estimating cognitive biases with attention-aware inverse planning
Sounak Banerjee, Daphne Cornelisse, Deepak Gopinath et al.
Foresight in Motion: Reinforcing Trajectory Prediction with Reward Heuristics
Muleilan Pei, Shaoshuai Shi, Xuesong Chen et al.
Insights from the Inverse: Reconstructing LLM Training Goals Through Inverse Reinforcement Learning
Jared Joselowitz, Ritam Majumdar, Arjun Jagota et al.
Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching
Arnav Kumar Jain, Harley Wiltzer, Jesse Farebrother et al.
Ranking-based Preference Optimization for Diffusion Models from Implicit User Feedback
Yi-Lun Wu, Bo-Kai Ruan, Chiang Tseng et al.
Trajectory Graph Learning: Aligning with Long Trajectories in Reinforcement Learning Without Reward Design
Yunfan Li, Eric Liu, Lin Yang
A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback
Kihyun Kim, Jiawei Zhang, Asuman Ozdaglar et al.
Environment Design for Inverse Reinforcement Learning
Thomas Kleine Buening, Victor Villin, Christos Dimitrakakis
EvIL: Evolution Strategies for Generalisable Imitation Learning
Silvia Sapora, Gokul Swamy, Christopher Lu et al.
Expert Proximity as Surrogate Rewards for Single Demonstration Imitation Learning
Chia-Cheng Chiang, Li-Cheng Lan, Wei-Fang Sun et al.
Hybrid Inverse Reinforcement Learning
Juntao Ren, Gokul Swamy, Steven Wu et al.
Is Inverse Reinforcement Learning Harder than Standard Reinforcement Learning? A Theoretical Perspective
Lei Zhao, Mengdi Wang, Yu Bai
Massively Scalable Inverse Reinforcement Learning in Google Maps
Matt Barnes, Matthew Abueg, Oliver Lange et al.
Offline Inverse RL: New Solution Concepts and Provably Efficient Algorithms
Filippo Lazzati, Mirco Mutti, Alberto Maria Metelli
Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input
Andi Peng, Yuying Sun, Tianmin Shu et al.
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
Han Shen, Zhuoran Yang, Tianyi Chen