"policy learning" Papers

21 papers found

Bootstrapped Model Predictive Control

Yuhang Wang, Hanwei Guo, Sizhe Wang et al.

ICLR 2025, arXiv:2503.18871, 6 citations

Bridging Equivariant GNNs and Spherical CNNs for Structured Physical Domains

Colin Kohler, Purvik Patel, Nathan Vaska et al.

NEURIPS 2025

BTBS-LNS: Binarized-Tightening, Branch and Search on Learning LNS Policies for MIP

Hao Yuan, Wenli Ouyang, Changwen Zhang et al.

ICLR 2025

Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections

Xiaomeng Xu, Yifan Hou, Zeyi Liu et al.

NEURIPS 2025, arXiv:2506.16685, 7 citations

Defending Against Sophisticated Poisoning Attacks with RL-based Aggregation in Federated Learning

Yujing Wang, Hainan Zhang, Sijia Wen et al.

AAAI 2025 (paper), arXiv:2406.14217, 3 citations

DriveDPO: Policy Learning via Safety DPO For End-to-End Autonomous Driving

Shuyao Shang, Yuntao Chen, Yuqi Wang et al.

NEURIPS 2025, arXiv:2509.17940, 8 citations

Efficient Reinforcement Learning Through Adaptively Pretrained Visual Encoder

Yuhan Zhang, Guoqing Ma, Guangfu Hao et al.

AAAI 2025 (paper), arXiv:2502.05555, 1 citation

EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception

Sanjoy Chowdhury, Subrata Biswas, Sayan Nag et al.

ICCV 2025, arXiv:2506.21080, 2 citations

LaRes: Evolutionary Reinforcement Learning with LLM-based Adaptive Reward Search

Pengyi Li, Hongyao Tang, Jinbin Qiao et al.

NEURIPS 2025

Learning 3D Persistent Embodied World Models

Siyuan Zhou, Yilun Du, Yuncong Yang et al.

NEURIPS 2025, arXiv:2505.05495, 17 citations

PN-GAIL: Leveraging Non-optimal Information from Imperfect Demonstrations

Qiang Liu, Huiqiao Fu, Kaiqiang Tang et al.

ICLR 2025

SimulMEGA: MoE Routers are Advanced Policy Makers for Simultaneous Speech Translation

Chenyang Le, Bing Han, Jinshun Li et al.

NEURIPS 2025, arXiv:2509.01200

What Matters in Learning from Large-Scale Datasets for Robot Manipulation

Vaibhav Saxena, Matthew Bronars, Nadun Ranawaka Arachchige et al.

ICLR 2025, arXiv:2506.13536, 17 citations

DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching

Guanghe Li, Yixiang Shan, Zhengbang Zhu et al.

ICML 2024, arXiv:2402.02439, 36 citations

Effect-Invariant Mechanisms for Policy Generalization

Sorawit Saengkyongam, Niklas Pfister, Predrag Klasnja et al.

ICML 2024, arXiv:2306.10983

Fair Off-Policy Learning from Observational Data

Dennis Frauen, Valentyn Melnychuk, Stefan Feuerriegel

ICML 2024 (oral), arXiv:2303.08516, 10 citations

Learning Uncertainty-Aware Temporally-Extended Actions

Joongkyu Lee, Seung Joon Park, Yunhao Tang et al.

AAAI 2024 (paper), arXiv:2402.05439, 3 citations

Off-policy Evaluation Beyond Overlap: Sharp Partial Identification Under Smoothness

Samir Khan, Martin Saveski, Johan Ugander

ICML 2024

Pausing Policy Learning in Non-stationary Reinforcement Learning

Hyunin Lee, Ming Jin, Javad Lavaei et al.

ICML 2024 (oral), arXiv:2405.16053, 2 citations

Policy Learning for Balancing Short-Term and Long-Term Rewards

Peng Wu, Ziyu Shen, Feng Xie et al.

ICML 2024, arXiv:2405.03329, 14 citations

Reinforcement Learning within Tree Search for Fast Macro Placement

Zijie Geng, Jie Wang, Ziyan Liu et al.

ICML 2024