"policy learning" Papers

21 papers found

Bootstrapped Model Predictive Control

Yuhang Wang, Hanwei Guo, Sizhe Wang et al.

ICLR 2025 | arXiv:2503.18871

Bridging Equivariant GNNs and Spherical CNNs for Structured Physical Domains

Colin Kohler, Purvik Patel, Nathan Vaska et al.

NEURIPS 2025

BTBS-LNS: Binarized-Tightening, Branch and Search on Learning LNS Policies for MIP

Hao Yuan, Wenli Ouyang, Changwen Zhang et al.

ICLR 2025

Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections

Xiaomeng Xu, Yifan Hou, Zeyi Liu et al.

NEURIPS 2025 | arXiv:2506.16685

Defending Against Sophisticated Poisoning Attacks with RL-based Aggregation in Federated Learning

Yujing Wang, Hainan Zhang, Sijia Wen et al.

AAAI 2025 | paper | arXiv:2406.14217

DriveDPO: Policy Learning via Safety DPO For End-to-End Autonomous Driving

Shuyao Shang, Yuntao Chen, Yuqi Wang et al.

NEURIPS 2025 | arXiv:2509.17940

Efficient Reinforcement Learning Through Adaptively Pretrained Visual Encoder

Yuhan Zhang, Guoqing Ma, Guangfu Hao et al.

AAAI 2025 | paper | arXiv:2502.05555

EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception

Sanjoy Chowdhury, Subrata Biswas, Sayan Nag et al.

ICCV 2025 | arXiv:2506.21080

LaRes: Evolutionary Reinforcement Learning with LLM-based Adaptive Reward Search

Pengyi Li, Hongyao Tang, Jinbin Qiao et al.

NEURIPS 2025

Learning 3D Persistent Embodied World Models

Siyuan Zhou, Yilun Du, Yuncong Yang et al.

NEURIPS 2025 | arXiv:2505.05495

PN-GAIL: Leveraging Non-optimal Information from Imperfect Demonstrations

Qiang Liu, Huiqiao Fu, Kaiqiang Tang et al.

ICLR 2025

SimulMEGA: MoE Routers are Advanced Policy Makers for Simultaneous Speech Translation

Chenyang Le, Bing Han, Jinshun Li et al.

NEURIPS 2025 | arXiv:2509.01200

What Matters in Learning from Large-Scale Datasets for Robot Manipulation

Vaibhav Saxena, Matthew Bronars, Nadun Ranawaka Arachchige et al.

ICLR 2025 | arXiv:2506.13536

DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching

Guanghe Li, Yixiang Shan, Zhengbang Zhu et al.

ICML 2024 | arXiv:2402.02439

Effect-Invariant Mechanisms for Policy Generalization

Sorawit Saengkyongam, Niklas Pfister, Predrag Klasnja et al.

ICML 2024 | arXiv:2306.10983

Fair Off-Policy Learning from Observational Data

Dennis Frauen, Valentyn Melnychuk, Stefan Feuerriegel

ICML 2024 | oral | arXiv:2303.08516

Learning Uncertainty-Aware Temporally-Extended Actions

Joongkyu Lee, Seung Joon Park, Yunhao Tang et al.

AAAI 2024 | paper | arXiv:2402.05439

Off-policy Evaluation Beyond Overlap: Sharp Partial Identification Under Smoothness

Samir Khan, Martin Saveski, Johan Ugander

ICML 2024

Pausing Policy Learning in Non-stationary Reinforcement Learning

Hyunin Lee, Ming Jin, Javad Lavaei et al.

ICML 2024 | oral | arXiv:2405.16053

Policy Learning for Balancing Short-Term and Long-Term Rewards

Peng Wu, Ziyu Shen, Feng Xie et al.

ICML 2024 | arXiv:2405.03329

Reinforcement Learning within Tree Search for Fast Macro Placement

Zijie Geng, Jie Wang, Ziyan Liu et al.

ICML 2024