"off-policy learning" Papers

9 papers found

Bootstrap Off-policy with World Model

Guojian Zhan, Likun Wang, Xiangteng Zhang et al.

NEURIPS 2025, arXiv:2511.00423

MultiScale Contextual Bandits for Long Term Objectives

Richa Rastogi, Yuta Saito, Thorsten Joachims

NEURIPS 2025, arXiv:2503.17674

Revisiting a Design Choice in Gradient Temporal Difference Learning

Xiaochi Qian, Shangtong Zhang

ICLR 2025, oral, arXiv:2308.01170

SEMDICE: Off-policy State Entropy Maximization via Stationary Distribution Correction Estimation

Jongmin Lee, Meiqi Sun, Pieter Abbeel

ICLR 2025, arXiv:2512.10042

ShiQ: Bringing back Bellman to LLMs

Pierre Clavier, Nathan Grinsztajn, Raphaël Avalos et al.

NEURIPS 2025, arXiv:2505.11081

Simplifying Deep Temporal Difference Learning

Matteo Gallici, Mattie Fellows, Benjamin Ellis et al.

ICLR 2025, oral, arXiv:2407.04811

Turning Sand to Gold: Recycling Data to Bridge On-Policy and Off-Policy Learning via Causal Bound

Tal Fiskus, Uri Shaham

NEURIPS 2025, arXiv:2507.11269

Value Improved Actor Critic Algorithms

Yaniv Oren, Moritz Zanger, Pascal van der Vaart et al.

NEURIPS 2025, arXiv:2406.01423

Learning to Explore in POMDPs with Informational Rewards

Annie Xie, Logan M. Bhamidipaty, Evan Liu et al.