"temporal difference learning" Papers

18 papers found

A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging

Sajad Khodadadian, Martin Zubeldia

NeurIPS 2025, oral, arXiv:2505.21796
2 citations

Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners

Michal Nauman, Marek Cygan, Carmelo Sferrazza et al.

NeurIPS 2025, oral, arXiv:2505.23150
9 citations

Fast and Slow Streams for Online Time Series Forecasting Without Information Leakage

Ying-yee Ava Lau, Zhiwen Shao, Dit-Yan Yeung

ICLR 2025, oral
8 citations

On the Linear Speedup of Personalized Federated Reinforcement Learning with Shared Representations

Guojun Xiong, Shufan Wang, Daniel Jiang et al.

ICLR 2025, oral, arXiv:2411.15014
4 citations

Physics-informed Temporal Difference Metric Learning for Robot Motion Planning

Ruiqi Ni, Zherong Pan, Ahmed Hussain Qureshi

ICLR 2025, oral, arXiv:2505.05691
2 citations

Real-Time Recurrent Reinforcement Learning

Julian Lemmel, Radu Grosu

AAAI 2025, paper, arXiv:2311.04830
6 citations

Revisiting a Design Choice in Gradient Temporal Difference Learning

Xiaochi Qian, Shangtong Zhang

ICLR 2025, oral, arXiv:2308.01170
6 citations

Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining

Jie Cheng, Ruixi Qiao, Ma Yingwei et al.

ICLR 2025, oral, arXiv:2410.00564
8 citations

Simplifying Deep Temporal Difference Learning

Matteo Gallici, Mattie Fellows, Benjamin Ellis et al.

ICLR 2025, oral, arXiv:2407.04811
56 citations

Temporal Difference Learning: Why It Can Be Fast and How It Will Be Faster

Patrick Schnell, Luca Guastoni, Nils Thuerey

ICLR 2025, oral

The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise

Shuze Daniel Liu, Shuhang Chen, Shangtong Zhang

NeurIPS 2025, oral, arXiv:2401.07844
13 citations

Towards Provable Emergence of In-Context Reinforcement Learning

Jiuqi Wang, Rohan Chandra, Shangtong Zhang

NeurIPS 2025, oral, arXiv:2509.18389
1 citation

Transformers Can Learn Temporal Difference Methods for In-Context Reinforcement Learning

Jiuqi Wang, Ethan Blaser, Hadi Daneshmand et al.

ICLR 2025, oral, arXiv:2405.13861
15 citations

An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks

Zhifa Ke, Zaiwen Wen, Junyu Zhang

ICML 2024, oral, arXiv:2405.04017
1 citation

ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL

Yifei Zhou, Andrea Zanette, Jiayi Pan et al.

ICML 2024, oral, arXiv:2402.19446
135 citations

Discerning Temporal Difference Learning

Jianfei Ma

AAAI 2024, paper, arXiv:2310.08091
1 citation

Non-Asymptotic Analysis for Single-Loop (Natural) Actor-Critic with Compatible Function Approximation

Yudan Wang, Yue Wang, Yi Zhou et al.

ICML 2024, oral, arXiv:2406.01762
10 citations

Pausing Policy Learning in Non-stationary Reinforcement Learning

Hyunin Lee, Ming Jin, Javad Lavaei et al.

ICML 2024, oral, arXiv:2405.16053
2 citations