Distributional Bellman Operators over Mean Embeddings

4 citations

arXiv:2312.07358

#1543 of 2635 papers in ICML 2024

Top Authors

Li Kevin Wenliang, Gregoire Deletang, Matthew Aitchison, Marcus Hutter, Anian Ruoss, Arthur Gretton, Mark Rowland

Topics

distributional reinforcement learning, mean embeddings, Bellman operators, sketch Bellman operator, temporal-difference algorithms, dynamic programming, deep reinforcement learning

Abstract

We propose a novel algorithmic framework for distributional reinforcement learning, based on learning finite-dimensional mean embeddings of return distributions. The framework yields a wide variety of new algorithms for dynamic programming and temporal-difference learning. These algorithms rely on the sketch Bellman operator, which updates mean embeddings with simple linear-algebraic computations. We provide asymptotic convergence theory and examine the empirical performance of the algorithms on a suite of tabular tasks. Further, we show that this approach can be straightforwardly combined with deep reinforcement learning.
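
To make the idea of updating mean embeddings with linear algebra concrete, the following is a minimal NumPy sketch. It is not the authors' implementation. It assumes a small tabular Markov reward process with a deterministic reward per state, and it uses a polynomial feature map (constant, first and second moment), chosen here only because the linear relation between features before and after a reward shift is exact for polynomials. The names `phi`, `bellman_coefficients`, `sketch_dp`, and `grid` are hypothetical and do not come from the source.

```python
import numpy as np


def phi(g):
    """Hypothetical feature map: (1, g, g^2) for each return value g."""
    g = np.asarray(g, dtype=float)
    return np.stack([np.ones_like(g), g, g**2], axis=-1)


def bellman_coefficients(r, gamma, grid):
    """Least-squares matrix B_r with phi(r + gamma * g) ~= B_r @ phi(g) on the grid."""
    # lstsq solves phi(grid) @ X ~= phi(r + gamma * grid); B_r is the transpose of X.
    X, *_ = np.linalg.lstsq(phi(grid), phi(r + gamma * grid), rcond=None)
    return X.T


def sketch_dp(P, R, gamma, grid, n_iters=200):
    """Iterate U(x) <- B_{R[x]} @ sum_x' P[x, x'] U(x') for all states x.

    P: (n, n) transition matrix, R: (n,) deterministic reward per state
    (an assumption made for this illustration).
    """
    n = P.shape[0]
    B = np.stack([bellman_coefficients(R[x], gamma, grid) for x in range(n)])
    U = np.tile(phi(0.0), (n, 1))  # start from the embedding of a zero return
    for _ in range(n_iters):
        # P @ U averages next-state embeddings; B[x] maps them through the reward shift.
        U = np.einsum("xij,xj->xi", B, P @ U)
    return U


if __name__ == "__main__":
    # Two-state example: state 0 pays reward 1 and either stays or moves to
    # the absorbing state 1, which pays reward 0.
    P = np.array([[0.5, 0.5], [0.0, 1.0]])
    R = np.array([1.0, 0.0])
    U = sketch_dp(P, R, gamma=0.5, grid=np.linspace(0.0, 2.0, 50))
    print(U)  # each row holds the estimated (1, E[G], E[G^2]) for one state
```

With these polynomial features each row of `U` is the vector of the first moments of the return, so for this example the row for state 0 should approach (1, 4/3, 40/21). For other feature maps the least-squares fit is generally only approximate, and the quality of the embedding then depends on the features and on the grid.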

Citation History

[Chart: citation count from Jan 28, 2026 to Feb 13, 2026; 4 citations as of Feb 13, 2026]