A Distributional Analogue to the Successor Representation

10 citations

arXiv:2402.08530

#1012 of 2635 papers in ICML 2024

Top Authors

Harley Wiltzer, Jesse Farebrother, Arthur Gretton, Yunhao Tang, Andre Barreto, Will Dabney, Marc Bellemare, Mark Rowland

Topics

distributional reinforcement learning, successor representation, distributional successor measure, zero-shot policy evaluation, risk-sensitive evaluation, maximum mean discrepancy, generative state models, transition structure separation

Abstract

This paper contributes a new approach for distributional reinforcement learning which elucidates a clean separation of transition structure and reward in the learning process. Analogous to how the successor representation (SR) describes the expected consequences of behaving according to a given policy, our distributional successor measure (SM) describes the distributional consequences of this behaviour. We formulate the distributional SM as a distribution over distributions and provide theory connecting it with distributional and model-based reinforcement learning. Moreover, we propose an algorithm that learns the distributional SM from data by minimizing a two-level maximum mean discrepancy. Key to our method are a number of algorithmic techniques that are independently valuable for learning generative models of state. As an illustration of the usefulness of the distributional SM, we show that it enables zero-shot risk-sensitive policy evaluation in a way that was not previously possible.
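As background for the analogy drawn in the abstract, the expected-value case can be sketched in the tabular setting. The snippet below is a minimal illustration of the classical successor representation, not the paper's distributional algorithm. The 3-state chain, the discount factor, the reward vectors, and the function names `successor_representation` and `evaluate` are all assumptions made for the example. It shows the separation of transition structure and reward: the matrix `M` is computed once from the transitions and then reused to evaluate any reward function.

```python
# Tabular successor representation (SR) for a fixed policy, in pure Python.
# Assumption: a hypothetical 3-state Markov chain; P[i][j] is the probability
# of moving from state i to state j under the policy being evaluated.

def successor_representation(P, gamma, iters=2000):
    """Iterate M <- I + gamma * P @ M, whose fixed point is (I - gamma * P)^-1."""
    n = len(P)
    M = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        M = [
            [(1.0 if i == j else 0.0)
             + gamma * sum(P[i][k] * M[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)
        ]
    return M

def evaluate(M, reward):
    """Expected discounted return from each start state: V = M @ r."""
    return [sum(m * r for m, r in zip(row, reward)) for row in M]

P = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],  # state 2 is absorbing
]
gamma = 0.9
M = successor_representation(P, gamma)

# The same M evaluates different reward functions with no further learning.
v_goal = evaluate(M, [0.0, 0.0, 1.0])  # reward for occupying state 2
v_start = evaluate(M, [1.0, 0.0, 0.0])  # reward for occupying state 0
```

Because `M` stores only expected discounted visitation, `evaluate` returns mean returns and nothing about their spread. A risk-sensitive criterion needs the full return distribution, which is the gap the distributional SM described in the abstract is meant to fill.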
