Temporal Distance-aware Transition Augmentation for Offline Model-based Reinforcement Learning

3 citations · #1367 of 3340 papers in ICML 2025

Abstract

The goal of offline reinforcement learning (RL) is to extract the best possible policy from a previously collected dataset while accounting for the out-of-distribution (OOD) sample issue. Offline model-based RL (MBRL) is a compelling solution that can alleviate this issue through state-action transition augmentation with a learned dynamics model. Unfortunately, offline MBRL methods have long been observed to fail in sparse-reward, long-horizon environments. In this work, we propose a novel MBRL method, dubbed Temporal Distance-Aware Transition Augmentation (TempDATA), which generates additional transitions in a geometrically structured representation space rather than in the raw state space. To capture long-horizon behaviors efficiently, our main idea is to learn a state abstraction that captures temporal distance at both the trajectory and transition levels of the state space. Our experiments empirically confirm that TempDATA outperforms previous offline MBRL methods and matches or surpasses the performance of diffusion-based trajectory augmentation and goal-conditioned RL on D4RL AntMaze, FrankaKitchen, CALVIN, and pixel-based FrankaKitchen.
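For intuition, below is a minimal sketch of one way a temporal-distance-aware state encoder could be trained from offline trajectories, assuming PyTorch and a simple regression objective that pushes latent Euclidean distance toward the number of steps separating two states on the same trajectory. The module names, network sizes, and loss are illustrative assumptions and are not taken from the paper.

```python
import torch
import torch.nn as nn

class TemporalDistanceEncoder(nn.Module):
    """Hypothetical encoder mapping raw states to a latent space in which
    Euclidean distance approximates temporal distance (steps between states)."""
    def __init__(self, state_dim, latent_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, s):
        return self.net(s)

def temporal_distance_loss(encoder, s_i, s_j, steps_between):
    # Regress latent distance onto the trajectory-level temporal gap |j - i|.
    z_i, z_j = encoder(s_i), encoder(s_j)
    latent_dist = torch.norm(z_i - z_j, dim=-1)
    return ((latent_dist - steps_between) ** 2).mean()

# Usage sketch: sample state pairs (s_i, s_j) from the same offline trajectory,
# set steps_between = |j - i|, and minimize the loss with Adam.
encoder = TemporalDistanceEncoder(state_dim=17)
opt = torch.optim.Adam(encoder.parameters(), lr=3e-4)
s_i, s_j = torch.randn(64, 17), torch.randn(64, 17)          # placeholder batch
steps_between = torch.randint(1, 100, (64,)).float()          # placeholder gaps
loss = temporal_distance_loss(encoder, s_i, s_j, steps_between)
opt.zero_grad(); loss.backward(); opt.step()
```

A learned dynamics model could then roll out and augment transitions in this latent space rather than in the raw state space, which is the high-level idea the abstract describes.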

Citation History

3 citations reported from Jan 27, 2026 through Feb 13, 2026.