"episodic markov decision processes" Papers
2 papers found
Conference
Adapting to Stochastic and Adversarial Losses in Episodic MDPs with Aggregate Bandit Feedback
Shinji Ito, Kevin Jamieson, Haipeng Luo et al.
NEURIPS 2025, arXiv:2510.17103
2 citations
Sample Efficient Reinforcement Learning with Partial Dynamics Knowledge
Meshal Alharbi, Mardavij Roozbehani, Munther Dahleh
AAAI 2024, arXiv:2312.12558
4 citations