"policy gradient estimation" Papers
2 papers found
Differentiable Information Enhanced Model-Based Reinforcement Learning
Xiaoyuan Zhang, Xinyan Cai, Bo Liu et al.
AAAI 2025 (paper), arXiv:2503.01178
3 citations
MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions
Yekun Chai, Haoran Sun, Huang Fang et al.
ICLR 2025 (oral), arXiv:2410.02743
9 citations