"policy gradient theorem" Papers
3 papers found
Conference
Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach
Swetha Ganesh, Vaneet Aggarwal
NEURIPS 2025arXiv:2505.19986
3
citations
Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics
Sebastian Sanokowski, Wilhelm Berghammer, Haoyu Wang et al.
ICLR 2025arXiv:2502.08696
17
citations
Adaptive Horizon Actor-Critic for Policy Learning in Contact-Rich Differentiable Simulation
Ignat Georgiev, Krishnan Srinivasan, Jie Xu et al.
ICML 2024arXiv:2405.17784
24
citations