Poster "policy gradient methods" Papers

23 papers found

$\phi$-Update: A Class of Policy Update Methods with Policy Convergence Guarantee

Wenye Li, Jiacai Liu, Ke Wei

ICLR 2025
3 citations

A Policy-Gradient Approach to Solving Imperfect-Information Games with Best-Iterate Convergence

Mingyang Liu, Gabriele Farina, Asuman Ozdaglar

ICLR 2025, arXiv:2408.00751
3 citations

Ask a Strong LLM Judge when Your Reward Model is Uncertain

Zhenghao Xu, Qin Lu, Qingru Zhang et al.

NeurIPS 2025, arXiv:2510.20369
3 citations

Cross-Domain Off-Policy Evaluation and Learning for Contextual Bandits

Yuta Natsubori, Masataka Ushiku, Yuta Saito

ICLR 2025

Diffusion Policy Policy Optimization

Allen Ren, Justin Lidard, Lars Ankile et al.

ICLR 2025, arXiv:2409.00588
146 citations

Enhancing Diversity In Parallel Agents: A Maximum State Entropy Exploration Story

Vincenzo De Paola, Riccardo Zamboni, Mirco Mutti et al.

ICML 2025, arXiv:2505.01336
3 citations

Mitigating Information Loss in Tree-Based Reinforcement Learning via Direct Optimization

Sascha Marton, Tim Grams, Florian Vogt et al.

ICLR 2025, arXiv:2408.08761
4 citations

On the Convergence of Projected Policy Gradient for Any Constant Step Sizes

Jiacai Liu, Wenye Li, Dachao Lin et al.

NeurIPS 2025, arXiv:2311.01104
4 citations

Policy Gradient with Kernel Quadrature

Tetsuro Morimura, Satoshi Hayakawa

ICLR 2025, arXiv:2310.14768
1 citation

REINFORCE Converges to Optimal Policies with Any Learning Rate

Samuel Robertson, Thang Chu, Bo Dai et al.

NeurIPS 2025

Self-Improvement for Neural Combinatorial Optimization: Sample Without Replacement, but Improvement

Dominik Grimm, Jonathan Pirnay

ICLR 2025, arXiv:2403.15180
28 citations

Accelerated Policy Gradient for s-rectangular Robust MDPs with Large State Spaces

Ziyi Chen, Heng Huang

ICML 2024

Accelerated Policy Gradient: On the Convergence Rates of the Nesterov Momentum for Reinforcement Learning

Yen-Ju Chen, Nai-Chieh Huang, Ching-pei Lee et al.

ICML 2024, arXiv:2310.11897
5 citations

Do Transformer World Models Give Better Policy Gradients?

Michel Ma, Tianwei Ni, Clement Gehring et al.

ICML 2024, arXiv:2402.05290
7 citations

GFlowNet Training by Policy Gradients

Puhua Niu, Shili Wu, Mingzhou Fan et al.

ICML 2024, arXiv:2408.05885
3 citations

How to Explore with Belief: State Entropy Maximization in POMDPs

Riccardo Zamboni, Duilio Cirino, Marcello Restelli et al.

ICML 2024, arXiv:2406.02295
6 citations

Major-Minor Mean Field Multi-Agent Reinforcement Learning

Kai Cui, Christian Fabian, Anam Tahir et al.

ICML 2024, arXiv:2303.10665
5 citations

Mollification Effects of Policy Gradient Methods

Tao Wang, Sylvia Herbert, Sicun Gao

ICML 2024, arXiv:2405.17832
1 citation

Optimistic Multi-Agent Policy Gradient

Wenshuai Zhao, Yi Zhao, Zhiyuan Li et al.

ICML 2024, arXiv:2311.01953
5 citations

Risk-Sensitive Policy Optimization via Predictive CVaR Policy Gradient

Ju-Hyun Kim, Seungki Min

ICML 2024

SAPG: Split and Aggregate Policy Gradients

Jayesh Singla, Ananye Agarwal, Deepak Pathak

ICML 2024, arXiv:2407.20230
13 citations

Stabilizing Policy Gradients for Stochastic Differential Equations via Consistency with Perturbation Process

Xiangxin Zhou, Liang Wang, Yichi Zhou

ICML 2024, arXiv:2403.04154
8 citations

Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles

Bhrij Patel, Wesley A. Suttle, Alec Koppel et al.

ICML 2024, arXiv:2403.11925
4 citations