"markov decision process" Papers

14 papers found

AutoEdit: Automatic Hyperparameter Tuning for Image Editing

Chau Pham, Quan Dao, Mahesh Bhosale et al.

NeurIPS 2025, arXiv:2509.15031, 1 citation

FFCG: Effective and Fast Family Column Generation for Solving Large-Scale Linear Program

Yi-Xiang Hu, Feng Wu, Shaoang Li et al.

AAAI 2025 (paper), arXiv:2412.19066

Graph-Supported Dynamic Algorithm Configuration for Multi-Objective Combinatorial Optimization

Robbert Reijnen, Yaoxin Wu, Zaharah Bukhsh et al.

ICML 2025, arXiv:2505.16471, 1 citation

Multi-step Visual Reasoning with Visual Tokens Scaling and Verification

Tianyi Bai, Zengjie Hu, Fupeng Sun et al.

NeurIPS 2025, arXiv:2506.07235, 14 citations

Multivariate Dynamic Mediation Analysis under a Reinforcement Learning Framework

Lan Luo, Chengchun Shi, Jitao Wang et al.

NeurIPS 2025, arXiv:2310.16203, 2 citations

On Shallow Planning Under Partial Observability

Randy Lefebvre, Audrey Durand

AAAI 2025 (paper), arXiv:2407.15820, 2 citations

Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization

Timofei Gritsaev, Nikita Morozov, Sergey Samsonov et al.

ICLR 2025, arXiv:2410.15474, 5 citations

AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation

Zanlin Ni, Yulin Wang, Renping Zhou et al.

ECCV 2024, arXiv:2409.00342, 16 citations

Adapting Static Fairness to Sequential Decision-Making: Bias Mitigation Strategies towards Equal Long-term Benefit Rate

Yuancheng Xu, Chenghao Deng, Yanchao Sun et al.

ICML 2024 (oral), arXiv:2309.03426, 7 citations

Reinforcement Learning and Regret Bounds for Admission Control

Lucas Weber, Ana Busic, Jiamin Zhu

ICML 2024, arXiv:2406.04766, 1 citation

Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences

Andi Nika, Debmalya Mandal, Parameswaran Kamalaruban et al.

ICML 2024, arXiv:2403.01857, 20 citations

RL-CFR: Improving Action Abstraction for Imperfect Information Extensive-Form Games with Reinforcement Learning

Boning Li, Zhixuan Fang, Longbo Huang

ICML 2024, arXiv:2403.04344, 5 citations

Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space

Minji Lee, Luiz Felipe Vecchietti, Hyunkyu Jung et al.

ICML 2024 (spotlight), arXiv:2405.18986, 17 citations

Towards Optimal Adversarial Robust Q-learning with Bellman Infinity-error

Haoran Li, Zicheng Zhang, Wang Luo et al.

ICML 2024, arXiv:2402.02165, 3 citations