"markov decision process" Papers
14 papers found
AutoEdit: Automatic Hyperparameter Tuning for Image Editing
Chau Pham, Quan Dao, Mahesh Bhosale et al.
NeurIPS 2025, arXiv:2509.15031, 1 citation
FFCG: Effective and Fast Family Column Generation for Solving Large-Scale Linear Program
Yi-Xiang Hu, Feng Wu, Shaoang Li et al.
AAAI 2025 (paper), arXiv:2412.19066
Graph-Supported Dynamic Algorithm Configuration for Multi-Objective Combinatorial Optimization
Robbert Reijnen, Yaoxin Wu, Zaharah Bukhsh et al.
ICML 2025, arXiv:2505.16471, 1 citation
Multi-step Visual Reasoning with Visual Tokens Scaling and Verification
Tianyi Bai, Zengjie Hu, Fupeng Sun et al.
NeurIPS 2025, arXiv:2506.07235, 14 citations
Multivariate Dynamic Mediation Analysis under a Reinforcement Learning Framework
Lan Luo, Chengchun Shi, Jitao Wang et al.
NeurIPS 2025, arXiv:2310.16203, 2 citations
On Shallow Planning Under Partial Observability
Randy Lefebvre, Audrey Durand
AAAI 2025 (paper), arXiv:2407.15820, 2 citations
Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization
Timofei Gritsaev, Nikita Morozov, Sergey Samsonov et al.
ICLR 2025, arXiv:2410.15474, 5 citations
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
Zanlin Ni, Yulin Wang, Renping Zhou et al.
ECCV 2024, arXiv:2409.00342, 16 citations
Adapting Static Fairness to Sequential Decision-Making: Bias Mitigation Strategies towards Equal Long-term Benefit Rate
Yuancheng Xu, Chenghao Deng, Yanchao Sun et al.
ICML 2024 (oral), arXiv:2309.03426, 7 citations
Reinforcement Learning and Regret Bounds for Admission Control
Lucas Weber, Ana Busic, Jiamin Zhu
ICML 2024, arXiv:2406.04766, 1 citation
Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences
Andi Nika, Debmalya Mandal, Parameswaran Kamalaruban et al.
ICML 2024, arXiv:2403.01857, 20 citations
RL-CFR: Improving Action Abstraction for Imperfect Information Extensive-Form Games with Reinforcement Learning
Boning Li, Zhixuan Fang, Longbo Huang
ICML 2024, arXiv:2403.04344, 5 citations
Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space
Minji Lee, Luiz Felipe Vecchietti, Hyunkyu Jung et al.
ICML 2024 (spotlight), arXiv:2405.18986, 17 citations
Towards Optimal Adversarial Robust Q-learning with Bellman Infinity-error
Haoran Li, Zicheng Zhang, Wang Luo et al.
ICML 2024, arXiv:2402.02165, 3 citations