"sequential decision-making" Papers

33 papers found

Active Reinforcement Learning Strategies for Offline Policy Improvement

Ambedkar Dukkipati, Ranga Shaarad Ayyagari, Bodhisattwa Dasgupta et al.

AAAI 2025, arXiv:2412.13106
3 citations

Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing

XianJun Davin Choo, Yuqi Pan, Tonghan Wang et al.

NEURIPS 2025, arXiv:2505.21671
3 citations

Adaptive teachers for amortized samplers

Minsu Kim, Sanghyeok Choi, Taeyoung Yun et al.

ICLR 2025, arXiv:2410.01432
20 citations

Adaptive Variance Inflation in Thompson Sampling: Efficiency, Safety, Robustness, and Beyond

Feng Zhu, David Simchi-Levi

NEURIPS 2025

Blindfolded Experts Generalize Better: Insights from Robotic Manipulation and Videogames

Ev Zisselman, Mirco Mutti, Shelly Francis-Meretzki et al.

NEURIPS 2025 (oral), arXiv:2510.24194

DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback

Zaid Khan, Elias Stengel-Eskin, Jaemin Cho et al.

ICLR 2025, arXiv:2410.06215
9 citations

Data Mixture Optimization: A Multi-fidelity Multi-scale Bayesian Framework

Thomson Yen, Andrew Siah, Haozhe Chen et al.

NEURIPS 2025, arXiv:2503.21023
2 citations

DynaAct: Large Language Model Reasoning with Dynamic Action Spaces

Xueliang Zhao, Wei Wu, Jian Guan et al.

NEURIPS 2025, arXiv:2511.08043

Efficient Reinforcement Learning with Large Language Model Priors

Xue Yan, Yan Song, Xidong Feng et al.

ICLR 2025, arXiv:2410.07927
21 citations

Emergent Risk Awareness in Rational Agents under Resource Constraints

Daniel Jarne Ornia, Nicholas Bishop, Joel Dyer et al.

NEURIPS 2025, arXiv:2505.23436
2 citations

GFlowVLM: Enhancing Multi-step Reasoning in Vision-Language Models with Generative Flow Networks

Haoqiang Kang, Enna Sachdeva, Piyush Gupta et al.

CVPR 2025, arXiv:2503.06514
8 citations

Learning from negative feedback, or positive feedback or both

Abbas Abdolmaleki, Bilal Piot, Bobak Shahriari et al.

ICLR 2025, arXiv:2410.04166
8 citations

Learning Robust Representations with Long-Term Information for Generalization in Visual Reinforcement Learning

Rui Yang, Jie Wang, Qijie Peng et al.

ICLR 2025
1 citation

Mean-Field Sampling for Cooperative Multi-Agent Reinforcement Learning

Emile Anand, Ishani Karmarkar, Guannan Qu

NEURIPS 2025 (spotlight), arXiv:2412.00661
5 citations

No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes

Jasmine Bayrooti, Sattar Vakili, Amanda Prorok et al.

NEURIPS 2025 (oral), arXiv:2510.20725

On Evaluating Policies for Robust POMDPs

Merlijn Krale, Eline M. Bovy, Maris F. L. Galesloot et al.

NEURIPS 2025

Pairwise Elimination with Instance-Dependent Guarantees for Bandits with Cost Subsidy

Ishank Juneja, Carlee Joe-Wong, Osman Yagan

ICLR 2025, arXiv:2501.10290

Prediction with expert advice under additive noise

Alankrita Bhatt, Victoria Kostina

NEURIPS 2025

RF-Agent: Automated Reward Function Design via Language Agent Tree Search

Ning Gao, Xiuhui Zhang, Xingyu Jiang et al.

NEURIPS 2025 (spotlight)

Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks

Vishnu Sarukkai, Zhiqiang Xie, Kayvon Fatahalian

NEURIPS 2025, arXiv:2505.00234
6 citations

Solving Robust Markov Decision Processes: Generic, Reliable, Efficient

Tobias Meggendorfer, Maximilian Weininger, Patrick Wienhöft

AAAI 2025, arXiv:2412.10185
7 citations

Adapting Static Fairness to Sequential Decision-Making: Bias Mitigation Strategies towards Equal Long-term Benefit Rate

Yuancheng Xu, Chenghao Deng, Yanchao Sun et al.

ICML 2024 (oral), arXiv:2309.03426
7 citations

Agent-Specific Effects: A Causal Effect Propagation Analysis in Multi-Agent MDPs

Stelios Triantafyllou, Aleksa Sukovic, Debmalya Mandal et al.

ICML 2024, arXiv:2310.11334

ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL

Yifei Zhou, Andrea Zanette, Jiayi Pan et al.

ICML 2024 (oral), arXiv:2402.19446
135 citations

Best Arm Identification for Stochastic Rising Bandits

Marco Mussi, Alessandro Montenegro, Francesco Trovò et al.

ICML 2024 (spotlight), arXiv:2302.07510
4 citations

Evidential Active Recognition: Intelligent and Prudent Open-World Embodied Perception

Lei Fan, Mingfu Liang, Yunxuan Li et al.

CVPR 2024, arXiv:2311.13793
16 citations

Imitation Learning from Purified Demonstrations

Yunke Wang, Minjing Dong, Yukun Zhao et al.

ICML 2024, arXiv:2310.07143
1 citation

Limited Preference Aided Imitation Learning from Imperfect Demonstrations

Xingchen Cao, Fan-Ming Luo, Junyin Ye et al.

ICML 2024

Offline Transition Modeling via Contrastive Energy Learning

Ruifeng Chen, Chengxing Jia, Zefang Huang et al.

ICML 2024

Parameterized Projected Bellman Operator

Théo Vincent, Alberto Maria Metelli, Boris Belousov et al.

AAAI 2024, arXiv:2312.12869
4 citations

Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving

Ming Nie, Renyuan Peng, Chunwei Wang et al.

ECCV 2024, arXiv:2312.03661
115 citations

Reinforcement Learning Meets Visual Odometry

Nico Messikommer, Giovanni Cioffi, Mathias Gehrig et al.

ECCV 2024, arXiv:2407.15626
14 citations

Rethinking Transformers in Solving POMDPs

Chenhao Lu, Ruizhe Shi, Yuyao Liu et al.

ICML 2024, arXiv:2405.17358
9 citations