"sequential decision-making" Papers
33 papers found
Active Reinforcement Learning Strategies for Offline Policy Improvement
Ambedkar Dukkipati, Ranga Shaarad Ayyagari, Bodhisattwa Dasgupta et al.
Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing
XianJun, Davin Choo, Yuqi Pan, Tonghan Wang et al.
Adaptive teachers for amortized samplers
Minsu Kim, Sanghyeok Choi, Taeyoung Yun et al.
Adaptive Variance Inflation in Thompson Sampling: Efficiency, Safety, Robustness, and Beyond
Feng Zhu, David Simchi-Levi
Blindfolded Experts Generalize Better: Insights from Robotic Manipulation and Videogames
Ev Zisselman, Mirco Mutti, Shelly Francis-Meretzki et al.
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback
Zaid Khan, Elias Stengel-Eskin, Jaemin Cho et al.
Data Mixture Optimization: A Multi-fidelity Multi-scale Bayesian Framework
Thomson Yen, Andrew Siah, Haozhe Chen et al.
DynaAct: Large Language Model Reasoning with Dynamic Action Spaces
Xueliang Zhao, Wei Wu, Jian Guan et al.
Efficient Reinforcement Learning with Large Language Model Priors
Xue Yan, Yan Song, Xidong Feng et al.
Emergent Risk Awareness in Rational Agents under Resource Constraints
Daniel Jarne Ornia, Nicholas Bishop, Joel Dyer et al.
GFlowVLM: Enhancing Multi-step Reasoning in Vision-Language Models with Generative Flow Networks
Haoqiang Kang, Enna Sachdeva, Piyush Gupta et al.
Learning from negative feedback, or positive feedback or both
Abbas Abdolmaleki, Bilal Piot, Bobak Shahriari et al.
Learning Robust Representations with Long-Term Information for Generalization in Visual Reinforcement Learning
Rui Yang, Jie Wang, Qijie Peng et al.
Mean-Field Sampling for Cooperative Multi-Agent Reinforcement Learning
Emile Anand, Ishani Karmarkar, Guannan Qu
No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes
Jasmine Bayrooti, Sattar Vakili, Amanda Prorok et al.
On Evaluating Policies for Robust POMDPs
Merlijn Krale, Eline M. Bovy, Maris F. L. Galesloot et al.
Pairwise Elimination with Instance-Dependent Guarantees for Bandits with Cost Subsidy
Ishank Juneja, Carlee Joe-Wong, Osman Yagan
Prediction with expert advice under additive noise
Alankrita Bhatt, Victoria Kostina
RF-Agent: Automated Reward Function Design via Language Agent Tree Search
Ning Gao, Xiuhui Zhang, Xingyu Jiang et al.
Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks
Vishnu Sarukkai, Zhiqiang Xie, Kayvon Fatahalian
Solving Robust Markov Decision Processes: Generic, Reliable, Efficient
Tobias Meggendorfer, Maximilian Weininger, Patrick Wienhöft
Adapting Static Fairness to Sequential Decision-Making: Bias Mitigation Strategies towards Equal Long-term Benefit Rate
Yuancheng Xu, Chenghao Deng, Yanchao Sun et al.
Agent-Specific Effects: A Causal Effect Propagation Analysis in Multi-Agent MDPs
Stelios Triantafyllou, Aleksa Sukovic, Debmalya Mandal et al.
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL
Yifei Zhou, Andrea Zanette, Jiayi Pan et al.
Best Arm Identification for Stochastic Rising Bandits
Marco Mussi, Alessandro Montenegro, Francesco Trovò et al.
Evidential Active Recognition: Intelligent and Prudent Open-World Embodied Perception
Lei Fan, Mingfu Liang, Yunxuan Li et al.
Imitation Learning from Purified Demonstrations
Yunke Wang, Minjing Dong, Yukun Zhao et al.
Limited Preference Aided Imitation Learning from Imperfect Demonstrations
Xingchen Cao, Fan-Ming Luo, Junyin Ye et al.
Offline Transition Modeling via Contrastive Energy Learning
Ruifeng Chen, Chengxing Jia, Zefang Huang et al.
Parameterized Projected Bellman Operator
Théo Vincent, Alberto Maria Metelli, Boris Belousov et al.
Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving
Ming Nie, Renyuan Peng, Chunwei Wang et al.
Reinforcement Learning Meets Visual Odometry
Nico Messikommer, Giovanni Cioffi, Mathias Gehrig et al.
Rethinking Transformers in Solving POMDPs
Chenhao Lu, Ruizhe Shi, Yuyao Liu et al.