"multi-armed bandits" Papers

21 papers found

Breaking the $\log(1/\Delta_2)$ Barrier: Better Batched Best Arm Identification with Adaptive Grids

Tianyuan Jin, Qin Zhang, Dongruo Zhou

ICLR 2025

Constrained Feedback Learning for Non-Stationary Multi-Armed Bandits

Shaoang Li, Jian Li

NEURIPS 2025 · arXiv:2509.15073

Efficient Top-m Data Values Identification for Data Selection

Xiaoqiang Lin, Xinyi Xu, See-Kiong Ng et al.

ICLR 2025

Evolution of Information in Interactive Decision Making: A Case Study for Multi-Armed Bandits

Yuzhou Gu, Yanjun Han, Jian Qian

NEURIPS 2025 (oral) · arXiv:2503.00273
1 citation

Heterogeneous Multi-Agent Bandits with Parsimonious Hints

Amirmahdi Mirfakhar, Xuchuang Wang, Jinhang Zuo et al.

AAAI 2025 · arXiv:2502.16128
3 citations

LASeR: Learning to Adaptively Select Reward Models with Multi-Arm Bandits

Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin et al.

NEURIPS 2025 · arXiv:2410.01735
6 citations

Learning Across the Gap: Hybrid Multi-armed Bandits with Heterogeneous Offline and Online Data

Qijia He, Minghan Wang, Xutong Liu et al.

NEURIPS 2025

On Speeding Up Language Model Evaluation

Jin Zhou, Christian Belardi, Ruihan Wu et al.

ICLR 2025 · arXiv:2407.06172
6 citations

Pairwise Elimination with Instance-Dependent Guarantees for Bandits with Cost Subsidy

Ishank Juneja, Carlee Joe-Wong, Osman Yagan

ICLR 2025 · arXiv:2501.10290

Pareto Optimal Risk-Agnostic Distributional Bandits with Heavy-Tail Rewards

Kyungjae Lee, Dohyeong Kim, Taehyun Cho et al.

NEURIPS 2025

Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems

Jongyeong Lee, Junya Honda, Shinji Ito et al.

NEURIPS 2025 · arXiv:2508.18604
2 citations

Satisficing Regret Minimization in Bandits

Qing Feng, Tianyi Ma, Ruihao Zhu

ICLR 2025 · arXiv:2406.06802
1 citation

Tightening Regret Lower and Upper Bounds in Restless Rising Bandits

Cristiano Migali, Marco Mussi, Gianmarco Genalti et al.

NEURIPS 2025

uniINF: Best-of-Both-Worlds Algorithm for Parameter-Free Heavy-Tailed MABs

Yu Chen, Jiatai Huang, Yan Dai et al.

ICLR 2025 · arXiv:2410.03284
6 citations

Causal Bandits: The Pareto Optimal Frontier of Adaptivity, a Reduction to Linear Bandits, and Limitations around Unknown Marginals

Ziyi Liu, Idan Attias, Daniel Roy

ICML 2024 · arXiv:2407.00950
2 citations

Communication-Efficient Collaborative Regret Minimization in Multi-Armed Bandits

Nikolai Karpov, Qin Zhang

AAAI 2024 · arXiv:2301.11442
2 citations

Factored-Reward Bandits with Intermediate Observations

Marco Mussi, Simone Drago, Marcello Restelli et al.

ICML 2024

Federated Combinatorial Multi-Agent Multi-Armed Bandits

Fares Fourati, Mohamed-Slim Alouini, Vaneet Aggarwal

ICML 2024 · arXiv:2405.05950
8 citations

Incentivized Learning in Principal-Agent Bandit Games

Antoine Scheid, Daniil Tiapkin, Etienne Boursier et al.

ICML 2024 · arXiv:2403.03811
17 citations

Leveraging (Biased) Information: Multi-armed Bandits with Offline Data

Wang Chi Cheung, Lixing Lyu

ICML 2024 (spotlight)

On Interpolating Experts and Multi-Armed Bandits

Houshuang Chen, Yuchen He, Chihao Zhang

ICML 2024 · arXiv:2307.07264
5 citations