"multi-armed bandit" Papers
6 papers found
Conference
Beyond Greedy Exits: Improved Early Exit Decisions for Risk Control and Reliability
Divya Jyoti Bajpai, Manjesh Kumar Hanawal
NEURIPS 2025, arXiv:2509.23666
1 citation
Delay as Payoff in MAB
Ofir Schlisselberg, Ido Cohen, Tal Lancewicki et al.
AAAI 2025, arXiv:2408.15158
4 citations
Precise Asymptotics and Refined Regret of Variance-Aware UCB
Yingying Fan, Yuxuan Han, Jinchi Lv et al.
NEURIPS 2025 (spotlight), arXiv:2412.08843
1 citation
ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning
Mingqi Yuan, Bo Li, Xin Jin et al.
ICCV 2025, arXiv:2503.06101
1 citation
Finite-Time Frequentist Regret Bounds of Multi-Agent Thompson Sampling on Sparse Hypergraphs
Tianyuan Jin, Hao-Lun Hsu, William Chang et al.
AAAI 2024, arXiv:2312.15549
3 citations
On Multi-Armed Bandit with Impatient Arms
Yuming Shao, Zhixuan Fang
ICML 2024