Poster "regret bound analysis" Papers
6 papers found
Efficient and Near-Optimal Algorithm for Contextual Dueling Bandits with Offline Regression Oracles
Aadirupa Saha, Robert Schapire
NEURIPS 2025
Lasso Bandit with Compatibility Condition on Optimal Arm
Harin Lee, Taehyun Hwang, Min-hwan Oh
ICLR 2025, arXiv:2406.00823
4 citations
Learning Personalized Ad Impact via Contextual Reinforcement Learning under Delayed Rewards
Yuwei Cheng, Zifeng Zhao, Haifeng Xu
NEURIPS 2025, arXiv:2510.20055
Parameter-free Algorithms for the Stochastically Extended Adversarial Model
Shuche Wang, Adarsh Barik, Peng Zhao et al.
NEURIPS 2025, arXiv:2510.04685
Spectral Learning for Infinite-Horizon Average-Reward POMDPs
Alessio Russo, Alberto Maria Metelli, Marcello Restelli
NEURIPS 2025
Provably Efficient Reinforcement Learning for Adversarial Restless Multi-Armed Bandits with Unknown Transitions and Bandit Feedback
Guojun Xiong, Jian Li
ICML 2024, arXiv:2405.00950
1 citation