"regret bound analysis" Papers
9 papers found
Conference
Efficient and Near-Optimal Algorithm for Contextual Dueling Bandits with Offline Regression Oracles
Aadirupa Saha, Robert Schapire
NEURIPS 2025
Efficient Reinforcement Learning in Probabilistic Reward Machines
Xiaofeng Lin, Xuezhou Zhang
AAAI 2025, arXiv:2408.10381
2 citations
Lasso Bandit with Compatibility Condition on Optimal Arm
Harin Lee, Taehyun Hwang, Min-hwan Oh
ICLR 2025, arXiv:2406.00823
4 citations
Learning Personalized Ad Impact via Contextual Reinforcement Learning under Delayed Rewards
Yuwei Cheng, Zifeng Zhao, Haifeng Xu
NEURIPS 2025, arXiv:2510.20055
Parameter-free Algorithms for the Stochastically Extended Adversarial Model
Shuche Wang, Adarsh Barik, Peng Zhao et al.
NEURIPS 2025, arXiv:2510.04685
Spectral Learning for Infinite-Horizon Average-Reward POMDPs
Alessio Russo, Alberto Maria Metelli, Marcello Restelli
NEURIPS 2025
Combinatorial Stochastic-Greedy Bandit
Fares Fourati, Christopher John Quinn, Mohamed-Slim Alouini et al.
AAAI 2024, arXiv:2312.08057
10 citations
Monte Carlo Tree Search in the Presence of Transition Uncertainty
Farnaz Kohankhaki, Kiarash Aghakasiri, Hongming Zhang et al.
AAAI 2024, arXiv:2312.11348
3 citations
Provably Efficient Reinforcement Learning for Adversarial Restless Multi-Armed Bandits with Unknown Transitions and Bandit Feedback
Guojun Xiong, Jian Li
ICML 2024, arXiv:2405.00950
1 citation