"contextual bandits" Papers
33 papers found
A Bayesian Approach to Contextual Dynamic Pricing using the Proportional Hazards Model with Discrete Price Data
Dongguen Kim, Young-Geun Choi, Minwoo Chae
An Improved Algorithm for Adversarial Linear Contextual Bandits via Reduction
Tim van Erven, Jack Mayo, Julia Olkhovskaya et al.
BayesCNS: A Unified Bayesian Approach to Address Cold Start and Non-Stationarity in Search Systems at Scale
Randy Ardywibowo, Rakesh Sunki, Shin Tsz Lucy Kuo et al.
Contextual Linear Bandits with Delay as Payoff
Mengxiao Zhang, Yingfei Wang, Haipeng Luo
Contextual Thompson Sampling via Generation of Missing Data
Kelly W Zhang, Tianhui Cai, Hongseok Namkoong et al.
Cross-Domain Off-Policy Evaluation and Learning for Contextual Bandits
Yuta Natsubori, Masataka Ushiku, Yuta Saito
Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown
Emile Anand, Sarah Liaw
Geometry Meets Incentives: Sample-Efficient Incentivized Exploration with Linear Contexts
Ben Schiffer, Mark Sellke
LLMs Are In-Context Bandit Reinforcement Learners
Giovanni Monea, Antoine Bosselut, Kianté Brantley et al.
MultiScale Contextual Bandits for Long Term Objectives
Richa Rastogi, Yuta Saito, Thorsten Joachims
Neural Combinatorial Clustered Bandits for Recommendation Systems
Baran Atalar, Carlee Joe-Wong
Second Order Bounds for Contextual Bandits with Function Approximation
Aldo Pacchiano
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao, Chenlu Ye, Quanquan Gu et al.
Statistical Parity with Exponential Weights
Stephen Pasteris, Chris Hicks, Vasilios Mavroudis
Transfer Faster, Price Smarter: Minimax Dynamic Pricing under Cross-Market Preference Shift
Yi Zhang, Elynn Chen, Yujun Yan
Triple-Optimistic Learning for Stochastic Contextual Bandits with General Constraints
Hengquan Guo, Lingkai Zu, Xin Liu
True Impact of Cascade Length in Contextual Cascading Bandits
Hyun-jun Choi, Joongkyu Lee, Min-hwan Oh
Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits
Xuheng Li, Quanquan Gu
$\mathtt{VITS}$: Variational Inference Thompson Sampling for contextual bandits
Pierre Clavier, Tom Huix, Alain Oliviero Durmus
A Contextual Combinatorial Bandit Approach to Negotiation
Yexin Li, Zhancun Mu, Siyuan Qi
Adaptively Learning to Select-Rank in Online Platforms
Jingyuan Wang, Perry Dong, Ying Jin et al.
Borda Regret Minimization for Generalized Linear Dueling Bandits
Yue Wu, Tao Jin, Qiwei Di et al.
Efficient Contextual Bandits with Uninformed Feedback Graphs
Mengxiao Zhang, Yuheng Zhang, Haipeng Luo et al.
Fast and Sample Efficient Multi-Task Representation Learning in Stochastic Contextual Bandits
Jiabin Lin, Shana Moothedath, Namrata Vaswani
Federated Contextual Cascading Bandits with Asynchronous Communication and Heterogeneous Users
Hantao Yang, Xutong Liu, Zhiyong Wang et al.
High-dimensional Linear Bandits with Knapsacks
Wanteng Ma, Dong Xia, Jiashuo Jiang
Incentivized Learning in Principal-Agent Bandit Games
Antoine Scheid, Daniil Tiapkin, Etienne Boursier et al.
In-Context Reinforcement Learning for Variable Action Spaces
Viacheslav Sinii, Alexander Nikulin, Vladislav Kurenkov et al.
Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery
Yassir Jedra, William Réveillard, Stefan Stojanovic et al.
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
Kaiwen Wang, Owen Oertell, Alekh Agarwal et al.
Prospective Side Information for Latent MDPs
Jeongyeol Kwon, Yonathan Efroni, Shie Mannor et al.
Randomized Confidence Bounds for Stochastic Partial Monitoring
Maxime Heuillet, Ola Ahmad, Audrey Durand
The Non-linear $F$-Design and Applications to Interactive Learning
Alekh Agarwal, Jian Qian, Alexander Rakhlin et al.