"contextual bandits" Papers

33 papers found

A Bayesian Approach to Contextual Dynamic Pricing using the Proportional Hazards Model with Discrete Price Data

Dongguen Kim, Young-Geun Choi, Minwoo Chae

NEURIPS 2025

An Improved Algorithm for Adversarial Linear Contextual Bandits via Reduction

Tim van Erven, Jack Mayo, Julia Olkhovskaya et al.

NEURIPS 2025, arXiv:2508.11931

BayesCNS: A Unified Bayesian Approach to Address Cold Start and Non-Stationarity in Search Systems at Scale

Randy Ardywibowo, Rakesh Sunki, Shin Tsz Lucy Kuo et al.

AAAI 2025, arXiv:2410.02126

Contextual Linear Bandits with Delay as Payoff

Mengxiao Zhang, Yingfei Wang, Haipeng Luo

ICML 2025, arXiv:2502.12528, 3 citations

Contextual Thompson Sampling via Generation of Missing Data

Kelly W Zhang, Tianhui Cai, Hongseok Namkoong et al.

NEURIPS 2025, arXiv:2502.07064, 2 citations

Cross-Domain Off-Policy Evaluation and Learning for Contextual Bandits

Yuta Natsubori, Masataka Ushiku, Yuta Saito

ICLR 2025

Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown

Emile Anand, Sarah Liaw

NEURIPS 2025, arXiv:2507.15290, 3 citations

Geometry Meets Incentives: Sample-Efficient Incentivized Exploration with Linear Contexts

Ben Schiffer, Mark Sellke

NEURIPS 2025 (spotlight), arXiv:2506.01685

LLMs Are In-Context Bandit Reinforcement Learners

Giovanni Monea, Antoine Bosselut, Kianté Brantley et al.

COLM 2025, arXiv:2410.05362, 12 citations

MultiScale Contextual Bandits for Long Term Objectives

Richa Rastogi, Yuta Saito, Thorsten Joachims

NEURIPS 2025, arXiv:2503.17674

Neural Combinatorial Clustered Bandits for Recommendation Systems

Baran Atalar, Carlee Joe-Wong

AAAI 2025, arXiv:2410.14586, 3 citations

Second Order Bounds for Contextual Bandits with Function Approximation

Aldo Pacchiano

ICLR 2025, arXiv:2409.16197, 7 citations

Sharp Analysis for KL-Regularized Contextual Bandits and RLHF

Heyang Zhao, Chenlu Ye, Quanquan Gu et al.

NEURIPS 2025, arXiv:2411.04625, 16 citations

Statistical Parity with Exponential Weights

Stephen Pasteris, Chris Hicks, Vasilios Mavroudis

NEURIPS 2025

Transfer Faster, Price Smarter: Minimax Dynamic Pricing under Cross-Market Preference Shift

Yi Zhang, Elynn Chen, Yujun Yan

NEURIPS 2025 (spotlight), arXiv:2505.17203, 1 citation

Triple-Optimistic Learning for Stochastic Contextual Bandits with General Constraints

Hengquan Guo, Lingkai Zu, Xin Liu

ICML 2025

True Impact of Cascade Length in Contextual Cascading Bandits

Hyun-jun Choi, Joongkyu Lee, Min-hwan Oh

NEURIPS 2025

Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits

Xuheng Li, Quanquan Gu

NEURIPS 2025, arXiv:2511.02123, 1 citation

$\mathtt{VITS}$: Variational Inference Thompson Sampling for contextual bandits

Pierre Clavier, Tom Huix, Alain Oliviero Durmus

ICML 2024

A Contextual Combinatorial Bandit Approach to Negotiation

Yexin Li, Zhancun Mu, Siyuan Qi

ICML 2024, arXiv:2407.00567, 2 citations

Adaptively Learning to Select-Rank in Online Platforms

Jingyuan Wang, Perry Dong, Ying Jin et al.

ICML 2024, arXiv:2406.05017

Borda Regret Minimization for Generalized Linear Dueling Bandits

Yue Wu, Tao Jin, Qiwei Di et al.

ICML 2024, arXiv:2303.08816, 15 citations

Efficient Contextual Bandits with Uninformed Feedback Graphs

Mengxiao Zhang, Yuheng Zhang, Haipeng Luo et al.

ICML 2024, arXiv:2402.08127, 5 citations

Fast and Sample Efficient Multi-Task Representation Learning in Stochastic Contextual Bandits

Jiabin Lin, Shana Moothedath, Namrata Vaswani

ICML 2024, arXiv:2410.02068, 8 citations

Federated Contextual Cascading Bandits with Asynchronous Communication and Heterogeneous Users

Hantao Yang, Xutong Liu, Zhiyong Wang et al.

AAAI 2024, arXiv:2402.16312, 9 citations

High-dimensional Linear Bandits with Knapsacks

Wanteng Ma, Dong Xia, Jiashuo Jiang

ICML 2024, arXiv:2311.01327

Incentivized Learning in Principal-Agent Bandit Games

Antoine Scheid, Daniil Tiapkin, Etienne Boursier et al.

ICML 2024, arXiv:2403.03811, 17 citations

In-Context Reinforcement Learning for Variable Action Spaces

Viacheslav Sinii, Alexander Nikulin, Vladislav Kurenkov et al.

ICML 2024, arXiv:2312.13327, 25 citations

Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery

Yassir Jedra, William Réveillard, Stefan Stojanovic et al.

ICML 2024, arXiv:2402.15739, 2 citations

More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning

Kaiwen Wang, Owen Oertell, Alekh Agarwal et al.

ICML 2024, arXiv:2402.07198, 17 citations

Prospective Side Information for Latent MDPs

Jeongyeol Kwon, Yonathan Efroni, Shie Mannor et al.

ICML 2024 (spotlight), arXiv:2310.07596, 7 citations

Randomized Confidence Bounds for Stochastic Partial Monitoring

Maxime Heuillet, Ola Ahmad, Audrey Durand

ICML 2024, arXiv:2402.05002, 2 citations

The Non-linear $F$-Design and Applications to Interactive Learning

Alekh Agarwal, Jian Qian, Alexander Rakhlin et al.

ICML 2024