"contextual bandits" Papers

33 papers found

Conference

AAAI 2025 (3,028)COLM 2025 (418)CVPR 2025 (2,873)ICCV 2025 (2,701)ICLR 2025 (3,827)ICML 2025 (3,340)ISMAR 2025 (229)NEURIPS 2025 (5,858)AAAI 2024 (2,289)CVPR 2024 (2,716)ECCV 2024 (2,387)ICLR 2024 (2,297)ICML 2024 (2,635)

Paper Type

poster (24,624)paper (8,558)oral (1,594)spotlight (1,421)highlight (975)

A Bayesian Approach to Contextual Dynamic Pricing using the Proportional Hazards Model with Discrete Price Data

Dongguen Kim, Young-Geun Choi, Minwoo Chae

NEURIPS 2025

An Improved Algorithm for Adversarial Linear Contextual Bandits via Reduction

Tim van Erven, Jack Mayo, Julia Olkhovskaya et al.

NEURIPS 2025arXiv:2508.11931

BayesCNS: A Unified Bayesian Approach to Address Cold Start and Non-Stationarity in Search Systems at Scale

Randy Ardywibowo, Rakesh Sunki, Shin Tsz Lucy Kuo et al.

AAAI 2025paperarXiv:2410.02126

Contextual Linear Bandits with Delay as Payoff

Mengxiao Zhang, Yingfei Wang, Haipeng Luo

ICML 2025arXiv:2502.12528

citations

Contextual Thompson Sampling via Generation of Missing Data

Kelly W Zhang, Tianhui Cai, Hongseok Namkoong et al.

NEURIPS 2025arXiv:2502.07064

citations

Cross-Domain Off-Policy Evaluation and Learning for Contextual Bandits

Yuta Natsubori, Masataka Ushiku, Yuta Saito

ICLR 2025

Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown

Emile Anand, Sarah Liaw

NEURIPS 2025arXiv:2507.15290

citations

Geometry Meets Incentives: Sample-Efficient Incentivized Exploration with Linear Contexts

Ben Schiffer, Mark Sellke

NEURIPS 2025spotlightarXiv:2506.01685

LLMs Are In-Context Bandit Reinforcement Learners

Giovanni Monea, Antoine Bosselut, Kianté Brantley et al.

COLM 2025paperarXiv:2410.05362

citations

MultiScale Contextual Bandits for Long Term Objectives

Richa Rastogi, Yuta Saito, Thorsten Joachims

NEURIPS 2025arXiv:2503.17674

Neural Combinatorial Clustered Bandits for Recommendation Systems

Baran Atalar, Carlee Joe-Wong

AAAI 2025paperarXiv:2410.14586

citations

Second Order Bounds for Contextual Bandits with Function Approximation

Aldo Pacchiano

ICLR 2025arXiv:2409.16197

citations

Sharp Analysis for KL-Regularized Contextual Bandits and RLHF

Heyang Zhao, Chenlu Ye, Quanquan Gu et al.

NEURIPS 2025arXiv:2411.04625

citations

Statistical Parity with Exponential Weights

Stephen Pasteris, Chris Hicks, Vasilios Mavroudis

NEURIPS 2025

Transfer Faster, Price Smarter: Minimax Dynamic Pricing under Cross-Market Preference Shift

Yi Zhang, Elynn Chen, Yujun Yan

NEURIPS 2025spotlightarXiv:2505.17203

citations

Triple-Optimistic Learning for Stochastic Contextual Bandits with General Constraints

Hengquan Guo, Lingkai Zu, Xin Liu

ICML 2025

True Impact of Cascade Length in Contextual Cascading Bandits

Hyun-jun Choi, Joongkyu Lee, Min-hwan Oh

NEURIPS 2025

Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits

Xuheng Li, Quanquan Gu

NEURIPS 2025arXiv:2511.02123

citations

$\mathtt{VITS}$ : Variational Inference Thompson Sampling for contextual bandits

Pierre Clavier, Tom Huix, Alain Oliviero Durmus

ICML 2024

A Contextual Combinatorial Bandit Approach to Negotiation

Yexin Li, Zhancun Mu, Siyuan Qi

ICML 2024arXiv:2407.00567

citations

Adaptively Learning to Select-Rank in Online Platforms

Jingyuan Wang, Perry Dong, Ying Jin et al.

ICML 2024arXiv:2406.05017

Borda Regret Minimization for Generalized Linear Dueling Bandits

Yue Wu, Tao Jin, Qiwei Di et al.

ICML 2024arXiv:2303.08816

citations

Efficient Contextual Bandits with Uninformed Feedback Graphs

Mengxiao Zhang, Yuheng Zhang, Haipeng Luo et al.

ICML 2024arXiv:2402.08127

citations

Fast and Sample Efficient Multi-Task Representation Learning in Stochastic Contextual Bandits

Jiabin Lin, Shana Moothedath, Namrata Vaswani

ICML 2024arXiv:2410.02068

citations

Federated Contextual Cascading Bandits with Asynchronous Communication and Heterogeneous Users

Hantao Yang, Xutong Liu, Zhiyong Wang et al.

AAAI 2024paperarXiv:2402.16312

citations

High-dimensional Linear Bandits with Knapsacks

Wanteng Ma, Dong Xia, Jiashuo Jiang

ICML 2024arXiv:2311.01327

Incentivized Learning in Principal-Agent Bandit Games

Antoine Scheid, Daniil Tiapkin, Etienne Boursier et al.

ICML 2024arXiv:2403.03811

citations

In-Context Reinforcement Learning for Variable Action Spaces

Viacheslav Sinii, Alexander Nikulin, Vladislav Kurenkov et al.

ICML 2024arXiv:2312.13327

citations

Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery

Yassir Jedra, William Réveillard, Stefan Stojanovic et al.

ICML 2024arXiv:2402.15739

citations

More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning

Kaiwen Wang, Owen Oertell, Alekh Agarwal et al.

ICML 2024arXiv:2402.07198

citations

Prospective Side Information for Latent MDPs

Jeongyeol Kwon, Yonathan Efroni, Shie Mannor et al.

ICML 2024spotlightarXiv:2310.07596

citations

Randomized Confidence Bounds for Stochastic Partial Monitoring

Maxime Heuillet, Ola Ahmad, Audrey Durand

ICML 2024arXiv:2402.05002

citations

The Non-linear $F$-Design and Applications to Interactive Learning

Alekh Agarwal, Jian Qian, Alexander Rakhlin et al.

ICML 2024