"thompson sampling" Papers

21 papers found

Adaptive Variance Inflation in Thompson Sampling: Efficiency, Safety, Robustness, and Beyond

Feng Zhu, David Simchi-Levi

NEURIPS 2025

Contextual Thompson Sampling via Generation of Missing Data

Kelly W Zhang, Tianhui Cai, Hongseok Namkoong et al.

NEURIPS 2025, arXiv:2502.07064
2 citations

Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling

Jasmine Bayrooti, Carl Ek, Amanda Prorok

ICLR 2025, arXiv:2410.04988
3 citations

FedRTS: Federated Robust Pruning via Combinatorial Thompson Sampling

Hong Huang, Jinhai Yang, Yuan Chen et al.

NEURIPS 2025, arXiv:2501.19122
4 citations

Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown

Emile Anand, Sarah Liaw

NEURIPS 2025, arXiv:2507.15290
3 citations

Generator-Mediated Bandits: Thompson Sampling for GenAI-Powered Adaptive Interventions

Marc Brooks, Gabriel Durham, Kihyuk Hong et al.

NEURIPS 2025, arXiv:2505.16311
1 citation

Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits

Yuwei Luo, Mohsen Bayati

ICLR 2025, arXiv:2306.14872
2 citations

LaRes: Evolutionary Reinforcement Learning with LLM-based Adaptive Reward Search

Pengyi Li, Hongyao Tang, Jinbin Qiao et al.

NEURIPS 2025

No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes

Jasmine Bayrooti, Sattar Vakili, Amanda Prorok et al.

NEURIPS 2025 (oral), arXiv:2510.20725

Thompson Sampling in Function Spaces via Neural Operators

Rafael Oliveira, Xuesong Wang, Kian Ming Chai et al.

NEURIPS 2025, arXiv:2506.21894

Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits

Xuheng Li, Quanquan Gu

NEURIPS 2025, arXiv:2511.02123
1 citation

VITS: Variational Inference Thompson Sampling for contextual bandits

Pierre Clavier, Tom Huix, Alain Oliviero Durmus

ICML 2024

A Bayesian Approach to Online Planning

Nir Greshler, David Ben Eli, Carmel Rabinovitz et al.

ICML 2024, arXiv:2406.02103
1 citation

Adaptive Anytime Multi-Agent Path Finding Using Bandit-Based Large Neighborhood Search

Thomy Phan, Taoan Huang, Bistra Dilkina et al.

AAAI 2024, arXiv:2312.16767
11 citations

Efficient Exploration for LLMs

Vikranth Dwaracherla, Seyed Mohammad Asghari, Botao Hao et al.

ICML 2024, arXiv:2402.00396
37 citations

Feel-Good Thompson Sampling for Contextual Dueling Bandits

Xuheng Li, Heyang Zhao, Quanquan Gu

ICML 2024, arXiv:2404.06013
17 citations

Finite-Time Frequentist Regret Bounds of Multi-Agent Thompson Sampling on Sparse Hypergraphs

Tianyuan Jin, Hao-Lun Hsu, William Chang et al.

AAAI 2024, arXiv:2312.15549
3 citations

Posterior Sampling-Based Bayesian Optimization with Tighter Bayesian Regret Bounds

Shion Takeno, Yu Inatsu, Masayuki Karasuyama et al.

ICML 2024, arXiv:2311.03760
9 citations

ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages

Andrew Jesson, Christopher Lu, Gunshi Gupta et al.

ICML 2024, arXiv:2306.01460
10 citations

The Choice of Noninformative Priors for Thompson Sampling in Multiparameter Bandit Models

Jongyeong Lee, Chao-Kai Chiang, Masashi Sugiyama

AAAI 2024, arXiv:2302.14407

Two-sided Competing Matching Recommendation Markets With Quota and Complementary Preferences Constraints

Yuantong Li, Guang Cheng, Xiaowu Dai

ICML 2024, arXiv:2301.10230
8 citations