"exploration-exploitation tradeoff" Papers
12 papers found
Conference
Bayesian Optimization for Unknown Cost-Varying Variable Subsets with No-Regret Costs
Vu Viet Hoang, Quoc Anh Hoang Nguyen, Hung The Tran
AAAI 2025, paper, arXiv:2412.15863
Breaking the $\log(1/\Delta_2)$ Barrier: Better Batched Best Arm Identification with Adaptive Grids
Tianyuan Jin, Qin Zhang, Dongruo Zhou
ICLR 2025
Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown
Emile Anand, Sarah Liaw
NEURIPS 2025, arXiv:2507.15290
3 citations
Geometry Meets Incentives: Sample-Efficient Incentivized Exploration with Linear Contexts
Ben Schiffer, Mark Sellke
NEURIPS 2025, spotlight, arXiv:2506.01685
LASeR: Towards Diversified and Generalizable Robot Design with Large Language Models
JUNRU SONG, Yang Yang, Huan Xiao et al.
ICLR 2025
7 citations
Learning to price with resource constraints: from full information to machine-learned prices
Ruicheng Ao, Jiashuo Jiang, David Simchi-Levi
NEURIPS 2025, arXiv:2501.14155
3 citations
Offline-to-Online Hyperparameter Transfer for Stochastic Bandits
Dravyansh Sharma, Arun Suggala
AAAI 2025, paper, arXiv:2501.02926
8 citations
Online Feedback Efficient Active Target Discovery in Partially Observable Environments
Anindya Sarkar, Binglin Ji, Yevgeniy Vorobeychik
NEURIPS 2025, arXiv:2505.06535
1 citation
PlanU: Large Language Model Reasoning through Planning under Uncertainty
Ziwei Deng, Mian Deng, Chenjing Liang et al.
NEURIPS 2025, arXiv:2510.18442
Entropy-Reinforced Planning with Large Language Models for Drug Discovery
Xuefeng Liu, Chih-chan Tien, Peng Ding et al.
ICML 2024, arXiv:2406.07025
7 citations
Optimal Batched Linear Bandits
Xuanfei Ren, Tianyuan Jin, Pan Xu
ICML 2024, arXiv:2406.04137
5 citations
Stochastic Bandits with ReLU Neural Networks
Kan Xu, Hamsa Bastani, Surbhi Goel et al.
ICML 2024, arXiv:2405.07331
1 citation