"exploration-exploitation tradeoff" Papers

12 papers found

Bayesian Optimization for Unknown Cost-Varying Variable Subsets with No-Regret Costs

Vu Viet Hoang, Quoc Anh Hoang Nguyen, Hung The Tran

AAAI 2025 | paper | arXiv:2412.15863

Breaking the $\log(1/\Delta_2)$ Barrier: Better Batched Best Arm Identification with Adaptive Grids

Tianyuan Jin, Qin Zhang, Dongruo Zhou

ICLR 2025

Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown

Emile Anand, Sarah Liaw

NEURIPS 2025 | arXiv:2507.15290
3 citations

Geometry Meets Incentives: Sample-Efficient Incentivized Exploration with Linear Contexts

Ben Schiffer, Mark Sellke

NEURIPS 2025 | spotlight | arXiv:2506.01685

LASeR: Towards Diversified and Generalizable Robot Design with Large Language Models

Junru Song, Yang Yang, Huan Xiao et al.

ICLR 2025
7 citations

Learning to price with resource constraints: from full information to machine-learned prices

Ruicheng Ao, Jiashuo Jiang, David Simchi-Levi

NEURIPS 2025 | arXiv:2501.14155
3 citations

Offline-to-Online Hyperparameter Transfer for Stochastic Bandits

Dravyansh Sharma, Arun Suggala

AAAI 2025 | paper | arXiv:2501.02926
8 citations

Online Feedback Efficient Active Target Discovery in Partially Observable Environments

Anindya Sarkar, Binglin Ji, Yevgeniy Vorobeychik

NEURIPS 2025 | arXiv:2505.06535
1 citation

PlanU: Large Language Model Reasoning through Planning under Uncertainty

Ziwei Deng, Mian Deng, Chenjing Liang et al.

NEURIPS 2025 | arXiv:2510.18442

Entropy-Reinforced Planning with Large Language Models for Drug Discovery

Xuefeng Liu, Chih-chan Tien, Peng Ding et al.

ICML 2024 | arXiv:2406.07025
7 citations

Optimal Batched Linear Bandits

Xuanfei Ren, Tianyuan Jin, Pan Xu

ICML 2024 | arXiv:2406.04137
5 citations

Stochastic Bandits with ReLU Neural Networks

Kan Xu, Hamsa Bastani, Surbhi Goel et al.

ICML 2024 | arXiv:2405.07331
1 citation