"bandit feedback" Papers

16 papers found

Comparing Uniform Price and Discriminatory Multi-Unit Auctions through Regret Minimization

Marius Potfer, Vianney Perchet

NEURIPS 2025, arXiv:2510.19591

Last-iterate Convergence in Regularized Graphon Mean Field Game

Jing Dong, Baoxiang Wang, Yaoliang Yu

AAAI 2025, arXiv:2410.08746
2 citations

No-Regret Learning Under Adversarial Resource Constraints: A Spending Plan Is All You Need!

Francesco Emanuele Stradi, Matteo Castiglioni, Alberto Marchesi et al.

NEURIPS 2025, arXiv:2506.13244
2 citations

No-Regret Online Autobidding Algorithms in First-price Auctions

Yilin Li, Yuan Deng, Wei Tang et al.

NEURIPS 2025, arXiv:2510.16869
1 citation

Online Nonsubmodular Optimization with Delayed Feedback in the Bandit Setting

Sifan Yang, Yuanyu Wan, Lijun Zhang

AAAI 2025, arXiv:2508.00523
1 citation

Revisiting Projection-Free Online Learning with Time-Varying Constraints

Yibo Wang, Yuanyu Wan, Lijun Zhang

AAAI 2025, arXiv:2501.16046
5 citations

Uniform Wrappers: Bridging Concave to Quadratizable Functions in Online Optimization

Mohammad Pedramfar, Christopher Quinn, Vaneet Aggarwal

NEURIPS 2025

Efficient Online Set-valued Classification with Bandit Feedback

Zhou Wang, Xingye Qiao

ICML 2024, arXiv:2405.04393
1 citation

Federated Combinatorial Multi-Agent Multi-Armed Bandits

Fares Fourati, Mohamed-Slim Alouini, Vaneet Aggarwal

ICML 2024, arXiv:2405.05950
8 citations

Handling Heterogeneous Curvatures in Bandit LQR Control

Yu-Hu Yan, Jing Wang, Peng Zhao

ICML 2024 (spotlight)

On Interpolating Experts and Multi-Armed Bandits

Houshuang Chen, Yuchen He, Chihao Zhang

ICML 2024, arXiv:2307.07264
5 citations

Performative Prediction with Bandit Feedback: Learning through Reparameterization

Yatong Chen, Wei Tang, Chien-Ju Ho et al.

ICML 2024, arXiv:2305.01094
12 citations

Projection-Free Online Convex Optimization with Time-Varying Constraints

Dan Garber, Ben Kretzu

ICML 2024, arXiv:2402.08799
5 citations

Provably Efficient Reinforcement Learning for Adversarial Restless Multi-Armed Bandits with Unknown Transitions and Bandit Feedback

Guojun Xiong, Jian Li

ICML 2024, arXiv:2405.00950
1 citation

Quantum Algorithm for Online Exp-concave Optimization

Jianhao He, Chengchang Liu, Xutong Liu et al.

ICML 2024, arXiv:2410.19688
3 citations

Rate-Optimal Policy Optimization for Linear Markov Decision Processes

Uri Sherman, Alon Cohen, Tomer Koren et al.

ICML 2024, arXiv:2308.14642
9 citations