"bandit feedback" Papers
16 papers found
Conference
Comparing Uniform Price and Discriminatory Multi-Unit Auctions through Regret Minimization
Marius Potfer, Vianney Perchet
NEURIPS 2025, arXiv:2510.19591
Last-iterate Convergence in Regularized Graphon Mean Field Game
Jing Dong, Baoxiang Wang, Yaoliang Yu
AAAI 2025, arXiv:2410.08746
2 citations
No-Regret Learning Under Adversarial Resource Constraints: A Spending Plan Is All You Need!
Francesco Emanuele Stradi, Matteo Castiglioni, Alberto Marchesi et al.
NEURIPS 2025, arXiv:2506.13244
2 citations
No-Regret Online Autobidding Algorithms in First-price Auctions
Yilin Li, Yuan Deng, Wei Tang et al.
NEURIPS 2025, arXiv:2510.16869
1 citation
Online Nonsubmodular Optimization with Delayed Feedback in the Bandit Setting
Sifan Yang, Yuanyu Wan, Lijun Zhang
AAAI 2025, arXiv:2508.00523
1 citation
Revisiting Projection-Free Online Learning with Time-Varying Constraints
Yibo Wang, Yuanyu Wan, Lijun Zhang
AAAI 2025, arXiv:2501.16046
5 citations
Uniform Wrappers: Bridging Concave to Quadratizable Functions in Online Optimization
Mohammad Pedramfar, Christopher Quinn, Vaneet Aggarwal
NEURIPS 2025
Efficient Online Set-valued Classification with Bandit Feedback
Zhou Wang, Xingye Qiao
ICML 2024, arXiv:2405.04393
1 citation
Federated Combinatorial Multi-Agent Multi-Armed Bandits
Fares Fourati, Mohamed-Slim Alouini, Vaneet Aggarwal
ICML 2024, arXiv:2405.05950
8 citations
Handling Heterogeneous Curvatures in Bandit LQR Control
Yu-Hu Yan, Jing Wang, Peng Zhao
ICML 2024 (spotlight)
On Interpolating Experts and Multi-Armed Bandits
Houshuang Chen, Yuchen He, Chihao Zhang
ICML 2024, arXiv:2307.07264
5 citations
Performative Prediction with Bandit Feedback: Learning through Reparameterization
Yatong Chen, Wei Tang, Chien-Ju Ho et al.
ICML 2024, arXiv:2305.01094
12 citations
Projection-Free Online Convex Optimization with Time-Varying Constraints
Dan Garber, Ben Kretzu
ICML 2024, arXiv:2402.08799
5 citations
Provably Efficient Reinforcement Learning for Adversarial Restless Multi-Armed Bandits with Unknown Transitions and Bandit Feedback
Guojun Xiong, Jian Li
ICML 2024, arXiv:2405.00950
1 citation
Quantum Algorithm for Online Exp-concave Optimization
Jianhao He, Chengchang Liu, Xutong Liu et al.
ICML 2024, arXiv:2410.19688
3 citations
Rate-Optimal Policy Optimization for Linear Markov Decision Processes
Uri Sherman, Alon Cohen, Tomer Koren et al.
ICML 2024, arXiv:2308.14642
9 citations