Poster "regret minimization" Papers

43 papers found

Adapting to Stochastic and Adversarial Losses in Episodic MDPs with Aggregate Bandit Feedback

Shinji Ito, Kevin Jamieson, Haipeng Luo et al.

NEURIPS 2025 arXiv:2510.17103
2 citations

Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits

Zihan Zhang, Xiangyang Ji, Yuan Zhou

ICLR 2025 arXiv:2110.08057
11 citations

An Improved Algorithm for Adversarial Linear Contextual Bandits via Reduction

Tim van Erven, Jack Mayo, Julia Olkhovskaya et al.

NEURIPS 2025 arXiv:2508.11931

An Online Learning Theory of Trading-Volume Maximization

Tommaso Cesari, Roberto Colomboni

ICLR 2025
4 citations

Causal LLM Routing: End-to-End Regret Minimization from Observational Data

Asterios Tsiourvas, Wei Sun, Georgia Perakis

NEURIPS 2025 arXiv:2505.16037
7 citations

Comparing Uniform Price and Discriminatory Multi-Unit Auctions through Regret Minimization

Marius Potfer, Vianney Perchet

NEURIPS 2025 arXiv:2510.19591

Contextual Dynamic Pricing with Heterogeneous Buyers

Thodoris Lykouris, Sloan Nietert, Princewill Okoroafor et al.

NEURIPS 2025 arXiv:2512.09513
1 citation

Feature-Based Online Bilateral Trade

Solenne Gaucher, Martino Bernasconi, Matteo Castiglioni et al.

ICLR 2025 arXiv:2405.18183
5 citations

Finally Rank-Breaking Conquers MNL Bandits: Optimal and Efficient Algorithms for MNL Assortment

Aadirupa Saha, Pierre Gaillard

ICLR 2025
1 citation

Improved Regret and Contextual Linear Extension for Pandora's Box and Prophet Inequality

Junyan Liu, Ziyun Chen, Kun Wang et al.

NEURIPS 2025 arXiv:2505.18828
1 citation

Learning from Imperfect Human Feedback: A Tale from Corruption-Robust Dueling

Yuwei Cheng, Fan Yao, Xuefeng Liu et al.

ICLR 2025 arXiv:2405.11204
2 citations

Learning to price with resource constraints: from full information to machine-learned prices

Ruicheng Ao, Jiashuo Jiang, David Simchi-Levi

NEURIPS 2025 arXiv:2501.14155
3 citations

Linear Bandits with Memory

Pierre Laforgue, Giulia Clerici, Nicolò Cesa-Bianchi

ICLR 2025 arXiv:2302.08345
3 citations

Markov Persuasion Processes: Learning to Persuade From Scratch

Francesco Bacchiocchi, Francesco Emanuele Stradi, Matteo Castiglioni et al.

NEURIPS 2025 arXiv:2402.03077
9 citations

Near-Optimal Regret-Queue Length Tradeoff in Online Learning for Two-Sided Markets

Zixian Yang, Sushil Varma, Lei Ying

NEURIPS 2025 arXiv:2510.14097

No-Regret Online Autobidding Algorithms in First-price Auctions

Yilin Li, Yuan Deng, Wei Tang et al.

NEURIPS 2025 arXiv:2510.16869
1 citation

Online Learning in the Repeated Mediated Newsvendor Problem

Nataša Bolić, Tom Cesari, Roberto Colomboni et al.

NEURIPS 2025

Optimal Regret of Bandits under Differential Privacy

Achraf Azize, Yulian Wu, Junya Honda et al.

NEURIPS 2025

Regretful Decisions under Label Noise

Sujay Nagaraj, Yang Liu, Flavio Calmon et al.

ICLR 2025 arXiv:2504.09330
3 citations

Reinforcement Learning for Individual Optimal Policy from Heterogeneous Data

Rui Miao, Babak Shahbaba, Annie Qu

NEURIPS 2025 arXiv:2505.09496
1 citation

Robust Contextual Pricing

Anupam Gupta, Guru Guruganesh, Renato Leme et al.

NEURIPS 2025

Stable Matching with Ties: Approximation Ratios and Learning

Shiyun Lin, Simon Mauras, Nadav Merlis et al.

NEURIPS 2025 arXiv:2411.03270
2 citations

Tightening Regret Lower and Upper Bounds in Restless Rising Bandits

Cristiano Migali, Marco Mussi, Gianmarco Genalti et al.

NEURIPS 2025

Best of Both Worlds Guarantees for Smoothed Online Quadratic Optimization

Neelkamal Bhuyan, Debankur Mukherjee, Adam Wierman

ICML 2024 arXiv:2311.00181
4 citations

Decoupling Learning and Decision-Making: Breaking the $\mathcal{O}(\sqrt{T})$ Barrier in Online Resource Allocation with First-Order Methods

Wenzhi Gao, Chunlin Sun, Chenyu Xue et al.

ICML 2024 arXiv:2402.07108
3 citations

Eluder-based Regret for Stochastic Contextual MDPs

Orin Levy, Asaf Cassel, Alon Cohen et al.

ICML 2024 arXiv:2211.14932
8 citations

Equilibrium of Data Markets with Externality

Safwan Hossain, Yiling Chen

ICML 2024 arXiv:2302.08012
6 citations

Graph-Triggered Rising Bandits

Gianmarco Genalti, Marco Mussi, Nicola Gatti et al.

ICML 2024

Improved Differentially Private and Lazy Online Convex Optimization: Lower Regret without Smoothness Requirements

Naman Agarwal, Satyen Kale, Karan Singh et al.

ICML 2024

Incentivized Learning in Principal-Agent Bandit Games

Antoine Scheid, Daniil Tiapkin, Etienne Boursier et al.

ICML 2024 arXiv:2403.03811
17 citations

Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery

Yassir Jedra, William Réveillard, Stefan Stojanovic et al.

ICML 2024 arXiv:2402.15739
2 citations

Monotone Individual Fairness

Yahav Bechavod

ICML 2024 arXiv:2403.06812
3 citations

Nash Incentive-compatible Online Mechanism Learning via Weakly Differentially Private Online Learning

Joon Suk Huh, Kirthevasan Kandasamy

ICML 2024 arXiv:2407.04898
2 citations

Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback

Asaf Cassel, Haipeng Luo, Aviv Rosenberg et al.

ICML 2024 arXiv:2405.07637
5 citations

Near-Optimal Reinforcement Learning with Self-Play under Adaptivity Constraints

Dan Qiao, Yu-Xiang Wang

ICML 2024 arXiv:2402.01111
4 citations

No-Regret Reinforcement Learning in Smooth MDPs

Davide Maran, Alberto Maria Metelli, Matteo Papini et al.

ICML 2024 arXiv:2402.03792
4 citations

Online Learning in CMDPs: Handling Stochastic and Adversarial Constraints

Francesco Emanuele Stradi, Jacopo Germano, Gianmarco Genalti et al.

ICML 2024

Online Learning with Bounded Recall

Jon Schneider, Kiran Vodrahalli

ICML 2024 arXiv:2205.14519
1 citation

Online Matrix Completion: A Collaborative Approach with Hott Items

Dheeraj Baby, Soumyabrata Pal

ICML 2024 arXiv:2408.05843

Projection-Free Online Convex Optimization with Time-Varying Constraints

Dan Garber, Ben Kretzu

ICML 2024 arXiv:2402.08799
5 citations

Quantum Algorithm for Online Exp-concave Optimization

Jianhao He, Chengchang Liu, Xutong Liu et al.

ICML 2024 arXiv:2410.19688
3 citations

Rate-Optimal Policy Optimization for Linear Markov Decision Processes

Uri Sherman, Alon Cohen, Tomer Koren et al.

ICML 2024 arXiv:2308.14642
9 citations

Test-Time Regret Minimization in Meta Reinforcement Learning

Mirco Mutti, Aviv Tamar

ICML 2024 arXiv:2406.02282
4 citations