Poster "exploration-exploitation trade-off" Papers
4 papers found
Fleet of Agents: Coordinated Problem Solving with Large Language Models
Lars Klein, Nearchos Potamitis, Roland Aydin et al.
ICML 2025, arXiv:2405.06691
3 citations
Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models
Yinlam Chow, Guy Tennenholtz, Izzeddin Gur et al.
ICLR 2025, arXiv:2412.15287
49 citations
REINFORCE Converges to Optimal Policies with Any Learning Rate
Samuel Robertson, Thang Chu, Bo Dai et al.
NeurIPS 2025
Scalable and Cost-Efficient de Novo Template-Based Molecular Generation
Piotr Gaiński, Oussama Boussif, Andrei Rekesh et al.
NeurIPS 2025, arXiv:2506.19865
4 citations