"variance reduction techniques" Papers
5 papers found
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng, Haochen Zhang, Lingzhou Xue
ICLR 2025, arXiv:2410.07574
9 citations
Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient
Hao Di, Haishan Ye, Yueling Zhang et al.
ICML 2024 (spotlight), arXiv:2405.17761
2 citations
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Ziniu Li, Tian Xu, Yushun Zhang et al.
ICML 2024, arXiv:2310.10505
147 citations
UNEX-RL: Reinforcing Long-Term Rewards in Multi-Stage Recommender Systems with UNidirectional EXecution
Gengrui Zhang, Xiaoshuang Chen, Yao WANG et al.
AAAI 2024, arXiv:2401.06470
11 citations
Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models
Tanmay Gautam, Youngsuk Park, Hao Zhou et al.
ICML 2024, arXiv:2404.08080
39 citations