Oral "reinforcement fine-tuning" Papers
6 papers found
CollabLLM: From Passive Responders to Active Collaborators
Shirley Wu, Michel Galley, Baolin Peng et al.
ICML 2025, oral, arXiv:2502.00640
43 citations
EgoThinker: Unveiling Egocentric Reasoning with Spatio-Temporal CoT
Baoqi Pei, Yifei Huang, Jilan Xu et al.
NeurIPS 2025, oral, arXiv:2510.23569
4 citations
TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs
Yunheng Li, Jing Cheng, Shaoyong Jia et al.
NeurIPS 2025, oral, arXiv:2509.18056
7 citations
Understanding Data Influence in Reinforcement Finetuning
Haoru Tan, Xiuzhe Wu, Sitong Wu et al.
NeurIPS 2025, oral
VideoRFT: Incentivizing Video Reasoning Capability in MLLMs via Reinforced Fine-Tuning
Qi Wang, Yanrui Yu, Ye Yuan et al.
NeurIPS 2025, oral, arXiv:2505.12434
34 citations
Walking the Tightrope: Autonomous Disentangling Beneficial and Detrimental Drifts in Non-Stationary Custom-Tuning
Xiaoyu Yang, Jie Lu, En Yu
NeurIPS 2025, oral
6 citations