"chain-of-thought training" Papers
2 papers found
Conference
Chiron-o1: Igniting Multimodal Large Language Models towards Generalizable Medical Reasoning via Mentor-Intern Collaborative Search
Haoran Sun, Yankai Jiang, Wenjie Lou et al.
NeurIPS 2025, arXiv:2506.16962
6 citations
TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs
Yunheng Li, Jing Cheng, Shaoyong Jia et al.
NeurIPS 2025 (oral), arXiv:2509.18056
7 citations