"majority voting" Papers
7 papers found
Cost-aware LLM-based Online Dataset Annotation
Eray Can Elumar, Cem Tekin, Osman Yagan
NEURIPS 2025 (spotlight), arXiv:2505.15101
2 citations
Debate or Vote: Which Yields Better Decisions in Multi-Agent Large Language Models?
Hyeong Kyu Choi, Jerry Zhu, Sharon Li
NEURIPS 2025 (spotlight), arXiv:2508.17536
17 citations
First SFT, Second RL, Third UPT: Continual Improving Multi-Modal LLM Reasoning via Unsupervised Post-Training
Lai Wei, Yuting Li, Chen Wang et al.
NEURIPS 2025, arXiv:2505.22453
10 citations
Median Selection with Noisy and Structural Information
Chenglin Fan, Mingyu Kang
NEURIPS 2025
RMB: Comprehensively benchmarking reward models in LLM alignment
Enyu Zhou, Guodong Zheng, Binghai Wang et al.
ICLR 2025, arXiv:2410.09893
47 citations
TTRL: Test-Time Reinforcement Learning
Yuxin Zuo, Kaiyan Zhang, Li Sheng et al.
NEURIPS 2025, arXiv:2504.16084
129 citations
Value-Guided Search for Efficient Chain-of-Thought Reasoning
Kaiwen Wang, Jin Zhou, Jonathan Chang et al.
NEURIPS 2025, arXiv:2505.17373
7 citations