"text summarization" Papers
9 papers found
Hansel: Output Length Controlling Framework for Large Language Models
Seoha Song, Junhyun Lee, Hyeonmok Ko
AAAI 2025 paper arXiv:2412.14033
1 citation
Learn Your Reference Model for Real Good Alignment
Alexey Gorbatovski, Boris Shaposhnikov, Alexey Malakhov et al.
ICLR 2025 arXiv:2404.09656
50 citations
MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions
Yekun Chai, Haoran Sun, Huang Fang et al.
ICLR 2025 oral arXiv:2410.02743
9 citations
Neither Valid nor Reliable? Investigating the Use of LLMs as Judges
Khaoula Chehbouni, Mohammed Haddou, Jackie CK Cheung et al.
NeurIPS 2025 arXiv:2508.18076
11 citations
On Extending Direct Preference Optimization to Accommodate Ties
Jinghong Chen, Guangyu Yang, Weizhe Lin et al.
NeurIPS 2025 arXiv:2409.17431
7 citations
Variational Best-of-N Alignment
Afra Amini, Tim Vieira, Elliott Ash et al.
ICLR 2025 arXiv:2407.06057
38 citations
InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
Boxin Wang, Wei Ping, Lawrence McAfee et al.
ICML 2024 arXiv:2310.07713
70 citations
Nash Learning from Human Feedback
Rémi Munos, Michal Valko, Daniele Calandriello et al.
ICML 2024 spotlight arXiv:2312.00886
195 citations
Switchable Decision: Dynamic Neural Generation Networks
Shujian Zhang, Korawat Tanwisuth, Chengyue Gong et al.
ICML 2024 arXiv:2405.04513