"small language models" Papers

14 papers found

Advantage-Guided Distillation for Preference Alignment in Small Language Models

Shiping Gao, Fanqi Wan, Jiajian Guo et al.

ICLR 2025 · arXiv:2502.17927
4 citations

Enhancing SQL Query Generation with Neurosymbolic Reasoning

Henrijs Princis, Cristina David, Alan Mycroft

AAAI 2025 · arXiv:2408.13888
4 citations

Hymba: A Hybrid-head Architecture for Small Language Models

Xin Dong, Yonggan Fu, Shizhe Diao et al.

ICLR 2025 · arXiv:2411.13676
58 citations

It Helps to Take a Second Opinion: Teaching Smaller LLMs To Deliberate Mutually via Selective Rationale Optimisation

Sohan Patnaik, Milan Aggarwal, Sumit Bhatia et al.

ICLR 2025 · arXiv:2503.02463

Mellow: a small audio language model for reasoning

Soham Deshmukh, Satvik Dixit, Rita Singh et al.

NEURIPS 2025 · arXiv:2503.08540
19 citations

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Zhenting Qi, Mingyuan Ma, Jiahang Xu et al.

ICLR 2025 · arXiv:2408.06195
129 citations

Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models

Yonggan Fu, Xin Dong, Shizhe Diao et al.

NEURIPS 2025 · arXiv:2511.18890
2 citations

Readability ≠ Learnability: Rethinking the Role of Simplicity in Training Small Language Models

Ivan Lee, Taylor Berg-Kirkpatrick

COLM 2025

SmolLM2: When Smol Goes Big — Data-Centric Training of a Fully Open Small Language Model

Loubna Ben Allal, Anton Lozhkov, Elie Bakouch et al.

COLM 2025

SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction

Ling Yang, Zhaochen Yu, Tianjun Zhang et al.

ICLR 2025 · arXiv:2410.09008
15 citations

Unlocking SLM Potential for Data Analysis Code Generation via Non-Parametric Knowledge Distillation

Jinyang Li, Jack Williams, Nick McKenna et al.

NEURIPS 2025

Unveiling the Secret Recipe: A Guide For Supervised Fine-Tuning Small LLMs

Aldo Pareja, Nikhil Shivakumar Nayak, Hao Wang et al.

ICLR 2025 · arXiv:2412.13337
34 citations

Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training

Cheng Tan, Jingxuan Wei, Zhangyang Gao et al.

ECCV 2024 · arXiv:2311.14109
29 citations

Embodied CoT Distillation From LLM To Off-the-shelf Agents

Wonje Choi, Woo Kyung Kim, Minjong Yoo et al.

ICML 2024 · arXiv:2412.11499
11 citations