"post-training optimization" Papers
6 papers found
Advancing Expert Specialization for Better MoE
Hongcan Guo, Haolang Lu, Guoshun Nan et al.
NEURIPS 2025 (oral), arXiv:2505.22323
10 citations
Shortcutting Pre-trained Flow Matching Diffusion Models is Almost Free Lunch
Xu Cai, Yang Wu, Qianli Chen et al.
NEURIPS 2025, arXiv:2510.17858
Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective
Zeyu Gan, Yong Liu
ICLR 2025, arXiv:2410.01720
15 citations
Wasserstein Distances, Neuronal Entanglement, and Sparsity
Shashata Sawmya, Linghao Kong, Ilia Markov et al.
ICLR 2025, arXiv:2405.15756
5 citations
Flextron: Many-in-One Flexible Large Language Model
Ruisi Cai, Saurav Muralidharan, Greg Heinrich et al.
ICML 2024, arXiv:2406.10260
34 citations
VNN: Verification-Friendly Neural Networks with Hard Robustness Guarantees
Anahita Baninajjar, Ahmed Rezine, Amir Aminifar
ICML 2024, arXiv:2312.09748
1 citation