"training stability" Papers

14 papers found

2 OLMo 2 Furious (COLM’s Version)

Evan Pete Walsh, Luca Soldaini, Dirk Groeneveld et al.

COLM 2025 · paper

A Gradient Guidance Perspective on Stepwise Preference Optimization for Diffusion Models

Joshua Tian Jin Tee, Hee Suk Yoon, Abu Hanif Muhammad Syarubany et al.

NEURIPS 2025 · oral

EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models

Jialiang Cheng, Ning Gao, Yun Yue et al.

ICLR 2025 · arXiv:2412.07210
1 citation
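
For context, the Local SGD family that EDiT builds on follows one pattern: each worker takes H local optimizer steps on its own data shard, and workers only communicate once per round to average parameters. A minimal single-process simulation of that baseline (not EDiT itself; the toy model, `H`, and `num_workers` are illustrative assumptions):

```python
# Minimal Local SGD simulation (the baseline EDiT builds on, not EDiT itself).
import copy
import torch

torch.manual_seed(0)
num_workers, H, rounds = 4, 8, 5
X = torch.randn(256, 16)
y = X @ torch.randn(16, 1) + 0.1 * torch.randn(256, 1)

global_model = torch.nn.Linear(16, 1)
for _ in range(rounds):
    local_states = []
    for w in range(num_workers):
        model = copy.deepcopy(global_model)      # start from the global params
        opt = torch.optim.SGD(model.parameters(), lr=0.05)
        shard = slice(w * 64, (w + 1) * 64)      # each worker sees its own shard
        for _ in range(H):                       # H local steps, no communication
            loss = torch.nn.functional.mse_loss(model(X[shard]), y[shard])
            opt.zero_grad(); loss.backward(); opt.step()
        local_states.append(model.state_dict())
    # Communicate once per round: average parameters across workers.
    avg = {k: torch.stack([s[k] for s in local_states]).mean(0)
           for k in local_states[0]}
    global_model.load_state_dict(avg)
```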

Improving Neural Optimal Transport via Displacement Interpolation

Jaemoo Choi, Yongxin Chen, Jaewoong Choi

ICLR 2025 · arXiv:2410.03783
3 citations
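
For reference, the displacement interpolation in this title is McCann's standard construction: given an optimal coupling pi between mu_0 and mu_1, interpolate along straight lines between coupled points (the textbook definition, not this paper's specific use of it):

```latex
% McCann's displacement interpolation: mu_t is the pushforward of the
% optimal coupling pi under the straight-line map between paired points.
\mu_t = \big((1-t)\,x + t\,y\big)_{\#}\,\pi, \qquad t \in [0,1]
```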

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Jingcheng Hu, Yinmin Zhang, Qi Han et al.

NEURIPS 2025 · arXiv:2503.24290
347 citations

PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs

Oskar van der Wal, Pietro Lesci, Max Müller-Eberstein et al.

ICLR 2025 · arXiv:2503.09543
16 citations

Quantum Doubly Stochastic Transformers

Jannis Born, Filip Skogh, Kahn Rhrissorrakrai et al.

NEURIPS 2025 · spotlight · arXiv:2504.16275
2 citations
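
Doubly stochastic attention is classically obtained with Sinkhorn normalization, which alternately rescales rows and columns of the attention matrix until both sum to one. A minimal sketch of that classical baseline (the paper's quantum construction is a different mechanism; the iteration count here is an arbitrary assumption):

```python
# Classical Sinkhorn normalization in log space: make an attention matrix
# doubly stochastic by alternately normalizing rows and columns.
import torch

def sinkhorn_attention(scores: torch.Tensor, n_iters: int = 10) -> torch.Tensor:
    """scores: (n, n) raw attention logits -> approximately doubly stochastic matrix."""
    log_a = scores
    for _ in range(n_iters):
        log_a = log_a - torch.logsumexp(log_a, dim=-1, keepdim=True)  # rows sum to 1
        log_a = log_a - torch.logsumexp(log_a, dim=-2, keepdim=True)  # cols sum to 1
    return log_a.exp()

attn = sinkhorn_attention(torch.randn(6, 6))
print(attn.sum(dim=0), attn.sum(dim=1))  # both close to all-ones
```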

Revisiting Residual Connections: Orthogonal Updates for Stable and Efficient Deep Networks

Giyeong Oh, Woohyun Cho, Siyeol Kim et al.

NEURIPS 2025 · arXiv:2505.11881
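
One reading of this title: replace the standard residual update y = x + f(x) with one that keeps only the component of f(x) orthogonal to the incoming stream x. A sketch of that idea (an interpretation of the title, not necessarily the paper's exact formulation):

```python
# Residual connection that keeps only the component of the branch output
# orthogonal to the incoming stream x (one reading of "orthogonal updates").
import torch

def orthogonal_residual(x: torch.Tensor, fx: torch.Tensor, eps: float = 1e-6):
    # Project fx onto x per token (features on the last dim), then subtract.
    coeff = (fx * x).sum(-1, keepdim=True) / (x * x).sum(-1, keepdim=True).clamp_min(eps)
    return x + (fx - coeff * x)

x, fx = torch.randn(4, 64), torch.randn(4, 64)
y = orthogonal_residual(x, fx)
print(((y - x) * x).sum(-1))  # added update is orthogonal to x, so ~0
```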

Sinusoidal Initialization, Time for a New Start

Alberto Fernandez-Hernandez, Jose Mestre, Manuel F. Dolz et al.

NEURIPS 2025 · arXiv:2505.12909
1 citation
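
Purely as an illustration of the general idea, a deterministic, sinusoid-based weight initialization could look like the following (an assumption about the concept; the paper's actual frequencies and scaling may differ):

```python
# Illustrative deterministic sinusoidal weight init: one sine frequency per
# output row, with a variance-preserving rescale. Not the paper's exact scheme.
import math
import torch

def sinusoidal_init_(weight: torch.Tensor) -> torch.Tensor:
    out_f, in_f = weight.shape
    pos = torch.arange(in_f).float()
    for i in range(out_f):
        freq = 2 * math.pi * (i + 1) / in_f   # frequency grows with row index
        weight[i] = torch.sin(freq * pos)
    weight *= math.sqrt(2.0 / in_f)           # keep activation variance sane
    return weight

w = torch.empty(128, 64)
with torch.no_grad():
    sinusoidal_init_(w)
```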

Spike2Former: Efficient Spiking Transformer for High-performance Image Segmentation

Zhenxin Lei, Man Yao, Jiakui Hu et al.

AAAI 2025 · paper · arXiv:2412.14587
14 citations

A High-Quality Robust Diffusion Framework for Corrupted Dataset

Quan Dao, Binh Ta, Tung Pham et al.

ECCV 2024 · arXiv:2311.17101
7 citations

Collage: Light-Weight Low-Precision Strategy for LLM Training

Tao Yu, Gaurav Gupta, Karthick Gopalswamy et al.

ICML 2024 · arXiv:2405.03637
8 citations
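
Collage's theme, keeping low-precision LLM training accurate by tracking what rounding discards, can be illustrated with a classic Kahan-style compensated update (a generic technique; Collage's multi-component float strategy is more involved):

```python
# Kahan-style compensated update: carry the rounding error of each bf16
# weight write forward, so tiny gradients are not silently lost.
# Generic illustration, not Collage's exact algorithm.
import torch

def compensated_step(w_bf16, comp, grad, lr):
    update = (-lr * grad).float() + comp              # re-add previously lost bits
    new_w = (w_bf16.float() + update).bfloat16()      # rounded low-precision write
    comp = update - (new_w.float() - w_bf16.float())  # what rounding discarded
    return new_w, comp

w = torch.zeros(4, dtype=torch.bfloat16)
c = torch.zeros(4)
for _ in range(1000):
    w, c = compensated_step(w, c, grad=torch.full((4,), 1e-4), lr=1.0)
print(w)  # ~ -0.1 per element
```

Without the compensation buffer, bf16's short mantissa starts rounding away the 1e-4 increments as the weight grows; the buffer carries those lost bits into later steps.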

Learning from Memory: Non-Parametric Memory Augmented Self-Supervised Learning of Visual Features

Thalles Silva, Helio Pedrini, Adín Ramírez Rivera

ICML 2024 · arXiv:2407.17486
7 citations

Self-Correcting Self-Consuming Loops for Generative Model Training

Nate Gillman, Michael Freeman, Daksh Aggarwal et al.

ICML 2024 · arXiv:2402.07087
24 citations
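
The self-consuming setup these authors study can be written as a short schematic loop: sample from the current model, pass the samples through a self-correction function (e.g., a physics simulator in the paper's motion-synthesis experiments), and mix them back into the training data. Everything below is an illustrative skeleton with assumed names:

```python
# Schematic self-consuming training loop with a self-correction step.
# All names (sample, correct, train_step, real_data) are illustrative.
def self_consuming_loop(model, real_data, correct, train_step,
                        generations=5, synth_fraction=0.5):
    data = list(real_data)
    for _ in range(generations):
        model = train_step(model, data)                # fit the current mixture
        n_synth = int(synth_fraction * len(real_data))
        synth = [correct(model.sample()) for _ in range(n_synth)]
        data = list(real_data) + synth                 # corrected samples re-enter
    return model
```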