"neural network training" Papers

15 papers found

Accelerating neural network training: An analysis of the AlgoPerf competition

Priya Kasimbeg, Frank Schneider, Runa Eschenhagen et al.

ICLR 2025 · arXiv:2502.15015 · 19 citations

ADAM Optimization with Adaptive Batch Selection

Gyu Yeol Kim, Min-hwan Oh

ICLR 2025 · arXiv:2512.06795 · 2 citations

Block Coordinate Descent for Neural Networks Provably Finds Global Minima

Shunta Akiyama

NEURIPS 2025 · arXiv:2510.22667 · 2 citations

Efficient Representativeness-Aware Coreset Selection

Zihao Cheng, Binrui Wu, Zhiwei Li et al.

NEURIPS 2025

KOALA++: Efficient Kalman-Based Optimization with Gradient-Covariance Products

Zixuan Xia, Aram Davtyan, Paolo Favaro

NEURIPS 2025 · arXiv:2506.04432

Learn2Mix: Training Neural Networks Using Adaptive Data Integration

Shyam Venkatasubramanian, Vahid Tarokh

NEURIPS 2025 · arXiv:2412.16482 · 2 citations

Learning High-Degree Parities: The Crucial Role of the Initialization

Emmanuel Abbe, Elisabetta Cornacchia, Jan Hązła et al.

ICLR 2025 · arXiv:2412.04910 · 5 citations

Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble

Atsushi Nitanda, Anzelle Lee, Damian Kai et al.

ICML 2025 · arXiv:2502.05784 · 2 citations

Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner

Runa Eschenhagen, Aaron Defazio, Tsung-Hsien Lee et al.

NEURIPS 2025 (spotlight) · arXiv:2506.03595 · 10 citations

RTop-K: Ultra-Fast Row-Wise Top-K Selection for Neural Network Acceleration on GPUs

Xi Xie, Yuebo Luo, Hongwu Peng et al.

ICLR 2025 · arXiv:2409.00822 · 3 citations

Sinusoidal Initialization, Time for a New Start

Alberto Fernandez-Hernandez, Jose Mestre, Manuel F. Dolz et al.

NEURIPS 2025 · arXiv:2505.12909 · 1 citation

Efficient Algorithms for Sum-Of-Minimum Optimization

Lisang Ding, Ziang Chen, Xinshang Wang et al.

ICML 2024 · arXiv:2402.07070 · 7 citations

Online Learning and Information Exponents: The Importance of Batch Size & Time/Complexity Tradeoffs

Luca Arnaboldi, Yatin Dandi, Florent Krzakala et al.

ICML 2024

Random Scaling and Momentum for Non-smooth Non-convex Optimization

Qinzi Zhang, Ashok Cutkosky

ICML 2024 · arXiv:2405.09742 · 8 citations

Spectral Preconditioning for Gradient Methods on Graded Non-convex Functions

Nikita Doikov, Sebastian Stich, Martin Jaggi

ICML 2024 · arXiv:2402.04843 · 8 citations