"memory-efficient training" Papers

17 papers found

Beyond Random: Automatic Inner-loop Optimization in Dataset Distillation

Muquan Li, Hang Gou, Dongyang Zhang et al.

NEURIPS 2025 · arXiv:2510.04838
1 citation

COAP: Memory-Efficient Training with Correlation-Aware Gradient Projection

Jinqi Xiao, Shen Sang, Tiancheng Zhi et al.

CVPR 2025 · arXiv:2412.00071
6 citations

COAT: Compressing Optimizer states and Activations for Memory-Efficient FP8 Training

Haocheng Xi, Han Cai, Ligeng Zhu et al.

ICLR 2025 · arXiv:2410.19313
19 citations

Efficient Training of Neural Fractional-Order Differential Equation via Adjoint Backpropagation

Qiyu Kang, Xuhao Li, Kai Zhao et al.

AAAI 2025 · arXiv:2503.16666
3 citations

Gradient Multi-Normalization for Efficient LLM Training

Meyer Scetbon, Chao Ma, Wenbo Gong et al.

NEURIPS 2025
3 citations

Irrational Complex Rotations Empower Low-bit Optimizers

Zhen Tian, Xin Zhao, Ji-Rong Wen

NEURIPS 2025 · arXiv:2501.12896

LoRASuite: Efficient LoRA Adaptation Across Large Language Model Upgrades

Yanan Li, Fanxu Meng, Muhan Zhang et al.

NEURIPS 2025 · arXiv:2505.13515
2 citations

Mini-batch Coresets for Memory-efficient Language Model Training on Data Mixtures

Dang Nguyen, Wenhan Yang, Rathul Anand et al.

ICLR 2025 · arXiv:2407.19580
7 citations

Private Training Large-scale Models with Efficient DP-SGD

Liangyu Wang, Junxiao Wang, Jie Ren et al.

NEURIPS 2025

Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer

Yanjun Zhao, Sizhe Dang, Haishan Ye et al.

ICLR 2025
30 citations

SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training

Yehonathan Refael, Guy Smorodinsky, Tom Tirer et al.

NEURIPS 2025 · arXiv:2505.24749
7 citations

DPZero: Private Fine-Tuning of Language Models without Backpropagation

Liang Zhang, Bingcong Li, Kiran Thekumparampil et al.

ICML 2024 · arXiv:2310.09639
22 citations

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Jiawei Zhao, Zhenyu Zhang, Beidi Chen et al.

ICML 2024 · arXiv:2403.03507
371 citations

Structured Inverse-Free Natural Gradient Descent: Memory-Efficient & Numerically-Stable KFAC

Wu Lin, Felix Dangel, Runa Eschenhagen et al.

ICML 2024

Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching

Ruonan Yu, Songhua Liu, Jingwen Ye et al.

ECCV 2024 · arXiv:2410.07579
13 citations

TinyTrain: Resource-Aware Task-Adaptive Sparse Training of DNNs at the Data-Scarce Edge

Young Kwon, Rui Li, Stylianos Venieris et al.

ICML 2024 · arXiv:2307.09988
22 citations

ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-Order Optimization

Shuoran Jiang, Qingcai Chen, Yang Xiang et al.

AAAI 2024 · arXiv:2312.15184
21 citations