"training efficiency" Papers

30 papers found

$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources

Apoorv Khandelwal, Tian Yun, Nihal V. Nayak et al.

COLM 2025 · paper · arXiv:2410.23261
6 citations

A CLIP-Powered Framework for Robust and Generalizable Data Selection

Suorong Yang, Peng Ye, Wanli Ouyang et al.

ICLR 2025 · arXiv:2410.11215
17 citations

Angles Don't Lie: Unlocking Training-Efficient RL Through the Model's Own Signals

Qinsi Wang, Jinghan Ke, Hancheng Ye et al.

NEURIPS 2025 · spotlight

BAME: Block-Aware Mask Evolution for Efficient N:M Sparse Training

Chenyi Yang, Wenjie Nie, Yuxin Zhang et al.

ICML 2025

Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents

Han Lin, Jaemin Cho, Amir Zadeh et al.

NEURIPS 2025 · arXiv:2508.05954
6 citations

Cut Your Losses in Large-Vocabulary Language Models

Erik Wijmans, Brody Huval, Alexander Hertzberg et al.

ICLR 2025 · arXiv:2411.09009
20 citations

DataRater: Meta-Learned Dataset Curation

Dan Andrei Calian, Greg Farquhar, Iurii Kemaev et al.

NEURIPS 2025 · arXiv:2505.17895
7 citations

Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization

Taishi Nakamura, Takuya Akiba, Kazuki Fujii et al.

ICLR 2025 · arXiv:2502.19261
9 citations

Efficient Representativeness-Aware Coreset Selection

Zihao Cheng, Binrui Wu, Zhiwei Li et al.

NEURIPS 2025

Faster and Better 3D Splatting via Group Training

Chengbo Wang, Guozheng Ma, Yizhen Lao et al.

ICCV 2025 · arXiv:2412.07608
3 citations

Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset

Yiqin Yang, Quanwei Wang, Chenghao Li et al.

ICLR 2025 · arXiv:2502.18955

Improved Noise Schedule for Diffusion Training

Tiankai Hang, Shuyang Gu, Jianmin Bao et al.

ICCV 2025 · arXiv:2407.03297
33 citations

Improving Transformer Based Line Segment Detection with Matched Predicting and Re-ranking

Xin Tong, Shi Peng, Baojie Tian et al.

AAAI 2025 · paper · arXiv:2502.17766
2 citations

Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better

Enshu Liu, Junyi Zhu, Zinan Lin et al.

ICLR 2025 · arXiv:2404.02241
6 citations

Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow

Fu-Yun Wang, Ling Yang, Zhaoyang Huang et al.

ICLR 2025 · arXiv:2410.07303
50 citations

Reinforcement Learning-Guided Data Selection via Redundancy Assessment

Suorong Yang, Peijia Li, Furao Shen et al.

ICCV 2025 · arXiv:2506.21037
1 citation

Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

Sihyun Yu, Sangkyung Kwak, Huiwon Jang et al.

ICLR 2025 · arXiv:2410.06940
342 citations

Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think

Ge Wu, Shen Zhang, Ruijing Shi et al.

NEURIPS 2025 · oral · arXiv:2507.01467
33 citations

Scale Efficient Training for Large Datasets

Qing Zhou, Junyu Gao, Qi Wang

CVPR 2025 · arXiv:2503.13385
3 citations

SkyLadder: Better and Faster Pretraining via Context Window Scheduling

Tongyao Zhu, Qian Liu, Haonan Wang et al.

NEURIPS 2025 · arXiv:2503.15450
3 citations

Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs

Shane Bergsma, Nolan Dey, Gurpreet Gosal et al.

ICLR 2025 · arXiv:2502.15938
24 citations

TREAD: Token Routing for Efficient Architecture-agnostic Diffusion Training

Felix Krause, Timy Phan, Ming Gui et al.

ICCV 2025 · arXiv:2501.04765
13 citations

T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning

Yanjun Fu, Faisal Hamman, Sanghamitra Dutta

NEURIPS 2025 · arXiv:2506.01317
7 citations

VarDrop: Enhancing Training Efficiency by Reducing Variate Redundancy in Periodic Time Series Forecasting

Junhyeok Kang, Yooju Shin, Jae-Gil Lee

AAAI 2025 · paper · arXiv:2501.14183
3 citations

Bucketed Ranking-based Losses for Efficient Training of Object Detectors

Feyza Yavuz, Baris Can Cam, Adnan Harun Dogan et al.

ECCV 2024 · arXiv:2407.14204
5 citations

BWS: Best Window Selection Based on Sample Scores for Data Pruning across Broad Ranges

Hoyong Choi, Nohyun Ki, Hye Won Chung

ICML 2024 · arXiv:2406.03057
9 citations

Diversified Batch Selection for Training Acceleration

Feng Hong, Yueming LYU, Jiangchao Yao et al.

ICML 2024 · arXiv:2406.04872
11 citations

EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism

Yanxi Chen, Xuchen Pan, Yaliang Li et al.

ICML 2024 · arXiv:2312.04916
60 citations

Ranking-based Client Imitation Selection for Efficient Federated Learning

Chunlin Tian, Zhan Shi, Xinpeng Qin et al.

ICML 2024

Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency

Vithursan Thangarasa, Shreyas Saxena, Abhay Gupta et al.

ICML 2024 · arXiv:2303.11525
8 citations