"training efficiency" Papers

30 papers found

$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources

Apoorv Khandelwal, Tian Yun, Nihal V. Nayak et al.

COLM 2025 · paper · arXiv:2410.23261
6 citations

A CLIP-Powered Framework for Robust and Generalizable Data Selection

Suorong Yang, Peng Ye, Wanli Ouyang et al.

ICLR 2025 · arXiv:2410.11215
17 citations

Angles Don't Lie: Unlocking Training-Efficient RL Through the Model's Own Signals

Qinsi Wang, Jinghan Ke, Hancheng Ye et al.

NEURIPS 2025 · spotlight

BAME: Block-Aware Mask Evolution for Efficient N:M Sparse Training

Chenyi Yang, Wenjie Nie, Yuxin Zhang et al.

ICML 2025

Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents

Han Lin, Jaemin Cho, Amir Zadeh et al.

NEURIPS 2025 · arXiv:2508.05954
6 citations

Cut Your Losses in Large-Vocabulary Language Models

Erik Wijmans, Brody Huval, Alexander Hertzberg et al.

ICLR 2025 · arXiv:2411.09009
20 citations

DataRater: Meta-Learned Dataset Curation

Dan Andrei Calian, Greg Farquhar, Iurii Kemaev et al.

NEURIPS 2025 · arXiv:2505.17895
7 citations

Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization

Taishi Nakamura, Takuya Akiba, Kazuki Fujii et al.

ICLR 2025 · arXiv:2502.19261
9 citations

Efficient Representativeness-Aware Coreset Selection

Zihao Cheng, Binrui Wu, Zhiwei Li et al.

NEURIPS 2025

Faster and Better 3D Splatting via Group Training

Chengbo Wang, Guozheng Ma, Yizhen Lao et al.

ICCV 2025 · arXiv:2412.07608
3 citations

Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset

Yiqin Yang, Quanwei Wang, Chenghao Li et al.

ICLR 2025 · arXiv:2502.18955

Improved Noise Schedule for Diffusion Training

Tiankai Hang, Shuyang Gu, Jianmin Bao et al.

ICCV 2025 · arXiv:2407.03297
33 citations

Improving Transformer Based Line Segment Detection with Matched Predicting and Re-ranking

Xin Tong, Shi Peng, Baojie Tian et al.

AAAI 2025 · paper · arXiv:2502.17766
2 citations

Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better

Enshu Liu, Junyi Zhu, Zinan Lin et al.

ICLR 2025 · arXiv:2404.02241
6 citations

Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow

Fu-Yun Wang, Ling Yang, Zhaoyang Huang et al.

ICLR 2025 · arXiv:2410.07303
50 citations

Reinforcement Learning-Guided Data Selection via Redundancy Assessment

Suorong Yang, Peijia Li, Furao Shen et al.

ICCV 2025 · arXiv:2506.21037
1 citation

Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

Sihyun Yu, Sangkyung Kwak, Huiwon Jang et al.

ICLR 2025 · arXiv:2410.06940
342 citations

Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think

Ge Wu, Shen Zhang, Ruijing Shi et al.

NEURIPS 2025 · oral · arXiv:2507.01467
33 citations

Scale Efficient Training for Large Datasets

Qing Zhou, Junyu Gao, Qi Wang

CVPR 2025 · arXiv:2503.13385
3 citations

SkyLadder: Better and Faster Pretraining via Context Window Scheduling

Tongyao Zhu, Qian Liu, Haonan Wang et al.

NEURIPS 2025 · arXiv:2503.15450
3 citations

Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs

Shane Bergsma, Nolan Dey, Gurpreet Gosal et al.

ICLR 2025 · arXiv:2502.15938
24 citations

TREAD: Token Routing for Efficient Architecture-agnostic Diffusion Training

Felix Krause, Timy Phan, Ming Gui et al.

ICCV 2025 · arXiv:2501.04765
13 citations

T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning

Yanjun Fu, Faisal Hamman, Sanghamitra Dutta

NEURIPS 2025 · arXiv:2506.01317
7 citations

VarDrop: Enhancing Training Efficiency by Reducing Variate Redundancy in Periodic Time Series Forecasting

Junhyeok Kang, Yooju Shin, Jae-Gil Lee

AAAI 2025 · paper · arXiv:2501.14183
3 citations

Bucketed Ranking-based Losses for Efficient Training of Object Detectors

Feyza Yavuz, Baris Can Cam, Adnan Harun Dogan et al.

ECCV 2024 · arXiv:2407.14204
5 citations

BWS: Best Window Selection Based on Sample Scores for Data Pruning across Broad Ranges

Hoyong Choi, Nohyun Ki, Hye Won Chung

ICML 2024 · arXiv:2406.03057
9 citations

Diversified Batch Selection for Training Acceleration

Feng Hong, Yueming LYU, Jiangchao Yao et al.

ICML 2024 · arXiv:2406.04872
11 citations

EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism

Yanxi Chen, Xuchen Pan, Yaliang Li et al.

ICML 2024 · arXiv:2312.04916
60 citations

Ranking-based Client Imitation Selection for Efficient Federated Learning

Chunlin Tian, Zhan Shi, Xinpeng Qin et al.

ICML 2024

Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency

Vithursan Thangarasa, Shreyas Saxena, Abhay Gupta et al.

ICML 2024 · arXiv:2303.11525
8 citations