"model compression" Papers

128 papers found • Page 1 of 3

Active Data Curation Effectively Distills Large-Scale Multimodal Models

Vishaal Udandarao, Nikhil Parthasarathy, Muhammad Ferjad Naeem et al.

CVPR 2025 • arXiv:2411.18674
15 citations

AugKD: Ingenious Augmentations Empower Knowledge Distillation for Image Super-Resolution

Yun Zhang, Wei Li, Simiao Li et al.

ICLR 2025
3 citations

Balanced Token Pruning: Accelerating Vision Language Models Beyond Local Optimization

Kaiyuan Li, Xiaoyue Chen, Chen Gao et al.

NEURIPS 2025 • arXiv:2505.22038
4 citations

Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression

Jingcun Wang, Yu-Guang Chen, Ing-Chao Lin et al.

ICLR 2025 • arXiv:2410.03765
21 citations

Bi-Directional Communication-Efficient Stochastic FL via Remote Source Generation

Maximilian Egger, Rawad Bitar, Antonia Wachter-Zeh et al.

NEURIPS 2025

BillBoard Splatting (BBSplat): Learnable Textured Primitives for Novel View Synthesis

David Svitov, Pietro Morerio, Lourdes Agapito et al.

ICCV 2025 • arXiv:2411.08508
15 citations

CASP: Compression of Large Multimodal Models Based on Attention Sparsity

Mohsen Gholami, Mohammad Akbari, Kevin Cannons et al.

CVPR 2025 (highlight) • arXiv:2503.05936
4 citations

Composable Interventions for Language Models

Arinbjörn Kolbeinsson, Kyle O'Brien, Tianjin Huang et al.

ICLR 2025 • arXiv:2407.06483
5 citations

Computation and Memory-Efficient Model Compression with Gradient Reweighting

Zhiwei Li, Yuesen Liao, Binrui Wu et al.

NEURIPS 2025

DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models

Yongqi Huang, Peng Ye, Chenyu Huang et al.

CVPR 2025 • arXiv:2503.01359
6 citations

DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture

Qianlong Xiang, Miao Zhang, Yuzhang Shang et al.

CVPR 2025 • arXiv:2409.03550
19 citations

DLP: Dynamic Layerwise Pruning in Large Language Models

Yuli Chen, Bo Cheng, Jiale Han et al.

ICML 2025 • arXiv:2505.23807
2 citations

DuoGPT: Training-free Dual Sparsity through Activation-aware Pruning in LLMs

Ruokai Yin, Yuhang Li, Donghyun Lee et al.

NEURIPS 2025 • arXiv:2506.20194
2 citations

Dynamical Low-Rank Compression of Neural Networks with Robustness under Adversarial Attacks

Steffen Schotthöfer, Lexie Yang, Stefan Schnake

NEURIPS 2025 (oral) • arXiv:2505.08022
6 citations

EdgeTAM: On-Device Track Anything Model

Chong Zhou, Chenchen Zhu, Yunyang Xiong et al.

CVPR 2025 • arXiv:2501.07256
9 citations

Effective Interplay between Sparsity and Quantization: From Theory to Practice

Simla Harma, Ayan Chakraborty, Elizaveta Kostenok et al.

ICLR 2025 • arXiv:2405.20935
19 citations

EMLoC: Emulator-based Memory-efficient Fine-tuning with LoRA Correction

Hsi-Che Lin, Yu-Chu Yu, Kai-Po Chang et al.

NEURIPS 2025 • arXiv:2506.12015

Extracting Interpretable Task-Specific Circuits from Large Language Models for Faster Inference

Jorge García-Carrasco, Alejandro Maté, Juan Trujillo

AAAI 2025 • arXiv:2412.15750
3 citations

Fast Feedforward 3D Gaussian Splatting Compression

Yihang Chen, Qianyi Wu, Mengyao Li et al.

ICLR 2025 • arXiv:2410.08017
28 citations

FedWSQ: Efficient Federated Learning with Weight Standardization and Distribution-Aware Non-Uniform Quantization

Seung-Wook Kim, Seongyeol Kim, Jiah Kim et al.

ICCV 2025 • arXiv:2506.23516

FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation

Zhuguanyu Wu, Shihe Wang, Jiayi Zhang et al.

CVPR 2025 (highlight) • arXiv:2506.11543
6 citations

FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting

Hengyu Liu, Yuehao Wang, Chenxin Li et al.

CVPR 2025 • arXiv:2506.04174
2 citations

Free Lunch in the Forest: Functionally-Identical Pruning of Boosted Tree Ensembles

Youssouf Emine, Alexandre Forel, Idriss Malek et al.

AAAI 2025 • arXiv:2408.16167
2 citations

Global Minimizers of $\ell^p$-Regularized Objectives Yield the Sparsest ReLU Neural Networks

Julia Nakhleh, Robert Nowak

NEURIPS 2025 • arXiv:2505.21791

Hankel Singular Value Regularization for Highly Compressible State Space Models

Paul Schwerdtner, Jules Berman, Benjamin Peherstorfer

NEURIPS 2025 • arXiv:2510.22951
2 citations

HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models

Seanie Lee, Haebin Seong, Dong Bok Lee et al.

ICLR 2025 • arXiv:2410.01524
15 citations

Indirect Gradient Matching for Adversarial Robust Distillation

Hongsin Lee, Seungju Cho, Changick Kim

ICLR 2025 • arXiv:2312.03286
3 citations

Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search

Yuxian Gu, Qinghao Hu, Haocheng Xi et al.

NEURIPS 2025 • arXiv:2508.15884
16 citations

Knowledge Distillation with Multi-granularity Mixture of Priors for Image Super-Resolution

Simiao Li, Yun Zhang, Wei Li et al.

ICLR 2025 • arXiv:2404.02573
4 citations

Layer as Puzzle Pieces: Compressing Large Language Models through Layer Concatenation

Fei Wang, Li Shen, Liang Ding et al.

NEURIPS 2025 • arXiv:2510.15304
2 citations

Layer-wise Update Aggregation with Recycling for Communication-Efficient Federated Learning

Jisoo Kim, Sungmin Kang, Sunwoo Lee

NEURIPS 2025 • arXiv:2503.11146
1 citation

LLaMaFlex: Many-in-one LLMs via Generalized Pruning and Weight Sharing

Ruisi Cai, Saurav Muralidharan, Hongxu Yin et al.

ICLR 2025
4 citations

LLaVA-KD: A Framework of Distilling Multimodal Large Language Models

Yuxuan Cai, Jiangning Zhang, Haoyang He et al.

ICCV 2025 • arXiv:2410.16236
27 citations

MDP: Multidimensional Vision Model Pruning with Latency Constraint

Xinglong Sun, Barath Lakshmanan, Maying Shen et al.

CVPR 2025 • arXiv:2504.02168
2 citations

MeRino: Entropy-Driven Design for Generative Language Models on IoT Devices

Youpeng Zhao, Ming Lin, Huadong Tang et al.

AAAI 2025 • arXiv:2403.07921
1 citation

Mixture Compressor for Mixture-of-Experts LLMs Gains More

Wei Huang, Yue Liao, Jianhui Liu et al.

ICLR 2025 • arXiv:2410.06270
24 citations

Model Shapley: Find Your Ideal Parameter Player via One Gradient Backpropagation

Chu Xu, Xinke Jiang, Rihong Qiu et al.

NEURIPS 2025

MosaicDiff: Training-free Structural Pruning for Diffusion Model Acceleration Reflecting Pretraining Dynamics

Bowei Guo, Shengkun Tang, Cong Zeng et al.

ICCV 2025 • arXiv:2510.11962
2 citations

Numerical Pruning for Efficient Autoregressive Models

Xuan Shen, Zhao Song, Yufa Zhou et al.

AAAI 2025 • arXiv:2412.12441
23 citations

One-Shot Knowledge Transfer for Scalable Person Re-Identification

Longhua Li, Lei Qi, Xin Geng

ICCV 2025 • arXiv:2511.06016
1 citation

On LLM Knowledge Distillation - A Comparison between Forward KL and Reverse KL

Yihan Cao, Yanbin Kang

ICLR 2025

Optimal Brain Apoptosis

Mingyuan Sun, Zheng Fang, Jiaxu Wang et al.

ICLR 2025 • arXiv:2502.17941
3 citations

Optimization with Access to Auxiliary Information

El Mahdi Chayti, Sai Karimireddy

ICLR 2025 • arXiv:2206.00395
14 citations

PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms

Yilong Li, Jingyu Liu, Hao Zhang et al.

ICLR 2025 • arXiv:2410.05315
7 citations

PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution

Libo Zhu, Jianze Li, Haotong Qin et al.

CVPR 2025 • arXiv:2411.17106
10 citations

PLD: A Choice-Theoretic List-Wise Knowledge Distillation

Ejafa Bassam, Dawei Zhu, Kaigui Bian

NEURIPS 2025 • arXiv:2506.12542

Pruning Large Language Models with Semi-Structural Adaptive Sparse Training

Weiyu Huang, Yuezhou Hu, Guohao Jian et al.

AAAI 2025 • arXiv:2407.20584
21 citations

PUP 3D-GS: Principled Uncertainty Pruning for 3D Gaussian Splatting

Alex Hanson, Allen Tu, Vasu Singla et al.

CVPR 2025 • arXiv:2406.10219
40 citations

Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization

Yamato Arai, Yuma Ichikawa

NEURIPS 2025 • arXiv:2504.09629
11 citations

Quantization without Tears

Minghao Fu, Hao Yu, Jie Shao et al.

CVPR 2025 • arXiv:2411.13918
16 citations