Poster "model compression" Papers

104 papers found • Page 1 of 3

Active Data Curation Effectively Distills Large-Scale Multimodal Models

Vishaal Udandarao, Nikhil Parthasarathy, Muhammad Ferjad Naeem et al.

CVPR 2025 • arXiv:2411.18674 • 15 citations

AugKD: Ingenious Augmentations Empower Knowledge Distillation for Image Super-Resolution

Yun Zhang, Wei Li, Simiao Li et al.

ICLR 2025 • 3 citations

Balanced Token Pruning: Accelerating Vision Language Models Beyond Local Optimization

Kaiyuan Li, Xiaoyue Chen, Chen Gao et al.

NeurIPS 2025 • arXiv:2505.22038 • 4 citations

Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression

Jingcun Wang, Yu-Guang Chen, Ing-Chao Lin et al.

ICLR 2025 • arXiv:2410.03765 • 21 citations

Bi-Directional Communication-Efficient Stochastic FL via Remote Source Generation

Maximilian Egger, Rawad Bitar, Antonia Wachter-Zeh et al.

NeurIPS 2025

BillBoard Splatting (BBSplat): Learnable Textured Primitives for Novel View Synthesis

David Svitov, Pietro Morerio, Lourdes Agapito et al.

ICCV 2025 • arXiv:2411.08508 • 15 citations

Composable Interventions for Language Models

Arinbjörn Kolbeinsson, Kyle O'Brien, Tianjin Huang et al.

ICLR 2025 • arXiv:2407.06483 • 5 citations

Computation and Memory-Efficient Model Compression with Gradient Reweighting

Zhiwei Li, Yuesen Liao, Binrui Wu et al.

NeurIPS 2025

DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models

Yongqi Huang, Peng Ye, Chenyu Huang et al.

CVPR 2025 • arXiv:2503.01359 • 6 citations

DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture

Qianlong Xiang, Miao Zhang, Yuzhang Shang et al.

CVPR 2025 • arXiv:2409.03550 • 19 citations

DLP: Dynamic Layerwise Pruning in Large Language Models

Yuli Chen, Bo Cheng, Jiale Han et al.

ICML 2025 • arXiv:2505.23807 • 2 citations

DuoGPT: Training-free Dual Sparsity through Activation-aware Pruning in LLMs

Ruokai Yin, Yuhang Li, Donghyun Lee et al.

NeurIPS 2025 • arXiv:2506.20194 • 2 citations

EdgeTAM: On-Device Track Anything Model

Chong Zhou, Chenchen Zhu, Yunyang Xiong et al.

CVPR 2025 • arXiv:2501.07256 • 9 citations

Effective Interplay between Sparsity and Quantization: From Theory to Practice

Simla Harma, Ayan Chakraborty, Elizaveta Kostenok et al.

ICLR 2025 • arXiv:2405.20935 • 19 citations

EMLoC: Emulator-based Memory-efficient Fine-tuning with LoRA Correction

Hsi-Che Lin, Yu-Chu Yu, Kai-Po Chang et al.

NeurIPS 2025 • arXiv:2506.12015

Fast Feedforward 3D Gaussian Splatting Compression

Yihang Chen, Qianyi Wu, Mengyao Li et al.

ICLR 2025 • arXiv:2410.08017 • 28 citations

FedWSQ: Efficient Federated Learning with Weight Standardization and Distribution-Aware Non-Uniform Quantization

Seung-Wook Kim, Seongyeol Kim, Jiah Kim et al.

ICCV 2025 • arXiv:2506.23516

FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting

Hengyu Liu, Yuehao Wang, Chenxin Li et al.

CVPR 2025 • arXiv:2506.04174 • 2 citations

Global Minimizers of $\ell^p$-Regularized Objectives Yield the Sparsest ReLU Neural Networks

Julia Nakhleh, Robert Nowak

NeurIPS 2025 • arXiv:2505.21791

Hankel Singular Value Regularization for Highly Compressible State Space Models

Paul Schwerdtner, Jules Berman, Benjamin Peherstorfer

NeurIPS 2025 • arXiv:2510.22951 • 2 citations

HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models

Seanie Lee, Haebin Seong, Dong Bok Lee et al.

ICLR 2025 • arXiv:2410.01524 • 15 citations

Indirect Gradient Matching for Adversarial Robust Distillation

Hongsin Lee, Seungju Cho, Changick Kim

ICLR 2025 • arXiv:2312.03286 • 3 citations

Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search

Yuxian Gu, Qinghao Hu, Haocheng Xi et al.

NeurIPS 2025 • arXiv:2508.15884 • 16 citations

Knowledge Distillation with Multi-granularity Mixture of Priors for Image Super-Resolution

Simiao Li, Yun Zhang, Wei Li et al.

ICLR 2025 • arXiv:2404.02573 • 4 citations

Layer as Puzzle Pieces: Compressing Large Language Models through Layer Concatenation

Fei Wang, Li Shen, Liang Ding et al.

NeurIPS 2025 • arXiv:2510.15304 • 2 citations

Layer-wise Update Aggregation with Recycling for Communication-Efficient Federated Learning

Jisoo Kim, Sungmin Kang, Sunwoo Lee

NeurIPS 2025 • arXiv:2503.11146 • 1 citation

LLaMaFlex: Many-in-one LLMs via Generalized Pruning and Weight Sharing

Ruisi Cai, Saurav Muralidharan, Hongxu Yin et al.

ICLR 2025 • 4 citations

LLaVA-KD: A Framework of Distilling Multimodal Large Language Models

Yuxuan Cai, Jiangning Zhang, Haoyang He et al.

ICCV 2025 • arXiv:2410.16236 • 27 citations

MDP: Multidimensional Vision Model Pruning with Latency Constraint

Xinglong Sun, Barath Lakshmanan, Maying Shen et al.

CVPR 2025 • arXiv:2504.02168 • 2 citations

Mixture Compressor for Mixture-of-Experts LLMs Gains More

Wei Huang, Yue Liao, Jianhui Liu et al.

ICLR 2025 • arXiv:2410.06270 • 24 citations

Model Shapley: Find Your Ideal Parameter Player via One Gradient Backpropagation

Chu Xu, Xinke Jiang, Rihong Qiu et al.

NeurIPS 2025

MosaicDiff: Training-free Structural Pruning for Diffusion Model Acceleration Reflecting Pretraining Dynamics

Bowei Guo, Shengkun Tang, Cong Zeng et al.

ICCV 2025 • arXiv:2510.11962 • 2 citations

One-Shot Knowledge Transfer for Scalable Person Re-Identification

Longhua Li, Lei Qi, Xin Geng

ICCV 2025 • arXiv:2511.06016 • 1 citation

On LLM Knowledge Distillation - A Comparison between Forward KL and Reverse KL

Yihan Cao, Yanbin Kang

ICLR 2025

Optimal Brain Apoptosis

Mingyuan Sun, Zheng Fang, Jiaxu Wang et al.

ICLR 2025 • arXiv:2502.17941 • 3 citations

Optimization with Access to Auxiliary Information

El Mahdi Chayti, Sai Karimireddy

ICLR 2025 • arXiv:2206.00395 • 14 citations

PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms

Yilong Li, Jingyu Liu, Hao Zhang et al.

ICLR 2025 • arXiv:2410.05315 • 7 citations

PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution

Libo Zhu, Jianze Li, Haotong Qin et al.

CVPR 2025 • arXiv:2411.17106 • 10 citations

PLD: A Choice-Theoretic List-Wise Knowledge Distillation

Ejafa Bassam, Dawei Zhu, Kaigui Bian

NeurIPS 2025 • arXiv:2506.12542

PUP 3D-GS: Principled Uncertainty Pruning for 3D Gaussian Splatting

Alex Hanson, Allen Tu, Vasu Singla et al.

CVPR 2025 • arXiv:2406.10219 • 40 citations

Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization

Yamato Arai, Yuma Ichikawa

NeurIPS 2025 • arXiv:2504.09629 • 11 citations

Quantization without Tears

Minghao Fu, Hao Yu, Jie Shao et al.

CVPR 2025 • arXiv:2411.13918 • 16 citations

Quantized Spike-driven Transformer

Xuerui Qiu, Malu Zhang, Jieyuan Zhang et al.

ICLR 2025 • arXiv:2501.13492 • 14 citations

RSAVQ: Riemannian Sensitivity-Aware Vector Quantization for Large Language Models

Zukang Xu, Xing Hu, Qiang Wu et al.

NeurIPS 2025 • arXiv:2510.01240

S$^2$NN: Sub-bit Spiking Neural Networks

Wenjie Wei, Malu Zhang, Jieyuan (Eric) Zhang et al.

NeurIPS 2025 • arXiv:2509.24266

SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators

Rasoul Shafipour, David Harrison, Maxwell Horton et al.

ICLR 2025 • arXiv:2410.10714 • 2 citations

SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking

Xingrun Xing, Boyan Gao, Zheng Liu et al.

ICLR 2025 • arXiv:2407.04752 • 23 citations

STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs

Peijie Dong, Lujun Li, Yuedong Zhong et al.

ICLR 2025 • arXiv:2408.01803 • 32 citations

Streamlining Redundant Layers to Compress Large Language Models

Xiaodong Chen, Yuxuan Hu, Jing Zhang et al.

ICLR 2025 • arXiv:2403.19135 • 19 citations

Systematic Outliers in Large Language Models

Yongqi An, Xu Zhao, Tao Yu et al.

ICLR 2025 • arXiv:2502.06415 • 19 citations