Poster "model compression" Papers
104 papers found • Page 1 of 3
Active Data Curation Effectively Distills Large-Scale Multimodal Models
Vishaal Udandarao, Nikhil Parthasarathy, Muhammad Ferjad Naeem et al.
AugKD: Ingenious Augmentations Empower Knowledge Distillation for Image Super-Resolution
Yun Zhang, Wei Li, Simiao Li et al.
Balanced Token Pruning: Accelerating Vision Language Models Beyond Local Optimization
Kaiyuan Li, Xiaoyue Chen, Chen Gao et al.
Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression
Jingcun Wang, Yu-Guang Chen, Ing-Chao Lin et al.
Bi-Directional Communication-Efficient Stochastic FL via Remote Source Generation
Maximilian Egger, Rawad Bitar, Antonia Wachter-Zeh et al.
BillBoard Splatting (BBSplat): Learnable Textured Primitives for Novel View Synthesis
David Svitov, Pietro Morerio, Lourdes Agapito et al.
Composable Interventions for Language Models
Arinbjörn Kolbeinsson, Kyle O'Brien, Tianjin Huang et al.
Computation and Memory-Efficient Model Compression with Gradient Reweighting
Zhiwei Li, Yuesen Liao, Binrui Wu et al.
DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models
Yongqi Huang, Peng Ye, Chenyu Huang et al.
DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture
Qianlong Xiang, Miao Zhang, Yuzhang Shang et al.
DLP: Dynamic Layerwise Pruning in Large Language Models
Yuli Chen, Bo Cheng, Jiale Han et al.
DuoGPT: Training-free Dual Sparsity through Activation-aware Pruning in LLMs
Ruokai Yin, Yuhang Li, Donghyun Lee et al.
EdgeTAM: On-Device Track Anything Model
Chong Zhou, Chenchen Zhu, Yunyang Xiong et al.
Effective Interplay between Sparsity and Quantization: From Theory to Practice
Simla Harma, Ayan Chakraborty, Elizaveta Kostenok et al.
EMLoC: Emulator-based Memory-efficient Fine-tuning with LoRA Correction
Hsi-Che Lin, Yu-Chu Yu, Kai-Po Chang et al.
Fast Feedforward 3D Gaussian Splatting Compression
Yihang Chen, Qianyi Wu, Mengyao Li et al.
FedWSQ: Efficient Federated Learning with Weight Standardization and Distribution-Aware Non-Uniform Quantization
Seung-Wook Kim, Seongyeol Kim, Jiah Kim et al.
FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting
Hengyu Liu, Yuehao Wang, Chenxin Li et al.
Global Minimizers of $\ell^p$-Regularized Objectives Yield the Sparsest ReLU Neural Networks
Julia Nakhleh, Robert Nowak
Hankel Singular Value Regularization for Highly Compressible State Space Models
Paul Schwerdtner, Jules Berman, Benjamin Peherstorfer
HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models
Seanie Lee, Haebin Seong, Dong Bok Lee et al.
Indirect Gradient Matching for Adversarial Robust Distillation
Hongsin Lee, Seungju Cho, Changick Kim
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
Yuxian Gu, Qinghao Hu, Haocheng Xi et al.
Knowledge Distillation with Multi-granularity Mixture of Priors for Image Super-Resolution
Simiao Li, Yun Zhang, Wei Li et al.
Layer as Puzzle Pieces: Compressing Large Language Models through Layer Concatenation
Fei Wang, Li Shen, Liang Ding et al.
Layer-wise Update Aggregation with Recycling for Communication-Efficient Federated Learning
Jisoo Kim, Sungmin Kang, Sunwoo Lee
LLaMaFlex: Many-in-one LLMs via Generalized Pruning and Weight Sharing
Ruisi Cai, Saurav Muralidharan, Hongxu Yin et al.
LLaVA-KD: A Framework of Distilling Multimodal Large Language Models
Yuxuan Cai, Jiangning Zhang, Haoyang He et al.
MDP: Multidimensional Vision Model Pruning with Latency Constraint
Xinglong Sun, Barath Lakshmanan, Maying Shen et al.
Mixture Compressor for Mixture-of-Experts LLMs Gains More
Wei Huang, Yue Liao, Jianhui Liu et al.
Model Shapley: Find Your Ideal Parameter Player via One Gradient Backpropagation
Chu Xu, Xinke Jiang, Rihong Qiu et al.
MosaicDiff: Training-free Structural Pruning for Diffusion Model Acceleration Reflecting Pretraining Dynamics
Bowei Guo, Shengkun Tang, Cong Zeng et al.
One-Shot Knowledge Transfer for Scalable Person Re-Identification
Longhua Li, Lei Qi, Xin Geng
On LLM Knowledge Distillation - A Comparison between Forward KL and Reverse KL
Yihan Cao, Yanbin Kang
Optimal Brain Apoptosis
Mingyuan Sun, Zheng Fang, Jiaxu Wang et al.
Optimization with Access to Auxiliary Information
El Mahdi Chayti, Sai Karimireddy
PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms
Yilong Li, Jingyu Liu, Hao Zhang et al.
PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution
Libo Zhu, Jianze Li, Haotong Qin et al.
PLD: A Choice-Theoretic List-Wise Knowledge Distillation
Ejafa Bassam, Dawei Zhu, Kaigui Bian
PUP 3D-GS: Principled Uncertainty Pruning for 3D Gaussian Splatting
Alex Hanson, Allen Tu, Vasu Singla et al.
Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization
Yamato Arai, Yuma Ichikawa
Quantization without Tears
Minghao Fu, Hao Yu, Jie Shao et al.
Quantized Spike-driven Transformer
Xuerui Qiu, Malu Zhang, Jieyuan Zhang et al.
RSAVQ: Riemannian Sensitivity-Aware Vector Quantization for Large Language Models
Zukang Xu, Xing Hu, Qiang Wu et al.
S$^2$NN: Sub-bit Spiking Neural Networks
Wenjie Wei, Malu Zhang, Jieyuan (Eric) Zhang et al.
SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators
Rasoul Shafipour, David Harrison, Maxwell Horton et al.
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
Xingrun Xing, Boyan Gao, Zheng Liu et al.
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
Peijie Dong, Lujun Li, Yuedong Zhong et al.
Streamlining Redundant Layers to Compress Large Language Models
Xiaodong Chen, Yuxuan Hu, Jing Zhang et al.
Systematic Outliers in Large Language Models
Yongqi An, Xu Zhao, Tao Yu et al.