"model compression" Papers

128 papers found • Page 3 of 3

Fluctuation-Based Adaptive Structured Pruning for Large Language Models

Yongqi An, Xu Zhao, Tao Yu et al.

AAAI 2024 • arXiv:2312.11983
106 citations

FrameQuant: Flexible Low-Bit Quantization for Transformers

Harshavardhan Adepu, Zhanpeng Zeng, Li Zhang et al.

ICML 2024 • arXiv:2403.06082
16 citations

Generative Model-Based Feature Knowledge Distillation for Action Recognition

Guiqin Wang, Peng Zhao, Yanjiang Shi et al.

AAAI 2024 • arXiv:2312.08644
7 citations

Good Teachers Explain: Explanation-Enhanced Knowledge Distillation

Amin Parchami, Moritz Böhle, Sukrut Rao et al.

ECCV 2024 • arXiv:2402.03119
19 citations

How Far Can We Compress Instant-NGP-Based NeRF?

Yihang Chen, Qianyi Wu, Mehrtash Harandi et al.

CVPR 2024 • arXiv:2406.04101
34 citations

Instance-Aware Group Quantization for Vision Transformers

Jaehyeon Moon, Dohyung Kim, Jun Yong Cheon et al.

CVPR 2024 • arXiv:2404.00928
15 citations

Junk DNA Hypothesis: Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs

Lu Yin, Ajay Jaiswal, Shiwei Liu et al.

ICML 2024

KernelWarehouse: Rethinking the Design of Dynamic Convolution

Chao Li, Anbang Yao

ICML 2024 • arXiv:2406.07879
9 citations

Lightweight Image Super-Resolution via Flexible Meta Pruning

Yulun Zhang, Kai Zhang, Luc Van Gool et al.

ICML 2024

Localizing Task Information for Improved Model Merging and Compression

Ke Wang, Nikolaos Dimitriadis, Guillermo Ortiz-Jimenez et al.

ICML 2024 • arXiv:2405.07813
92 citations

LORS: Low-rank Residual Structure for Parameter-Efficient Network Stacking

Jialin Li, Qiang Nie, Weifu Fu et al.

CVPR 2024 (highlight) • arXiv:2403.04303
7 citations

MoEAD: A Parameter-efficient Model for Multi-class Anomaly Detection

Shiyuan Meng, Wenchao Meng, Qihang Zhou et al.

ECCV 2024
14 citations

Neural Metamorphosis

Xingyi Yang, Xinchao Wang

ECCV 2024 • arXiv:2410.11878
5 citations

On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving

Kaituo Feng, Changsheng Li, Dongchun Ren et al.

CVPR 2024 • arXiv:2403.01238
15 citations

OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models

Changhun Lee, Jungyu Jin, Taesu Kim et al.

AAAI 2024 • arXiv:2306.02272
105 citations

PELA: Learning Parameter-Efficient Models with Low-Rank Approximation

Yangyang Guo, Guangzhi Wang, Mohan Kankanhalli

CVPR 2024 • arXiv:2310.10700
10 citations

Progressive Distillation Based on Masked Generation Feature Method for Knowledge Graph Completion

Cunhang Fan, Yujie Chen, Jun Xue et al.

AAAI 2024 • arXiv:2401.12997
5 citations

Pruner-Zero: Evolving Symbolic Pruning Metric From Scratch for Large Language Models

Peijie Dong, Lujun Li, Zhenheng Tang et al.

ICML 2024 • arXiv:2406.02924
54 citations

PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation

Yizhe Xiong, Hui Chen, Tianxiang Hao et al.

ECCV 2024 • arXiv:2403.09192
26 citations

Rethinking Optimization and Architecture for Tiny Language Models

Yehui Tang, Kai Han, Fangcheng Liu et al.

ICML 2024

Reweighted Solutions for Weighted Low Rank Approximation

David Woodruff, Taisuke Yasuda

ICML 2024 • arXiv:2406.02431
2 citations

SAGS: Structure-Aware 3D Gaussian Splatting

Evangelos Ververas, Rolandos Alexandros Potamias, Song Jifei et al.

ECCV 2024 • arXiv:2404.19149
21 citations

SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks

Jiwon Song, Kyungseok Oh, Taesu Kim et al.

ICML 2024 • arXiv:2402.09025
73 citations

SNP: Structured Neuron-level Pruning to Preserve Attention Scores

Kyunghwan Shim, Jaewoong Yun, Shinkook Choi

ECCV 2024 • arXiv:2404.11630
3 citations

Soft Prompt Recovers Compressed LLMs, Transferably

Zhaozhuo Xu, Zirui Liu, Beidi Chen et al.

ICML 2024

Towards efficient deep spiking neural networks construction with spiking activity based pruning

Yaxin Li, Qi Xu, Jiangrong Shen et al.

ICML 2024 • arXiv:2406.01072
15 citations

Transferable and Principled Efficiency for Open-Vocabulary Segmentation

Jingxuan Xu, Wuyang Chen, Yao Zhao et al.

CVPR 2024 • arXiv:2404.07448
1 citation

Transferring Knowledge From Large Foundation Models to Small Downstream Models

Shikai Qiu, Boran Han, Danielle Robinson et al.

ICML 2024 • arXiv:2406.07337
8 citations