Poster "model quantization" Papers
23 papers found
BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models
Xingyu Zheng, Xianglong Liu, Haotong Qin et al.
Catastrophic Failure of LLM Unlearning via Quantization
Zhiwei Zhang, Fali Wang, Xiaomin Li et al.
DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models
Hyogon Ryu, NaHyeon Park, Hyunjung Shim
From Attention to Activation: Unraveling the Enigmas of Large Language Models
Prannay Kaul, Chengcheng Ma, Ismail Elezi et al.
GPLQ: A General, Practical, and Lightning QAT Method for Vision Transformers
Guang Liang, Xinyao Liu, Jianxin Wu
Pioneering 4-Bit FP Quantization for Diffusion Models: Mixup-Sign Quantization and Timestep-Aware Fine-Tuning
Maosen Zhao, Pengtao Chen, Chong Yu et al.
Robust Machine Unlearning for Quantized Neural Networks via Adaptive Gradient Reweighting with Similar Labels
Yujia Tong, Yuze Wang, Jingling Yuan et al.
Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers
Yunshan Zhong, Yuyao Zhou, Yuxin Zhang et al.
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Muyang Li, Yujun Lin, Zhekai Zhang et al.
SynQ: Accurate Zero-shot Quantization by Synthesis-aware Fine-tuning
Minjun Kim, Jongjin Kim, U Kang
Your Scale Factors are My Weapon: Targeted Bit-Flip Attacks on Vision Transformers via Scale Factor Manipulation
Jialai Wang, Yuxiao Wu, Weiye Xu et al.
Accurate LoRA-Finetuning Quantization of LLMs via Information Retention
Haotong Qin, Xudong Ma, Xingyu Zheng et al.
BiE: Bi-Exponent Block Floating-Point for Large Language Models Quantization
Lancheng Zou, Wenqian Zhao, Shuo Yin et al.
Data Poisoning Quantization Backdoor Attack
Tran Huynh, Anh Tran, Khoa Doan et al.
Memory-Efficient Fine-Tuning for Quantized Diffusion Model
Hyogon Ryu, Seohyun Lim, Hyunjung Shim
MesonGS: Post-training Compression of 3D Gaussians via Efficient Attribute Transformation
Shuzhao Xie, Weixiang Zhang, Chen Tang et al.
MGit: A Model Versioning and Management System
Wei Hao, Daniel Mendoza, Rafael Mendes et al.
Outlier-aware Slicing for Post-Training Quantization in Vision Transformer
Yuexiao Ma, Huixia Li, Xiawu Zheng et al.
PB-LLM: Partially Binarized Large Language Models
Zhihang Yuan, Yuzhang Shang, Zhen Dong
Purifying Quantization-conditioned Backdoors via Layer-wise Activation Correction with Distribution Approximation
Boheng Li, Yishuo Cai, Jisong Cai et al.
Self-Adapting Large Visual-Language Models to Edge Devices across Visual Modalities
Kaiwen Cai, ZheKai Duan, Gaowen Liu et al.
SFC: Achieve Accurate Fast Convolution under Low-precision Arithmetic
Liulu He, Yufei Zhao, Rui Gao et al.
Test-Time Model Adaptation with Only Forward Passes
Shuaicheng Niu, Chunyan Miao, Guohao Chen et al.