Papers matching "low-precision inference"
2 papers found
Conference
Pushing the Limits of BFP on Narrow Precision LLM Inference
Hui Wang, Yuan Cheng, Xiaomeng Han et al.
AAAI 2025, arXiv:2502.00026
1 citation
OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models
Changhun Lee, Jungyu Jin, Taesu Kim et al.
AAAI 2024, arXiv:2306.02272
105 citations