Efficient Attention-Sharing Information Distillation Transformer for Lightweight Single Image Super-Resolution

10 citations

arXiv:2501.15774

#443 of 3028 papers in AAAI 2025

Top Authors

Karam Park, Jae Woong Soh, Nam Ik Cho

Topics

single image super-resolution, attention mechanism, transformer architecture, lightweight networks, information distillation, computational complexity, long-range dependencies, self-attention layers

Abstract

Transformer-based Super-Resolution (SR) methods have demonstrated superior performance compared to convolutional neural network (CNN)-based SR approaches due to their capability to capture long-range dependencies. However, their high computational complexity necessitates the development of lightweight approaches for practical use. To address this challenge, we propose the Attention-Sharing Information Distillation (ASID) network, a lightweight SR network that integrates attention-sharing and an information distillation structure specifically designed for Transformer-based SR methods. We modify the information distillation scheme, originally designed for efficient CNN operations, to reduce the computational load of stacked self-attention layers, effectively addressing the efficiency bottleneck. Additionally, we introduce attention-sharing across blocks to further minimize the computational cost of self-attention operations. By combining these strategies, ASID achieves competitive performance with existing SR methods while requiring only around 300K parameters - significantly fewer than existing CNN-based and Transformer-based SR models. Furthermore, ASID outperforms state-of-the-art SR methods when the number of parameters is matched, demonstrating its efficiency and effectiveness. The code and supplementary material are available on the project page.
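The attention-sharing idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the ASID implementation: it only assumes that "sharing" means computing one softmax attention map and reusing it in later blocks, so those blocks skip the query/key projections and the softmax. The function names (`attention_map`, `shared_attention_stack`), the residual update, and the use of plain Python lists are all assumptions made for illustration, and the information distillation structure is not modeled here.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax_rows(m):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def attention_map(x, wq, wk):
    """Compute the N x N attention matrix softmax(Q K^T / sqrt(d))."""
    q, k = matmul(x, wq), matmul(x, wk)
    d = len(wq[0])                      # projection width
    kt = [list(col) for col in zip(*k)]  # transpose of K
    scores = [[v / math.sqrt(d) for v in row] for row in matmul(q, kt)]
    return softmax_rows(scores)

def shared_attention_stack(x, wq, wk, value_weights):
    """Hypothetical stack of blocks that reuse one attention map.

    The map is computed once from the input tokens; each block then applies
    only its own value projection and a residual add (illustrative assumption,
    not the layout described in the paper).
    """
    attn = attention_map(x, wq, wk)      # computed once, shared by all blocks
    for wv in value_weights:             # each wv must be C x C for the residual
        update = matmul(attn, matmul(x, wv))
        x = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(x, update)]
    return x
```

In this sketch, a stack of B blocks evaluates the N x N softmax once instead of B times, which is the kind of saving the abstract attributes to sharing attention across blocks.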
