Occlusion-Embedded Hybrid Transformer for Light Field Super-Resolution
Abstract
Transformer-based networks have set new benchmarks in light field super-resolution (SR), but adapting them to capture both global and local spatial-angular correlations efficiently remains challenging. Moreover, many methods fail to account for geometric details such as occlusions, leading to performance drops. To tackle these issues, we introduce OHT, an occlusion-embedded hybrid Transformer. OHT exploits occlusion maps through an occlusion-embedded mix layer, and it combines the strengths of convolutional networks and Transformers via spatial-angular separable convolution (SASep-Conv) and angular self-attention (ASA). SASep-Conv offers a lightweight alternative to 3D convolution for capturing spatial-angular correlations, while ASA applies 3D self-attention across the angular dimension. Together, these designs allow OHT to capture global angular correlations effectively. Extensive experiments on multiple datasets demonstrate OHT's superior performance.
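To make the factorization concrete, below is a minimal PyTorch sketch of a spatial-angular separable convolution in the spirit the abstract describes: a 2D convolution over each sub-aperture view followed by a 2D convolution over the angular grid at each pixel, replacing a joint (and heavier) 3D convolution. The class name, tensor layout (B, C, A, H, W with A = U x V views), and kernel sizes are our assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn


class SASepConv(nn.Module):
    """Sketch of a spatial-angular separable convolution (assumed design).

    Factorizes a joint spatial-angular convolution over a light field
    tensor of shape (B, C, A, H, W), where A = U * V sub-aperture views,
    into a spatial 2D conv per view followed by an angular 2D conv per pixel.
    """

    def __init__(self, channels: int, ang_res: int):
        super().__init__()
        self.ang_res = ang_res  # angular resolution U (assumed square grid, U = V)
        self.spatial_conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.angular_conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, a, h, w = x.shape
        u = v = self.ang_res
        # Spatial pass: fold the A views into the batch dimension and
        # convolve each view as an ordinary 2D image.
        s = x.permute(0, 2, 1, 3, 4).reshape(b * a, c, h, w)
        s = self.spatial_conv(s).reshape(b, a, c, h, w)
        # Angular pass: fold the H*W pixels into the batch dimension and
        # convolve the U x V grid of views observed at each pixel.
        t = s.permute(0, 3, 4, 2, 1).reshape(b * h * w, c, u, v)
        t = self.angular_conv(t).reshape(b, h, w, c, a)
        return t.permute(0, 3, 4, 1, 2)  # back to (B, C, A, H, W)


# Usage sketch: a 5x5 light field (25 views) with 32 feature channels.
layer = SASepConv(channels=32, ang_res=5)
lf = torch.randn(1, 32, 25, 32, 32)
out = layer(lf)  # shape preserved: (1, 32, 25, 32, 32)
```

An angular self-attention layer in the spirit of ASA would analogously treat the A views at each spatial position as a token sequence and attend across them; we omit that sketch here since the abstract gives no further detail.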