"dense prediction tasks" Papers
19 papers found
DAViD: Data-efficient and Accurate Vision Models from Synthetic Data
Fatemeh Saleh, Sadegh Aliakbarian, Charlie Hewitt et al.
DiTASK: Multi-Task Fine-Tuning with Diffeomorphic Transformations
Krishna Sri Ipsit Mantri, Carola-Bibiane Schönlieb, Bruno Ribeiro et al.
Exploring Structural Degradation in Dense Representations for Self-supervised Learning
Siran Dai, Qianqian Xu, Peisong Wen et al.
Learning Yourself: Class-Incremental Semantic Segmentation with Language-Inspired Bootstrapped Disentanglement
Ruitao Wu, Yifan Zhao, Jia Li
Multi-Task Dense Predictions via Unleashing the Power of Diffusion
Yuqi Yang, Peng-Tao Jiang, Qibin Hou et al.
Refining CLIP's Spatial Awareness: A Visual-Centric Perspective
Congpei Qiu, Yanhao Wu, Wei Ke et al.
Test-Time Adaptation of Vision-Language Models for Open-Vocabulary Semantic Segmentation
Mehrdad Noori, David Osowiechi, Gustavo Vargas Hakim et al.
Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction
Yunheng Li, Yuxuan Li, Quan-Sheng Zeng et al.
Denoising Vision Transformers
Jiawei Yang, Katie Luo, Jiefeng Li et al.
Efficient Learning of Event-based Dense Representation using Hierarchical Memories with Adaptive Update
Uday Kamal, Saibal Mukhopadhyay
Event Camera Data Dense Pre-training
Yan Yang, Liyuan Pan, Liu Liu
Exploiting Diffusion Prior for Generalizable Dense Prediction
Hsin-Ying Lee, Hung-Yu Tseng, Hsin-Ying Lee et al.
Improving Visual Recognition with Hyperbolical Visual Hierarchy Mapping
Hyeongjun Kwon, Jinhyun Jang, Jin Kim et al.
Removing Rows and Columns of Tokens in Vision Transformer enables Faster Dense Prediction without Retraining
Diwei Su, Cheng Fei, Jianxu Luo
Semi-supervised Active Learning for Video Action Detection
Ayush Singh, Aayush J Rana, Akash Kumar et al.
SILC: Improving Vision Language Pretraining with Self-Distillation
Muhammad Ferjad Naeem, Yongqin Xian, Xiaohua Zhai et al.
Stitched ViTs are Flexible Vision Backbones
Zizheng Pan, Jing Liu, Haoyu He et al.
Unsqueeze [CLS] Bottleneck to Learn Rich Representations
Qing Su, Shihao Ji
ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions
Chunlong Xia, Xinliang Wang, Feng Lv et al.