Poster "cross-attention mechanism" Papers

30 papers found

A Conditional Probability Framework for Compositional Zero-shot Learning

Peng Wu, Qiuxia Lai, Hao Fang et al.

ICCV 2025arXiv:2507.17377
1
citations

Causal Disentanglement and Cross-Modal Alignment for Enhanced Few-Shot Learning

Tianjiao Jiang, Zhen Zhang, Yuhang Liu et al.

ICCV 2025arXiv:2508.03102
1
citations

CLOT: Closed Loop Optimal Transport for Unsupervised Action Segmentation

Elena Bueno-Benito, Mariella Dimiccoli

ICCV 2025arXiv:2507.03539

Concept Replacer: Replacing Sensitive Concepts in Diffusion Models via Precision Localization

lingyun zhang, Yu Xie, Yanwei Fu et al.

CVPR 2025arXiv:2412.01244
5
citations

ConceptSplit: Decoupled Multi-Concept Personalization of Diffusion Models via Token-wise Adaptation and Attention Disentanglement

Habin Lim, Youngseob Won, Juwon Seo et al.

ICCV 2025arXiv:2510.04668

CrossMPT: Cross-attention Message-passing Transformer for Error Correcting Codes

Seong-Joon Park, Hee-Youl Kwak, Sang-Hyo Kim et al.

ICLR 2025arXiv:2405.01033
20
citations

DiffE2E: Rethinking End-to-End Driving with a Hybrid Diffusion-Regression-Classification Policy

Rui Zhao, Yuze Fan, Ziguo Chen et al.

NEURIPS 2025

FedMVP: Federated Multimodal Visual Prompt Tuning for Vision-Language Models

Mainak Singha, Subhankar Roy, Sarthak Mehrotra et al.

ICCV 2025arXiv:2504.20860
2
citations

HiLoTs: High-Low Temporal Sensitive Representation Learning for Semi-Supervised LiDAR Segmentation in Autonomous Driving

R.D. Lin, Pengcheng Weng, Yinqiao Wang et al.

CVPR 2025arXiv:2503.17752
1
citations

Image Referenced Sketch Colorization Based on Animation Creation Workflow

Dingkun Yan, Xinrui Wang, Zhuoru Li et al.

CVPR 2025arXiv:2502.19937
3
citations

InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences

Chenyang Zhu, Kai Li, Yue Ma et al.

ICLR 2025arXiv:2412.01197
30
citations

LLaFEA: Frame-Event Complementary Fusion for Fine-Grained Spatiotemporal Understanding in LMMs

Hanyu Zhou, Gim Hee Lee

ICCV 2025arXiv:2503.06934
3
citations

LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs

Haoran Lou, Chunxiao Fan, Ziyan Liu et al.

ICCV 2025arXiv:2507.00505

MoFRR: Mixture of Diffusion Models for Face Retouching Restoration

Jiaxin Liu, Qichao Ying, Zhenxing Qian et al.

ICCV 2025arXiv:2507.19770

Progressive Compositionality in Text-to-Image Generative Models

Xu Han, Linghao Jin, Xiaofeng Liu et al.

ICLR 2025arXiv:2410.16719
9
citations

RadZero: Similarity-Based Cross-Attention for Explainable Vision-Language Alignment in Chest X-ray with Zero-Shot Multi-Task Capability

Jonggwon Park, Byungmu Yoon, Soobum Kim et al.

NEURIPS 2025arXiv:2504.07416
1
citations

RC-AutoCalib: An End-to-End Radar-Camera Automatic Calibration Network

Van-Tin Luu, Yong-Lin Cai, Vu-Hoang Tran et al.

CVPR 2025arXiv:2505.22427
3
citations

Towards Single-Source Domain Generalized Object Detection via Causal Visual Prompts

Chen Li, Huiying Xu, Changxin Gao et al.

NEURIPS 2025arXiv:2510.19487

Translation of Text Embedding via Delta Vector to Suppress Strongly Entangled Content in Text-to-Image Diffusion Models

Eunseo Koh, SeungHoo Hong, Tae-Young Kim et al.

ICCV 2025arXiv:2508.10407

UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation

Lunhao Duan, Shanshan Zhao, Wenjun Yan et al.

CVPR 2025arXiv:2412.18928
7
citations

X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention

XiaoChen Zhao, Hongyi Xu, Guoxian Song et al.

ICLR 2025arXiv:2507.23143
20
citations

Elegantly Written: Disentangling Writer and Character Styles for Enhancing Online Chinese Handwriting

Yu Liu, Fatimah binti Khalid, Lei Wang et al.

ECCV 2024
7
citations

Image Fusion via Vision-Language Model

Zixiang Zhao, Lilun Deng, Haowen Bai et al.

ICML 2024arXiv:2402.02235
67
citations

Meta Evidential Transformer for Few-Shot Open-Set Recognition

Hitesh Sapkota, Krishna Neupane, Qi Yu

ICML 2024

NVS-Adapter: Plug-and-Play Novel View Synthesis from a Single Image

Yoonwoo Jeong, Jinwoo Lee, Chiheon Kim et al.

ECCV 2024arXiv:2312.07315
10
citations

OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers

Han Liang, Jiacheng Bao, Ruichi Zhang et al.

CVPR 2024arXiv:2312.08985
48
citations

OneRestore: A Universal Restoration Framework for Composite Degradation

Yu Guo, Yuan Gao, Yuxu Lu et al.

ECCV 2024arXiv:2407.04621
71
citations

Point-supervised Panoptic Segmentation via Estimating Pseudo Labels from Learnable Distance

Jing Li, Junsong Fan, Zhaoxiang Zhang

ECCV 2024
2
citations

SemReg: Semantics Constrained Point Cloud Registration

Sheldon Fung, Xuequan Lu, Dasith de Silva Edirimuni et al.

ECCV 2024
7
citations

Text-Anchored Score Composition: Tackling Condition Misalignment in Text-to-Image Diffusion Models

Luozhou Wang, Guibao Shen, Wenhang Ge et al.

ECCV 2024arXiv:2306.14408
5
citations