"cross-attention mechanism" Papers

38 papers found

A Conditional Probability Framework for Compositional Zero-shot Learning

Peng Wu, Qiuxia Lai, Hao Fang et al.

ICCV 2025, arXiv:2507.17377
1 citation

BrainMoE: Cognition Joint Embedding via Mixture-of-Expert Towards Robust Brain Foundation Model

Ziquan Wei, Tingting Dan, Tianlong Chen et al.

NeurIPS 2025 (oral)

Causal Disentanglement and Cross-Modal Alignment for Enhanced Few-Shot Learning

Tianjiao Jiang, Zhen Zhang, Yuhang Liu et al.

ICCV 2025, arXiv:2508.03102
1 citation

CLOT: Closed Loop Optimal Transport for Unsupervised Action Segmentation

Elena Bueno-Benito, Mariella Dimiccoli

ICCV 2025, arXiv:2507.03539

Concept Replacer: Replacing Sensitive Concepts in Diffusion Models via Precision Localization

Lingyun Zhang, Yu Xie, Yanwei Fu et al.

CVPR 2025, arXiv:2412.01244
5 citations

ConceptSplit: Decoupled Multi-Concept Personalization of Diffusion Models via Token-wise Adaptation and Attention Disentanglement

Habin Lim, Youngseob Won, Juwon Seo et al.

ICCV 2025, arXiv:2510.04668

CrossMPT: Cross-attention Message-passing Transformer for Error Correcting Codes

Seong-Joon Park, Hee-Youl Kwak, Sang-Hyo Kim et al.

ICLR 2025, arXiv:2405.01033
20 citations

CViT: Continuous Vision Transformer for Operator Learning

Sifan Wang, Jacob Seidman, Shyam Sankaran et al.

ICLR 2025 (oral), arXiv:2405.13998
31 citations

DiffE2E: Rethinking End-to-End Driving with a Hybrid Diffusion-Regression-Classification Policy

Rui Zhao, Yuze Fan, Ziguo Chen et al.

NeurIPS 2025

FedMVP: Federated Multimodal Visual Prompt Tuning for Vision-Language Models

Mainak Singha, Subhankar Roy, Sarthak Mehrotra et al.

ICCV 2025, arXiv:2504.20860
2 citations

Future Link Prediction Without Memory or Aggregation

Lu Yi, Runlin Lei, Fengran Mo et al.

NeurIPS 2025 (oral), arXiv:2505.19408
1 citation

HiLoTs: High-Low Temporal Sensitive Representation Learning for Semi-Supervised LiDAR Segmentation in Autonomous Driving

R.D. Lin, Pengcheng Weng, Yinqiao Wang et al.

CVPR 2025, arXiv:2503.17752
1 citation

Image Referenced Sketch Colorization Based on Animation Creation Workflow

Dingkun Yan, Xinrui Wang, Zhuoru Li et al.

CVPR 2025, arXiv:2502.19937
3 citations

InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences

Chenyang Zhu, Kai Li, Yue Ma et al.

ICLR 2025, arXiv:2412.01197
30 citations

Inverse Image-Based Rendering for Light Field Generation from Single Images

Hyunjun Jung, Hae-Gon Jeon

ICCV 2025 (highlight), arXiv:2510.20132

LLaFEA: Frame-Event Complementary Fusion for Fine-Grained Spatiotemporal Understanding in LMMs

Hanyu Zhou, Gim Hee Lee

ICCV 2025, arXiv:2503.06934
3 citations

LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs

Haoran Lou, Chunxiao Fan, Ziyan Liu et al.

ICCV 2025, arXiv:2507.00505

MoFRR: Mixture of Diffusion Models for Face Retouching Restoration

Jiaxin Liu, Qichao Ying, Zhenxing Qian et al.

ICCV 2025, arXiv:2507.19770

PhySense: Sensor Placement Optimization for Accurate Physics Sensing

Yuezhou Ma, Haixu Wu, Hang Zhou et al.

NeurIPS 2025 (oral), arXiv:2505.18190
2 citations

Progressive Compositionality in Text-to-Image Generative Models

Xu Han, Linghao Jin, Xiaofeng Liu et al.

ICLR 2025, arXiv:2410.16719
9 citations

RadZero: Similarity-Based Cross-Attention for Explainable Vision-Language Alignment in Chest X-ray with Zero-Shot Multi-Task Capability

Jonggwon Park, Byungmu Yoon, Soobum Kim et al.

NeurIPS 2025, arXiv:2504.07416
1 citation

RC-AutoCalib: An End-to-End Radar-Camera Automatic Calibration Network

Van-Tin Luu, Yong-Lin Cai, Vu-Hoang Tran et al.

CVPR 2025, arXiv:2505.22427
3 citations

ReCon: Region-Controllable Data Augmentation with Rectification and Alignment for Object Detection

Haowei Zhu, Tianxiang Pan, Rui Qin et al.

NeurIPS 2025 (spotlight), arXiv:2510.15783
1 citation

Towards Single-Source Domain Generalized Object Detection via Causal Visual Prompts

Chen Li, Huiying Xu, Changxin Gao et al.

NeurIPS 2025, arXiv:2510.19487

Translation of Text Embedding via Delta Vector to Suppress Strongly Entangled Content in Text-to-Image Diffusion Models

Eunseo Koh, SeungHoo Hong, Tae-Young Kim et al.

ICCV 2025, arXiv:2508.10407

UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation

Lunhao Duan, Shanshan Zhao, Wenjun Yan et al.

CVPR 2025, arXiv:2412.18928
7 citations

What to Preserve and What to Transfer: Faithful, Identity-Preserving Diffusion-based Hairstyle Transfer

Chaeyeon Chung, Sunghyun Park, Jeongho Kim et al.

AAAI 2025, arXiv:2408.16450
4 citations

X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention

XiaoChen Zhao, Hongyi Xu, Guoxian Song et al.

ICLR 2025, arXiv:2507.23143
20 citations

Elegantly Written: Disentangling Writer and Character Styles for Enhancing Online Chinese Handwriting

Yu Liu, Fatimah binti Khalid, Lei Wang et al.

ECCV 2024
7 citations

Image Fusion via Vision-Language Model

Zixiang Zhao, Lilun Deng, Haowen Bai et al.

ICML 2024, arXiv:2402.02235
67 citations

Meta Evidential Transformer for Few-Shot Open-Set Recognition

Hitesh Sapkota, Krishna Neupane, Qi Yu

ICML 2024

NVS-Adapter: Plug-and-Play Novel View Synthesis from a Single Image

Yoonwoo Jeong, Jinwoo Lee, Chiheon Kim et al.

ECCV 2024, arXiv:2312.07315
10 citations

OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers

Han Liang, Jiacheng Bao, Ruichi Zhang et al.

CVPR 2024, arXiv:2312.08985
48 citations

OneRestore: A Universal Restoration Framework for Composite Degradation

Yu Guo, Yuan Gao, Yuxu Lu et al.

ECCV 2024, arXiv:2407.04621
71 citations

Point-supervised Panoptic Segmentation via Estimating Pseudo Labels from Learnable Distance

Jing Li, Junsong Fan, Zhaoxiang Zhang

ECCV 2024
2 citations

SemReg: Semantics Constrained Point Cloud Registration

Sheldon Fung, Xuequan Lu, Dasith de Silva Edirimuni et al.

ECCV 2024
7 citations

Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer

Jiwoo Chung, Sangeek Hyun, Jae-Pil Heo

CVPR 2024 (highlight), arXiv:2312.09008
225 citations

Text-Anchored Score Composition: Tackling Condition Misalignment in Text-to-Image Diffusion Models

Luozhou Wang, Guibao Shen, Wenhang Ge et al.

ECCV 2024, arXiv:2306.14408
5 citations