"cross-attention mechanism" Papers
38 papers found
Conference
A Conditional Probability Framework for Compositional Zero-shot Learning
Peng Wu, Qiuxia Lai, Hao Fang et al.
BrainMoE: Cognition Joint Embedding via Mixture-of-Expert Towards Robust Brain Foundation Model
Ziquan Wei, Tingting Dan, Tianlong Chen et al.
Causal Disentanglement and Cross-Modal Alignment for Enhanced Few-Shot Learning
Tianjiao Jiang, Zhen Zhang, Yuhang Liu et al.
CLOT: Closed Loop Optimal Transport for Unsupervised Action Segmentation
Elena Bueno-Benito, Mariella Dimiccoli
Concept Replacer: Replacing Sensitive Concepts in Diffusion Models via Precision Localization
Lingyun Zhang, Yu Xie, Yanwei Fu et al.
ConceptSplit: Decoupled Multi-Concept Personalization of Diffusion Models via Token-wise Adaptation and Attention Disentanglement
Habin Lim, Youngseob Won, Juwon Seo et al.
CrossMPT: Cross-attention Message-passing Transformer for Error Correcting Codes
Seong-Joon Park, Hee-Youl Kwak, Sang-Hyo Kim et al.
CViT: Continuous Vision Transformer for Operator Learning
Sifan Wang, Jacob Seidman, Shyam Sankaran et al.
DiffE2E: Rethinking End-to-End Driving with a Hybrid Diffusion-Regression-Classification Policy
Rui Zhao, Yuze Fan, Ziguo Chen et al.
FedMVP: Federated Multimodal Visual Prompt Tuning for Vision-Language Models
Mainak Singha, Subhankar Roy, Sarthak Mehrotra et al.
Future Link Prediction Without Memory or Aggregation
Lu Yi, Runlin Lei, Fengran Mo et al.
HiLoTs: High-Low Temporal Sensitive Representation Learning for Semi-Supervised LiDAR Segmentation in Autonomous Driving
R.D. Lin, Pengcheng Weng, Yinqiao Wang et al.
Image Referenced Sketch Colorization Based on Animation Creation Workflow
Dingkun Yan, Xinrui Wang, Zhuoru Li et al.
InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences
Chenyang Zhu, Kai Li, Yue Ma et al.
Inverse Image-Based Rendering for Light Field Generation from Single Images
Hyunjun Jung, Hae-Gon Jeon
LLaFEA: Frame-Event Complementary Fusion for Fine-Grained Spatiotemporal Understanding in LMMs
Hanyu Zhou, Gim Hee Lee
LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs
Haoran Lou, Chunxiao Fan, Ziyan Liu et al.
MoFRR: Mixture of Diffusion Models for Face Retouching Restoration
Jiaxin Liu, Qichao Ying, Zhenxing Qian et al.
PhySense: Sensor Placement Optimization for Accurate Physics Sensing
Yuezhou Ma, Haixu Wu, Hang Zhou et al.
Progressive Compositionality in Text-to-Image Generative Models
Xu Han, Linghao Jin, Xiaofeng Liu et al.
RadZero: Similarity-Based Cross-Attention for Explainable Vision-Language Alignment in Chest X-ray with Zero-Shot Multi-Task Capability
Jonggwon Park, Byungmu Yoon, Soobum Kim et al.
RC-AutoCalib: An End-to-End Radar-Camera Automatic Calibration Network
Van-Tin Luu, Yong-Lin Cai, Vu-Hoang Tran et al.
ReCon: Region-Controllable Data Augmentation with Rectification and Alignment for Object Detection
Haowei Zhu, Tianxiang Pan, Rui Qin et al.
Towards Single-Source Domain Generalized Object Detection via Causal Visual Prompts
Chen Li, Huiying Xu, Changxin Gao et al.
Translation of Text Embedding via Delta Vector to Suppress Strongly Entangled Content in Text-to-Image Diffusion Models
Eunseo Koh, SeungHoo Hong, Tae-Young Kim et al.
UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation
Lunhao Duan, Shanshan Zhao, Wenjun Yan et al.
What to Preserve and What to Transfer: Faithful, Identity-Preserving Diffusion-based Hairstyle Transfer
Chaeyeon Chung, Sunghyun Park, Jeongho Kim et al.
X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention
XiaoChen Zhao, Hongyi Xu, Guoxian Song et al.
Elegantly Written: Disentangling Writer and Character Styles for Enhancing Online Chinese Handwriting
Yu Liu, Fatimah binti Khalid, Lei Wang et al.
Image Fusion via Vision-Language Model
Zixiang Zhao, Lilun Deng, Haowen Bai et al.
Meta Evidential Transformer for Few-Shot Open-Set Recognition
Hitesh Sapkota, Krishna Neupane, Qi Yu
NVS-Adapter: Plug-and-Play Novel View Synthesis from a Single Image
Yoonwoo Jeong, Jinwoo Lee, Chiheon Kim et al.
OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers
Han Liang, Jiacheng Bao, Ruichi Zhang et al.
OneRestore: A Universal Restoration Framework for Composite Degradation
Yu Guo, Yuan Gao, Yuxu Lu et al.
Point-supervised Panoptic Segmentation via Estimating Pseudo Labels from Learnable Distance
Jing Li, Junsong Fan, Zhaoxiang Zhang
SemReg: Semantics Constrained Point Cloud Registration
Sheldon Fung, Xuequan Lu, Dasith de Silva Edirimuni et al.
Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer
Jiwoo Chung, Sangeek Hyun, Jae-Pil Heo
Text-Anchored Score Composition: Tackling Condition Misalignment in Text-to-Image Diffusion Models
Luozhou Wang, Guibao Shen, Wenhang Ge et al.