"transformer architecture" Papers
335 papers found • Page 5 of 7
What Makes a Good Diffusion Planner for Decision Making?
Haofei Lu, Dongqi Han, Yifei Shen et al.
Why In-Context Learning Models are Good Few-Shot Learners?
Shiguang Wu, Yaqing Wang, Quanming Yao
ZETA: Leveraging $Z$-order Curves for Efficient Top-$k$ Attention
Qiuhao Zeng, Jierui Huang, Peng Lu et al.
A Comparative Study of Image Restoration Networks for General Backbone Network Design
Xiangyu Chen, Zheyuan Li, Yuandong Pu et al.
Active Coarse-to-Fine Segmentation of Moveable Parts from Real Images
Ruiqi Wang, Akshay Gadi Patil, Fenggen Yu et al.
ALERT-Transformer: Bridging Asynchronous and Synchronous Machine Learning for Real-Time Event-based Spatio-Temporal Data
Carmen Martin-Turrero, Maxence Bouvier, Manuel Breitenstein et al.
An Incremental Unified Framework for Small Defect Inspection
Jiaqi Tang, Hao Lu, Xiaogang Xu et al.
A Simple Low-bit Quantization Framework for Video Snapshot Compressive Imaging
Miao Cao, Lishun Wang, Huan Wang et al.
A Tale of Tails: Model Collapse as a Change of Scaling Laws
Elvis Dohmatob, Yunzhen Feng, Pu Yang et al.
Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-identification
Jiaer Xia, Lei Tan, Pingyang Dai et al.
Attention Meets Post-hoc Interpretability: A Mathematical Perspective
Gianluigi Lopardo, Frederic Precioso, Damien Garreau
Auctionformer: A Unified Deep Learning Algorithm for Solving Equilibrium Strategies in Auction Games
Kexin Huang, Ziqian Chen, Xue Wang et al.
AVSegFormer: Audio-Visual Segmentation with Transformer
Shengyi Gao, Zhe Chen, Guo Chen et al.
Bidirectional Multi-Scale Implicit Neural Representations for Image Deraining
Xiang Chen, Jinshan Pan, Jiangxin Dong
Breaking through the learning plateaus of in-context learning in Transformer
Jingwen Fu, Tao Yang, Yuwang Wang et al.
Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA
Wentao Mo, Yang Liu
CarFormer: Self-Driving with Learned Object-Centric Representations
Shadi Hamdan, Fatma Guney
CityGuessr: City-Level Video Geo-Localization on a Global Scale
Parth Parag Kulkarni, Gaurav Kumar Nayak, Mubarak Shah
Converting Transformers to Polynomial Form for Secure Inference Over Homomorphic Encryption
Itamar Zimerman, Moran Baruch, Nir Drucker et al.
Correlation Matching Transformation Transformers for UHD Image Restoration
Cong Wang, Jinshan Pan, Wei Wang et al.
DiffSurf: A Transformer-based Diffusion Model for Generating and Reconstructing 3D Surfaces in Pose
Yusuke Yoshiyasu, Leyuan Sun
Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control
Zheng Xiong, Risto Vuorio, Jacob Beck et al.
Dolfin: Diffusion Layout Transformers without Autoencoder
Yilin Wang, Zeyuan Chen, Liangjun Zhong et al.
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
Piotr Nawrot, Adrian Łańcucki, Marcin Chochowski et al.
EcoMatcher: Efficient Clustering Oriented Matcher for Detector-free Image Matching
Peiqi Chen, Lei Yu, Yi Wan et al.
EDformer: Transformer-Based Event Denoising Across Varied Noise Levels
Bin Jiang, Bo Xiong, Bohan Qu et al.
Effective Lymph Nodes Detection in CT Scans Using Location Debiased Query Selection and Contrastive Query Representation in Transformer
Qinji Yu, Yirui Wang, Ke Yan et al.
Efficient Pre-training for Localized Instruction Generation of Procedural Videos
Anil Batra, Davide Moltisanti, Laura Sevilla-Lara et al.
EFormer: Enhanced Transformer towards Semantic-Contour Features of Foreground for Portraits Matting
Zitao Wang, Qiguang Miao, Yue Xi et al.
Event Stream-based Visual Object Tracking: A High-Resolution Benchmark Dataset and A Novel Baseline
Xiao Wang, Shiao Wang, Chuanming Tang et al.
Exploring Transformer Extrapolation
Zhen Qin, Yiran Zhong, Hui Deng
FAR: Flexible, Accurate and Robust 6DoF Relative Camera Pose Estimation
Chris Rockwell, Nilesh Kulkarni, Linyi Jin et al.
Fast Encoding and Decoding for Implicit Video Representation
Hao Chen, Saining Xie, Ser-Nam Lim et al.
Fast Registration of Photorealistic Avatars for VR Facial Animation
Chaitanya Patel, Shaojie Bai, Te-Li Wang et al.
Gated Linear Attention Transformers with Hardware-Efficient Training
Songlin Yang, Bailin Wang, Yikang Shen et al.
GeoMFormer: A General Architecture for Geometric Molecular Representation Learning
Tianlang Chen, Shengjie Luo, Di He et al.
GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding
Hao Li, Dingwen Zhang, Yalun Dai et al.
GPSFormer: A Global Perception and Local Structure Fitting-based Transformer for Point Cloud Understanding
Changshuo Wang, Meiqing Wu, Siew-Kei Lam et al.
Graph External Attention Enhanced Transformer
Jianqing Liang, Min Chen, Jiye Liang
Graph Generation with $K^2$-trees
Yunhui Jang, Dongwoo Kim, Sungsoo Ahn
GridFormer: Point-Grid Transformer for Surface Reconstruction
Shengtao Li, Ge Gao, Yudong Liu et al.
GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation
Yinghao Xu, Zifan Shi, Wang Yifan et al.
Grounding Image Matching in 3D with MASt3R
Vincent Leroy, Yohann Cabon, Jerome Revaud
GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting
Kai Zhang, Sai Bi, Hao Tan et al.
HDformer: A Higher-Dimensional Transformer for Detecting Diabetes Utilizing Long-Range Vascular Signals
Ella Lan
How do Transformers Perform In-Context Autoregressive Learning?
Michael Sander, Raja Giryes, Taiji Suzuki et al.
How Smooth Is Attention?
Valérie Castin, Pierre Ablin, Gabriel Peyré
How to Protect Copyright Data in Optimization of Large Language Models?
Timothy Chu, Zhao Song, Chiwun Yang
How Transformers Learn Causal Structure with Gradient Descent
Eshaan Nichani, Alex Damian, Jason Lee
HybridGait: A Benchmark for Spatial-Temporal Cloth-Changing Gait Recognition with Hybrid Explorations
Yilan Dong, Chunlin Yu, Ruiyang Ha et al.