Poster "transformer architecture" Papers
257 papers found • Page 1 of 6
Accelerating Training with Neuron Interaction and Nowcasting Networks
Boris Knyazev, Abhinav Moudgil, Guillaume Lajoie et al.
Adaptive Transformer Programs: Bridging the Gap Between Performance and Interpretability in Transformers
Quoc-Vinh Lai-Dang, Taemin Kang, Seungah Son
Aligning Moments in Time using Video Queries
Yogesh Kumar, Uday Agarwal, Manish Gupta et al.
AniMer: Animal Pose and Shape Estimation Using Family Aware Transformer
Jin Lyu, Tianyi Zhu, Yi Gu et al.
An OpenMind for 3D Medical Vision Self-supervised Learning
Tassilo Wald, Constantin Ulrich, Jonathan Suprijadi et al.
Ask, and it shall be given: On the Turing completeness of prompting
Ruizhong Qiu, Zhe Xu, Wenxuan Bao et al.
A Solvable Attention for Neural Scaling Laws
Bochen Lyu, Di Wang, Zhanxing Zhu
Attention as a Hypernetwork
Simon Schug, Seijin Kobayashi, Yassir Akram et al.
Attention-based clustering
Rodrigo Maulen Soto, Pierre Marion, Claire Boyer
BlockScan: Detecting Anomalies in Blockchain Transactions
Jiahao Yu, Xian Wu, Hao Liu et al.
Boltzmann Attention Sampling for Image Analysis with Small Objects
Theodore Zhao, Sid Kiblawi, Mu Wei et al.
Born a Transformer -- Always a Transformer? On the Effect of Pretraining on Architectural Abilities
Mayank Jobanputra, Yana Veitsman, Yash Sarrof et al.
Can Transformers Do Enumerative Geometry?
Baran Hashemi, Roderic Corominas, Alessandro Giacchetto
CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction
Zhefei Gong, Pengxiang Ding, Shangke Lyu et al.
CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution
Xin Liu, Jie Liu, Jie Tang et al.
Chaos Meets Attention: Transformers for Large-Scale Dynamical Prediction
Yi He, Yiming Yang, Xiaoyuan Cheng et al.
CO-MOT: Boosting End-to-end Transformer-based Multi-Object Tracking via Coopetition Label Assignment and Shadow Sets
Feng Yan, Weixin Luo, Yujie Zhong et al.
Compositional Reasoning with Transformers, RNNs, and Chain of Thought
Gilad Yehudai, Noah Amsel, Joan Bruna
ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices
Hao Yu, Tangyu Jiang, Shuning Jia et al.
Context-Aware Regularization with Markovian Integration for Attention-Based Nucleotide Analysis
Mohammad Saleh Refahi, Mahdi Abavisani, Bahrad Sokhansanj et al.
Context-Enhanced Memory-Refined Transformer for Online Action Detection
Zhanzhong Pang, Fadime Sener, Angela Yao
Continual Multiple Instance Learning with Enhanced Localization for Histopathological Whole Slide Image Analysis
Byung Hyun Lee, Wongi Jeong, Woojae Han et al.
Continuity and Isolation Lead to Doubts or Dilemmas in Large Language Models
Hector Pasten, Felipe Urrutia, Hector Orellana et al.
CoTFormer: A Chain of Thought Driven Architecture with Budget-Adaptive Computation Cost at Inference
Amirkeivan Mohtashami, Matteo Pagliardini, Martin Jaggi
CrossSpectra: Exploiting Cross-Layer Smoothness for Parameter-Efficient Fine-Tuning
Yifei Zhang, Hao Zhu, Junhao Dong et al.
DeltaFormer: Unlock the state space of Transformer
Mingyu Xu, Tenglong Ao, Jiaao He et al.
Devil is in the Uniformity: Exploring Diverse Learners within Transformer for Image Restoration
Shihao Zhou, Dayu Li, Jinshan Pan et al.
DiffE2E: Rethinking End-to-End Driving with a Hybrid Diffusion-Regression-Classification Policy
Rui Zhao, Yuze Fan, Ziguo Chen et al.
Differential Transformer
Tianzhu Ye, Li Dong, Yuqing Xia et al.
DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models
Radu Alexandru Rosu, Keyu Wu, Yao Feng et al.
DIFFSSR: Stereo Image Super-resolution Using Differential Transformer
Dafeng Zhang
Diffusion-Based Planning for Autonomous Driving with Flexible Guidance
Yinan Zheng, Ruiming Liang, Kexin Zheng et al.
Dimension Agnostic Neural Processes
Hyungi Lee, Chaeyun Jang, Dong Bok Lee et al.
Dinomaly: The Less Is More Philosophy in Multi-Class Unsupervised Anomaly Detection
Jia Guo, Shuai Lu, Weihang Zhang et al.
Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers
Lei Chen, Joan Bruna, Alberto Bietti
Dynamic Diffusion Transformer
Wangbo Zhao, Yizeng Han, Jiasheng Tang et al.
Dynamic Semantic-Aware Correlation Modeling for UAV Tracking
Xinyu Zhou, Tongxin Pan, Lingyi Hong et al.
Easi3R: Estimating Disentangled Motion from DUSt3R Without Training
Xingyu Chen, Yue Chen, Yuliang Xiu et al.
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
Zihao Zhang, Haoran Chen, Haoyu Zhao et al.
Efficient Concertormer for Image Deblurring and Beyond
Pin-Hung Kuo, Jinshan Pan, Shao-Yi Chien et al.
Efficient Time Series Processing for Transformers and State-Space Models through Token Merging
Leon Götz, Marcel Kollovieh, Stephan Günnemann et al.
Emergence of Linear Truth Encodings in Language Models
Shauli Ravfogel, Gilad Yehudai, Tal Linzen et al.
End-to-End Implicit Neural Representations for Classification
Alexander Gielisse, Jan van Gemert
Enhancing 3D Gaze Estimation in the Wild using Weak Supervision with Gaze Following Labels
Pierre Vuillecard, Jean-Marc Odobez
Enhancing Image Restoration Transformer via Adaptive Translation Equivariance
JiaKui Hu, Zhengjian Yao, Lujia Jin et al.
Enhancing Transformers Through Conditioned Embedded Tokens
Hemanth Saratchandran, Simon Lucey
ESCAPE: Equivariant Shape Completion via Anchor Point Encoding
Burak Bekci, Nassir Navab, Federico Tombari et al.
Evolutionary Reasoning Does Not Arise in Standard Usage of Protein Language Models
Yasha Ektefaie, Andrew Shen, Lavik Jain et al.
Exact Expressive Power of Transformers with Padding
Will Merrill, Ashish Sabharwal
FaceXFormer: A Unified Transformer for Facial Analysis
Kartik Narayan, Vibashan VS, Rama Chellappa et al.