"transformer architecture" Papers
335 papers found • Page 3 of 7
Meta-Learning an In-Context Transformer Model of Human Higher Visual Cortex
Muquan Yu, Mu Nan, Hossein Adeli et al.
MI-DETR: An Object Detection Model with Multi-time Inquiries Mechanism
Zhixiong Nan, Xianghong Li, Tao Xiang et al.
MIDI-GPT: A Controllable Generative Model for Computer-Assisted Multitrack Music Composition
Philippe Pasquier, Jeff Ens, Nathan Fradet et al.
Mimic In-Context Learning for Multimodal Tasks
Yuchu Jiang, Jiale Fu, Chenduo Hao et al.
MIND over Body: Adaptive Thinking using Dynamic Computation
Mrinal Mathur, Barak Pearlmutter, Sergey Plis
MoPFormer: Motion-Primitive Transformer for Wearable-Sensor Activity Recognition
Hao Zhang, Zhan Zhuang, Xuehao Wang et al.
MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm
Ziyan Guo, Zeyu Hu, Na Zhao et al.
Multifaceted User Modeling in Recommendation: A Federated Foundation Models Approach
Chunxu Zhang, Guodong Long, Hongkuan Guo et al.
MV-SSM: Multi-View State Space Modeling for 3D Human Pose Estimation
Aviral Chharia, Wenbo Gou, Haoye Dong
Neural Collapse is Globally Optimal in Deep Regularized ResNets and Transformers
Peter Súkeník, Christoph Lampert, Marco Mondelli
NN-Former: Rethinking Graph Structure in Neural Architecture Representation
Ruihan Xu, Haokui Zhang, Yaowei Wang et al.
nnWNet: Rethinking the Use of Transformers in Biomedical Image Segmentation and Calling for a Unified Evaluation Benchmark
Yanfeng Zhou, Lingrui Li, Le Lu et al.
Non-Markovian Discrete Diffusion with Causal Language Models
Yangtian Zhang, Sizhuang He, Daniel Levine et al.
Normalization in Attention Dynamics
Nikita Karagodin, Shu Ge, Yury Polyanskiy et al.
Numerical Pruning for Efficient Autoregressive Models
Xuan Shen, Zhao Song, Yufa Zhou et al.
One-Minute Video Generation with Test-Time Training
Jiarui Xu, Shihao Han, Karan Dalal et al.
On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery
Renpu Liu, Ruida Zhou, Cong Shen et al.
On the Optimization and Generalization of Multi-head Attention
Christos Thrampoulidis, Rouzbeh Ghaderi, Hossein Taheri et al.
On the Role of Hidden States of Modern Hopfield Network in Transformer
Optimal Control for Transformer Architectures: Enhancing Generalization, Robustness and Efficiency
Kelvin Kan, Xingjian Li, Benjamin Zhang et al.
Optimal Dynamic Regret by Transformers for Non-Stationary Reinforcement Learning
Baiyuan Chen, Shinji Ito, Masaaki Imaizumi
OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer
Jinyang Li, En Yu, Sijia Chen et al.
Point-MaDi: Masked Autoencoding with Diffusion for Point Cloud Pre-training
Xiaoyang Xiao, Runzhao Yao, Zhiqiang Tian et al.
Point-SAM: Promptable 3D Segmentation Model for Point Clouds
Yuchen Zhou, Jiayuan Gu, Tung Chiang et al.
Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models
Zhijian Zhuo, Ya Wang, Yutao Zeng et al.
Pow3R: Empowering Unconstrained 3D Reconstruction with Camera and Scene Priors
Wonbong Jang, Philippe Weinzaepfel, Vincent Leroy et al.
Prediction-Feedback DETR for Temporal Action Detection
Jihwan Kim, Miso Lee, Cheol-Ho Cho et al.
Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation
Yang Tian, Sizhe Yang, Jia Zeng et al.
Principles of Visual Tokens for Efficient Video Understanding
Xinyue Hao, Li, Shreyank Gowda et al.
Prompt-SID: Learning Structural Representation Prompt via Latent Diffusion for Single Image Denoising
Huaqiu Li, Wang Zhang, Xiaowan Hu et al.
PS3: A Multimodal Transformer Integrating Pathology Reports with Histology Images and Biological Pathways for Cancer Survival Prediction
Manahil Raza, Ayesha Azam, Talha Qaiser et al.
Quantum Doubly Stochastic Transformers
Jannis Born, Filip Skogh, Kahn Rhrissorrakrai et al.
RAPTR: Radar-based 3D Pose Estimation using Transformer
Sorachi Kato, Ryoma Yataka, Pu Wang et al.
RaySt3R: Predicting Novel Depth Maps for Zero-Shot Object Completion
Bardienus Duisterhof, Jan Oberst, Bowen Wen et al.
RCTrans: Radar-Camera Transformer via Radar Densifier and Sequential Decoder for 3D Object Detection
Yiheng Li, Yang Yang, Zhen Lei
Real-Time Calibration Model for Low-Cost Sensor in Fine-Grained Time Series
Seokho Ahn, Hyungjin Kim, Sungbok Shin et al.
Reconstructing Humans with a Biomechanically Accurate Skeleton
Yan Xia, Xiaowei Zhou, Etienne Vouga et al.
Rectifying Magnitude Neglect in Linear Attention
Qihang Fan, Huaibo Huang, Yuang Ai et al.
Recurrent Self-Attention Dynamics: An Energy-Agnostic Perspective from Jacobians
Akiyoshi Tomihari, Ryo Karakida
Replacing Paths with Connection-Biased Attention for Knowledge Graph Completion
Sharmishtha Dutta, Alex Gittens, Mohammed J. Zaki et al.
Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think
Ge Wu, Shen Zhang, Ruijing Shi et al.
Revisiting Convolution Architecture in the Realm of DNA Foundation Models
Yu Bo, Weian Mao, Daniel Shao et al.
REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning
Jihyun Lee, Weipeng Xu, Alexander Richard et al.
RI-MAE: Rotation-Invariant Masked AutoEncoders for Self-Supervised Point Cloud Representation Learning
Kunming Su, Qiuxia Wu, Panpan Cai et al.
RNNs are not Transformers (Yet): The Key Bottleneck on In-Context Retrieval
Kaiyue Wen, Xingyu Dang, Kaifeng Lyu
Rope to Nope and Back Again: A New Hybrid Attention Strategy
Bowen Yang, Bharat Venkitesh, Dwaraknath Gnaneshwar Talupuru et al.
SAM 2: Segment Anything in Images and Videos
Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu et al.
SAS: Simulated Attention Score
Chuanyang Zheng, Jiankai Sun, Yihang Gao et al.
Scaling Transformers for Low-Bitrate High-Quality Speech Coding
Julian Parker, Anton Smirnov, Jordi Pons et al.
SCENT: Robust Spatiotemporal Learning for Continuous Scientific Data via Scalable Conditioned Neural Fields
David K Park, Xihaier Luo, Guang Zhao et al.