Poster papers matching "transformer architectures"
25 papers found
Adaptive Rank Allocation: Speeding Up Modern Transformers with RaNA Adapters
Roberto Garcia, Jerry Liu, Daniel Sorvisto et al.
Architectural and Inferential Inductive Biases for Exchangeable Sequence Modeling
Daksh Mittal, Leon Li, Thomson Yen et al.
ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding
Guangda Ji, Silvan Weder, Francis Engelmann et al.
Attention on the Sphere
Boris Bonev, Max Rietmann, Andrea Paris et al.
DiC: Rethinking Conv3x3 Designs in Diffusion Models
Yuchuan Tian, Jing Han, Chengcheng Wang et al.
Disentangling Representations through Multi-task Learning
Pantelis Vafidis, Aman Bhargava, Antonio Rangel
Do ImageNet-trained Models Learn Shortcuts? The Impact of Frequency Shortcuts on Generalization
Shunxin Wang, Raymond Veldhuis, Nicola Strisciuglio
EUGens: Efficient, Unified and General Dense Layers
Sang Min Kim, Byeongchan Kim, Arijit Sehanobish et al.
Learning in Compact Spaces with Approximately Normalized Transformer
Jörg Franke, Urs Spiegelhalter, Marianna Nezhurina et al.
Let Me Grok for You: Accelerating Grokking via Embedding Transfer from a Weaker Model
Zhiwei Xu, Zhiyu Ni, Yixin Wang et al.
Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory
Nikola Zubic, Federico Soldà, Aurelio Sulser et al.
L-SWAG: Layer-Sample Wise Activation with Gradients Information for Zero-Shot NAS on Vision Transformers
Sofia Casarin, Sergio Escalera, Oswald Lanz
Optimal Brain Apoptosis
Mingyuan Sun, Zheng Fang, Jiaxu Wang et al.
Q3R: Quadratic Reweighted Rank Regularizer for Effective Low-Rank Training
Ipsita Ghosh, Ethan Nguyen, Christian Kümmerle
Streamlining Prediction in Bayesian Deep Learning
Rui Li, Marcus Klasson, Arno Solin et al.
TabM: Advancing tabular deep learning with parameter-efficient ensembling
Yury Gorishniy, Akim Kotelnikov, Artem Babenko
Unsupervised Meta-Learning via In-Context Learning
Anna Vettoruzzo, Lorenzo Braccaioli, Joaquin Vanschoren et al.
All-in-one simulation-based inference
Manuel Gloeckler, Michael Deistler, Christian Weilbach et al.
Controllable Prompt Tuning For Balancing Group Distributional Robustness
Hoang Phan, Andrew Wilson, Qi Lei
Improving Token-Based World Models with Parallel Observation Prediction
Lior Cohen, Kaixin Wang, Bingyi Kang et al.
Loss Shaping Constraints for Long-Term Time Series Forecasting
Ignacio Hounie, Javier Porras-Valenzuela, Alejandro Ribeiro
Operational Open-Set Recognition and PostMax Refinement
Steve Cruz, Ryan Rabinowitz, Manuel Günther et al.
Outlier-aware Slicing for Post-Training Quantization in Vision Transformer
Yuexiao Ma, Huixia Li, Xiawu Zheng et al.
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Bingxin Ke, Anton Obukhov, Shengyu Huang et al.
Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation
Yibo Yang, Xiaojie Li, Motasem Alfarra et al.