"transformer-based models" Papers

16 papers found

AI Progress Should Be Measured by Capability-Per-Resource, Not Scale Alone: A Framework for Gradient-Guided Resource Allocation in LLMs

David McCoy, Yulun Wu, Zachary Butzin-Dozier

NeurIPS 2025 · arXiv:2511.01077

Enhancing the Maximum Effective Window for Long-Term Time Series Forecasting

Jiahui Zhang, Zhengyang Zhou, Wenjie Du et al.

NeurIPS 2025

From Promise to Practice: Realizing High-performance Decentralized Training

Zesen Wang, Jiaojiao Zhang, Xuyang Wu et al.

ICLR 2025 · arXiv:2410.11998 · 3 citations

Infer Human’s Intentions Before Following Natural Language Instructions

Yanming Wan, Yue Wu, Yiping Wang et al.

AAAI 2025 · arXiv:2409.18073 · 6 citations

Retrieval Head Mechanistically Explains Long-Context Factuality

Wenhao Wu, Yizhong Wang, Guangxuan Xiao et al.

ICLR 2025 · arXiv:2404.15574 · 150 citations

SimpleTM: A Simple Baseline for Multivariate Time Series Forecasting

Hui Chen, Viet Luong, Lopamudra Mukherjee et al.

ICLR 2025 (oral) · 14 citations

TB-HSU: Hierarchical 3D Scene Understanding with Contextual Affordances

Wenting Xu, Viorela Ila, Luping Zhou et al.

AAAI 2025 · arXiv:2412.05596 · 2 citations

UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming

Hao Lin, Ke Wu, Jie Li et al.

CVPR 2025 · arXiv:2307.16375 · 4 citations

xPatch: Dual-Stream Time Series Forecasting with Exponential Seasonal-Trend Decomposition

Artyom Stitsyuk, Jaesik Choi

AAAI 2025 · arXiv:2412.17323 · 36 citations

An Attentive Inductive Bias for Sequential Recommendation beyond the Self-Attention

Yehjin Shin, Jeongwhan Choi, Hyowon Wi et al.

AAAI 2024 · arXiv:2312.10325 · 104 citations

EgoPoseFormer: A Simple Baseline for Stereo Egocentric 3D Human Pose Estimation

Chenhongyi Yang, Anastasia Tkach, Shreyas Hampali et al.

ECCV 2024 · arXiv:2403.18080 · 5 citations

FedRA: A Random Allocation Strategy for Federated Tuning to Unleash the Power of Heterogeneous Clients

Shangchao Su, Bin Li, Xiangyang Xue

ECCV 2024 · arXiv:2311.11227 · 21 citations

Harnessing Joint Rain-/Detail-aware Representations to Eliminate Intricate Rains

Wu Ran, Peirong Ma, Zhiquan He et al.

ICLR 2024 · arXiv:2404.12091 · 4 citations

IM-Unpack: Training and Inference with Arbitrarily Low Precision Integers

Zhanpeng Zeng, Karthikeyan Sankaralingam, Vikas Singh

ICML 2024 · arXiv:2403.07339 · 1 citation

MERGE: Fast Private Text Generation

Zi Liang, Pinghui Wang, Ruofei Zhang et al.

AAAI 2024 · arXiv:2305.15769 · 14 citations

Outlier-Efficient Hopfield Layers for Large Transformer-Based Models

Jerry Yao-Chieh Hu, Pei-Hsuan Chang, Haozheng Luo et al.

ICML 2024 · arXiv:2404.03828 · 42 citations