Poster "transformer-based models" Papers

10 papers found

AI Progress Should Be Measured by Capability-Per-Resource, Not Scale Alone: A Framework for Gradient-Guided Resource Allocation in LLMs

David McCoy, Yulun Wu, Zachary Butzin-Dozier

NeurIPS 2025 · arXiv:2511.01077

Enhancing the Maximum Effective Window for Long-Term Time Series Forecasting

Jiahui Zhang, Zhengyang Zhou, Wenjie Du et al.

NeurIPS 2025

From Promise to Practice: Realizing High-performance Decentralized Training

Zesen Wang, Jiaojiao Zhang, Xuyang Wu et al.

ICLR 2025 · arXiv:2410.11998 · 3 citations

Retrieval Head Mechanistically Explains Long-Context Factuality

Wenhao Wu, Yizhong Wang, Guangxuan Xiao et al.

ICLR 2025 · arXiv:2404.15574 · 150 citations

UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming

Hao Lin, Ke Wu, Jie Li et al.

CVPR 2025 · arXiv:2307.16375 · 4 citations

EgoPoseFormer: A Simple Baseline for Stereo Egocentric 3D Human Pose Estimation

Chenhongyi Yang, Anastasia Tkach, Shreyas Hampali et al.

ECCV 2024 · arXiv:2403.18080 · 5 citations

FedRA: A Random Allocation Strategy for Federated Tuning to Unleash the Power of Heterogeneous Clients

Shangchao Su, Bin Li, Xiangyang Xue

ECCV 2024 · arXiv:2311.11227 · 21 citations

Harnessing Joint Rain-/Detail-aware Representations to Eliminate Intricate Rains

Wu Ran, Peirong Ma, Zhiquan He et al.

ICLR 2024 · arXiv:2404.12091 · 4 citations

IM-Unpack: Training and Inference with Arbitrarily Low Precision Integers

Zhanpeng Zeng, Karthikeyan Sankaralingam, Vikas Singh

ICML 2024 · arXiv:2403.07339 · 1 citation

Outlier-Efficient Hopfield Layers for Large Transformer-Based Models

Jerry Yao-Chieh Hu, Pei-Hsuan Chang, Haozheng Luo et al.

ICML 2024 · arXiv:2404.03828 · 42 citations