Poster "transformer-based models" Papers
10 papers found
AI Progress Should Be Measured by Capability-Per-Resource, Not Scale Alone: A Framework for Gradient-Guided Resource Allocation in LLMs
David McCoy, Yulun Wu, Zachary Butzin-Dozier
NEURIPS 2025 · arXiv:2511.01077
Enhancing the Maximum Effective Window for Long-Term Time Series Forecasting
Jiahui Zhang, Zhengyang Zhou, Wenjie Du et al.
NEURIPS 2025
From Promise to Practice: Realizing High-performance Decentralized Training
Zesen Wang, Jiaojiao Zhang, Xuyang Wu et al.
ICLR 2025 · arXiv:2410.11998
3 citations
Retrieval Head Mechanistically Explains Long-Context Factuality
Wenhao Wu, Yizhong Wang, Guangxuan Xiao et al.
ICLR 2025 · arXiv:2404.15574
150 citations
UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming
Hao Lin, Ke Wu, Jie Li et al.
CVPR 2025 · arXiv:2307.16375
4 citations
EgoPoseFormer: A Simple Baseline for Stereo Egocentric 3D Human Pose Estimation
Chenhongyi Yang, Anastasia Tkach, Shreyas Hampali et al.
ECCV 2024 · arXiv:2403.18080
5 citations
FedRA: A Random Allocation Strategy for Federated Tuning to Unleash the Power of Heterogeneous Clients
Shangchao Su, Bin Li, Xiangyang Xue
ECCV 2024 · arXiv:2311.11227
21 citations
Harnessing Joint Rain-/Detail-aware Representations to Eliminate Intricate Rains
Wu Ran, Peirong Ma, Zhiquan He et al.
ICLR 2024 · arXiv:2404.12091
4 citations
IM-Unpack: Training and Inference with Arbitrarily Low Precision Integers
Zhanpeng Zeng, Karthikeyan Sankaralingam, Vikas Singh
ICML 2024 · arXiv:2403.07339
1 citation
Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
Jerry Yao-Chieh Hu, Pei-Hsuan Chang, Haozheng Luo et al.
ICML 2024 · arXiv:2404.03828
42 citations