"gradient vanishing" Papers
4 papers found
Conference
Hyper-Connections
Defa Zhu, Hongzhi Huang, Zihao Huang et al.
ICLR 2025, arXiv:2409.19606
33 citations
Autaptic Synaptic Circuit Enhances Spatio-temporal Predictive Learning of Spiking Neural Networks
Lihao Wang, Zhaofei Yu
ICML 2024 (oral), arXiv:2406.00405
9 citations
Bi-ViT: Pushing the Limit of Vision Transformer Quantization
Yanjing Li, Sheng Xu, Mingbao Lin et al.
AAAI 2024 (paper), arXiv:2305.12354
22 citations
DiNADO: Norm-Disentangled Neurally-Decomposed Oracles for Controlling Language Models
Sidi Lu, Wenbo Zhao, Chenyang Tao et al.
ICML 2024, arXiv:2306.11825
1 citation