"activation functions" Papers

14 papers found

From Kolmogorov to Cauchy: Shallow XNet Surpasses KANs

Xin Li, Xiaotao Zheng, Zhihong Xia

NeurIPS 2025

Generalization Bounds and Model Complexity for Kolmogorov–Arnold Networks

Xianyang Zhang, Huijuan Zhou

ICLR 2025 | arXiv:2410.08026 | 7 citations

Global Convergence in Neural ODEs: Impact of Activation Functions

Tianxiang Gao, Siyuan Sun, Hailiang Liu et al.

ICLR 2025 | arXiv:2509.22436 | 3 citations

LayerAct: Advanced Activation Mechanism for Robust Inference of CNNs

Kihyuk Yoon, Chiehyeon Lim

AAAI 2025 (paper) | arXiv:2306.04940

Plastic Learning with Deep Fourier Features

Alex Lewandowski, Dale Schuurmans, Marlos C. Machado

ICLR 2025 | arXiv:2410.20634 | 12 citations

Representing Sounds as Neural Amplitude Fields: A Benchmark of Coordinate-MLPs and a Fourier Kolmogorov-Arnold Framework

Linfei Li, Lin Zhang, Zhong Wang et al.

AAAI 2025 (paper) | arXiv:2601.06406 | 3 citations

Separation Power of Equivariant Neural Networks

Marco Pacini, Xiaowen Dong, Bruno Lepri et al.

ICLR 2025 | arXiv:2406.08966 | 1 citation

A sampling theory perspective on activations for implicit neural representations

Hemanth Saratchandran, Sameera Ramasinghe, Violetta Shevchenko et al.

ICML 2024 | arXiv:2402.05427 | 21 citations

From Activation to Initialization: Scaling Insights for Optimizing Neural Fields

Hemanth Saratchandran, Sameera Ramasinghe, Simon Lucey

CVPR 2024 (highlight) | arXiv:2403.19205 | 11 citations

Initial Guessing Bias: How Untrained Networks Favor Some Classes

Emanuele Francazi, Aurelien Lucchi, Marco Baity-Jesi

ICML 2024 | arXiv:2306.00809 | 8 citations

Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts?

Huy Nguyen, Pedram Akbarian, Nhat Ho

ICML 2024 | arXiv:2401.13875 | 20 citations

On dimensionality of feature vectors in MPNNs

César Bravo, Alexander Kozachinskiy, Cristobal Rojas

ICML 2024 | arXiv:2402.03966 | 8 citations

ReLU Network with Width $d+\mathcal{O}(1)$ Can Achieve Optimal Approximation Rate

Chenghao Liu, Minghua Chen

ICML 2024

Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context

Xiang Cheng, Yuxin Chen, Suvrit Sra

ICML 2024 | arXiv:2312.06528 | 63 citations