"feature learning" Papers

22 papers found

A Rainbow in Deep Network Black Boxes

Florentin Guth, Brice Ménard, Gaspar Rochette et al.

ICLR 2025 · arXiv:2305.18512 · 19 citations

Deep Networks Learn Features From Local Discontinuities in the Label Function

Prithaj Banerjee, Harish G Ramaswamy, Mahesh Yadav et al.

ICLR 2025

Do Mice Grok? Glimpses of Hidden Progress in Sensory Cortex

Tanishq Kumar, Blake Bordelon, Cengiz Pehlevan et al.

ICLR 2025 · 1 citation

From Linear to Nonlinear: Provable Weak-to-Strong Generalization through Feature Learning

Junsoo Oh, Jerry Song, Chulhee Yun

NEURIPS 2025 · arXiv:2510.24812 · 2 citations

How Data Mixing Shapes In-Context Learning: Asymptotic Equivalence for Transformers with MLPs

Samet Demir, Zafer Dogan

NEURIPS 2025 · arXiv:2510.25753

Learning Hierarchical Polynomials of Multiple Nonlinear Features

Hengyu Fu, Zihao Wang, Eshaan Nichani et al.

ICLR 2025 · arXiv:2411.17201 · 4 citations

On the Feature Learning in Diffusion Models

Andi Han, Wei Huang, Yuan Cao et al.

ICLR 2025 · arXiv:2412.01021 · 14 citations

Revisiting Residual Connections: Orthogonal Updates for Stable and Efficient Deep Networks

Giyeong Oh, Woohyun Cho, Siyeol Kim et al.

NEURIPS 2025 · arXiv:2505.11881

Robust Feature Learning for Multi-Index Models in High Dimensions

Alireza Mousavi-Hosseini, Adel Javanmard, Murat A Erdogdu

ICLR 2025 · arXiv:2410.16449 · 5 citations

The Computational Advantage of Depth in Learning High-Dimensional Hierarchical Targets

Yatin Dandi, Luca Pesce, Lenka Zdeborová et al.

NEURIPS 2025 (spotlight)

TS-MOF: Two-Stage Multi-Objective Fine-tuning for Long-Tailed Recognition

Zhe Zhao, Zhiheng Gong, Pengkun Wang et al.

NEURIPS 2025

Asymptotics of feature learning in two-layer networks after one gradient-step

Hugo Cui, Luca Pesce, Yatin Dandi et al.

ICML 2024 (spotlight) · arXiv:2402.04980 · 26 citations

A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks

Behrad Moniri, Donghwan Lee, Hamed Hassani et al.

ICML 2024 · arXiv:2310.07891 · 35 citations

Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning

Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan et al.

ICML 2024 · arXiv:2306.04815 · 25 citations

DeiT-LT: Distillation Strikes Back for Vision Transformer Training on Long-Tailed Datasets

Harsh Rangwani, Pradipto Mondal, Mayank Mishra et al.

CVPR 2024 · arXiv:2404.02900 · 18 citations

Diffusion Models Demand Contrastive Guidance for Adversarial Purification to Advance

Mingyuan Bai, Wei Huang, Tenghui Li et al.

ICML 2024

LoRA+: Efficient Low Rank Adaptation of Large Models

Soufiane Hayou, Nikhil Ghosh, Bin Yu

ICML 2024 · arXiv:2402.12354 · 341 citations

Mean-field Analysis on Two-layer Neural Networks from a Kernel Perspective

Shokichi Takakura, Taiji Suzuki

ICML 2024 · arXiv:2403.14917 · 8 citations

Mean Field Langevin Actor-Critic: Faster Convergence and Global Optimality beyond Lazy Learning

Kakei Yamamoto, Kazusato Oko, Zhuoran Yang et al.

ICML 2024 (oral)

Provable Benefits of Local Steps in Heterogeneous Federated Learning for Neural Networks: A Feature Learning Perspective

Yajie Bao, Michael Crawshaw, Mingrui Liu

ICML 2024

Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks

Liam Collins, Hamed Hassani, Mahdi Soltanolkotabi et al.

ICML 2024 · arXiv:2307.06887 · 13 citations

Provably Neural Active Learning Succeeds via Prioritizing Perplexing Samples

Dake Bu, Wei Huang, Taiji Suzuki et al.

ICML 2024 · arXiv:2406.03944 · 3 citations