ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Feature Learning in Infinite-Width Neural Networks (arXiv:2011.14522)
Greg Yang, J. E. Hu · MLT
30 November 2020

Papers citing "Feature Learning in Infinite-Width Neural Networks"

45 papers shown
Power Lines: Scaling Laws for Weight Decay and Batch Size in LLM Pre-training
Shane Bergsma, Nolan Dey, Gurpreet Gosal, Gavia Gray, Daria Soboleva, Joel Hestness · 19 May 2025

Learning curves theory for hierarchically compositional data with power-law distributed features
Francesco Cagnetta, Hyunmo Kang, M. Wyart · 11 May 2025

Don't be lazy: CompleteP enables compute-efficient deep transformers
Nolan Dey, Bin Claire Zhang, Lorenzo Noci, Mufan Li, Blake Bordelon, Shane Bergsma, Cengiz Pehlevan, Boris Hanin, Joel Hestness · 02 May 2025

Physics of Skill Learning
Ziming Liu, Yizhou Liu, Eric J. Michaud, Jeff Gore, Max Tegmark · 21 Jan 2025

Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit
Oleg Filatov, Jan Ebert, Jiangtao Wang, Stefan Kesselheim · 10 Jan 2025

Local Loss Optimization in the Infinite Width: Stable Parameterization of Predictive Coding Networks and Target Propagation
Satoki Ishikawa, Rio Yokota, Ryo Karakida · 04 Nov 2024

Formation of Representations in Neural Networks
Liu Ziyin, Isaac Chuang, Tomer Galanti, T. Poggio · 03 Oct 2024

From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Dominé, Nicolas Anguita, A. Proca, Lukas Braun, D. Kunin, P. Mediano, Andrew M. Saxe · 22 Sep 2024

Hyperparameter Optimization for Randomized Algorithms: A Case Study on Random Features
Oliver R. A. Dunbar, Nicholas H. Nelsen, Maya Mutic · 30 Jun 2024

Infinite Width Models That Work: Why Feature Learning Doesn't Matter as Much as You Think
Luke Sernau · 27 Jun 2024

Understanding and Minimising Outlier Features in Neural Network Training
Bobby He, Lorenzo Noci, Daniele Paliotta, Imanol Schlag, Thomas Hofmann · 29 May 2024

Bayesian RG Flow in Neural Network Field Theories
Jessica N. Howard, Marc S. Klinger, Anindita Maiti, A. G. Stapleton · 27 May 2024

Infinite Limits of Multi-head Transformer Dynamics
Blake Bordelon, Hamza Tahir Chaudhry, Cengiz Pehlevan · AI4CE · 24 May 2024

Understanding Optimal Feature Transfer via a Fine-Grained Bias-Variance Analysis
Yufan Li, Subhabrata Sen, Ben Adlam · MLT · 18 Apr 2024

Principled Architecture-aware Scaling of Hyperparameters
Wuyang Chen, Junru Wu, Zhangyang Wang, Boris Hanin · AI4CE · 27 Feb 2024

Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu, Berfin Simsek, Francois Ged · ODL · 08 Feb 2024

A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks
Behrad Moniri, Donghwan Lee, Hamed Hassani, Yan Sun · MLT · 11 Oct 2023

Do deep neural networks have an inbuilt Occam's razor?
Chris Mingard, Henry Rees, Guillermo Valle Pérez, A. Louis · UQCV, BDL · 13 Apr 2023

Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks
Blake Bordelon, Cengiz Pehlevan · MLT · 06 Apr 2023

Unit Scaling: Out-of-the-Box Low-Precision Training
Charlie Blake, Douglas Orr, Carlo Luschi · MQ · 20 Mar 2023

PAPAL: A Provable PArticle-based Primal-Dual ALgorithm for Mixed Nash Equilibrium
Shihong Ding, Hanze Dong, Cong Fang, Zhouchen Lin, Tong Zhang · 02 Mar 2023

Learning time-scales in two-layers neural networks
Raphael Berthier, Andrea Montanari, Kangjie Zhou · 28 Feb 2023

How to prepare your task head for finetuning
Yi Ren, Shangmin Guo, Wonho Bae, Danica J. Sutherland · 11 Feb 2023

On the Geometry of Reinforcement Learning in Continuous State and Action Spaces
Saket Tiwari, Omer Gottesman, George Konidaris · 29 Dec 2022

Evolution of Neural Tangent Kernels under Benign and Adversarial Training
Noel Loo, Ramin Hasani, Alexander Amini, Daniela Rus · AAML · 21 Oct 2022

GULP: a prediction-based metric between representations
Enric Boix Adserà, Hannah Lawrence, George Stepaniants, Philippe Rigollet · 12 Oct 2022

Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks
Blake Bordelon, Cengiz Pehlevan · MLT · 19 May 2022

High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang · MLT · 03 May 2022

Analytic theory for the dynamics of wide quantum neural networks
Junyu Liu, K. Najafi, Kunal Sharma, F. Tacchino, Liang Jiang, Antonio Mezzacapo · 30 Mar 2022

Random matrix analysis of deep neural network weight matrices
M. Thamm, Max Staats, B. Rosenow · 28 Mar 2022

Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer
Greg Yang, J. E. Hu, Igor Babuschkin, Szymon Sidor, Xiaodong Liu, David Farhi, Nick Ryder, J. Pachocki, Weizhu Chen, Jianfeng Gao · 07 Mar 2022

Random Feature Amplification: Feature Learning and Generalization in Neural Networks
Spencer Frei, Niladri S. Chatterji, Peter L. Bartlett · MLT · 15 Feb 2022

On the Equivalence between Neural Network and Support Vector Machine
Yilan Chen, Wei Huang, Lam M. Nguyen, Tsui-Wei Weng · AAML · 11 Nov 2021

Neural Networks as Kernel Learners: The Silent Alignment Effect
Alexander B. Atanasov, Blake Bordelon, Cengiz Pehlevan · MLT · 29 Oct 2021

A Mechanism for Producing Aligned Latent Spaces with Autoencoders
Saachi Jain, Adityanarayanan Radhakrishnan, Caroline Uhler · 29 Jun 2021

How to Train Your Wide Neural Network Without Backprop: An Input-Weight Alignment Perspective
Akhilan Boopathy, Ila Fiete · 15 Jun 2021

The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective
Geoff Pleiss, John P. Cunningham · 11 Jun 2021

A Neural Tangent Kernel Perspective of GANs
Jean-Yves Franceschi, Emmanuel de Bézenac, Ibrahim Ayed, Mickaël Chen, Sylvain Lamprier, Patrick Gallinari · 10 Jun 2021

A self consistent theory of Gaussian Processes captures feature learning effects in finite CNNs
Gadi Naveh, Zohar Ringel · SSL, MLT · 08 Jun 2021

The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization
Mufan Li, Mihai Nica, Daniel M. Roy · 07 Jun 2021

Priors in Bayesian Deep Learning: A Review
Vincent Fortuin · UQCV, BDL · 14 May 2021

When Does Preconditioning Help or Hurt Generalization?
S. Amari, Jimmy Ba, Roger C. Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu · 18 Jun 2020

The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari · ODL · 04 Mar 2020

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn, Pieter Abbeel, Sergey Levine · OOD · 09 Mar 2017

Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov, Kai Chen, G. Corrado, J. Dean · 3DV · 16 Jan 2013