A mathematical theory of semantic development in deep neural networks

23 October 2018

Papers citing "A mathematical theory of semantic development in deep neural networks"

Title
The emergence of sparse attention: impact of data distribution and benefits of repetition Nicolas Zucchet Francesco D'Angelo Andrew Kyle Lampinen Stephanie C. Y. Chan 107 0 0 23 May 2025
Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking) Yoonsoo Nam Seok Hyeong Lee Clementine Domine Yea Chan Park Charles London Wonyl Choi Niclas Goring Seungjai Lee AI4CE 106 0 0 28 Feb 2025
A distributional simplicity bias in the learning dynamics of transformers Riccardo Rende Federica Gerace Alessandro Laio Sebastian Goldt 95 8 0 17 Feb 2025
Flexible task abstractions emerge in linear networks with fast and bounded units Kai Sandbrink Jan P. Bauer A. Proca Andrew M. Saxe Christopher Summerfield Ali Hummos 83 2 0 17 Jan 2025
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks Clémentine Dominé Nicolas Anguita A. Proca Lukas Braun D. Kunin P. Mediano Andrew M. Saxe 72 3 0 22 Sep 2024
Dynamics of Supervised and Reinforcement Learning in the Non-Linear Perceptron Christian Schmid James M. Murray 52 0 0 05 Sep 2024
Masked Mixers for Language Generation and Retrieval Benjamin L. Badger 92 0 0 02 Sep 2024
Bridging Neuroscience and AI: Environmental Enrichment as a Model for Forward Knowledge Transfer Rajat Saxena Bruce L. McNaughton CLL 90 2 0 12 May 2024
How Do Recommendation Models Amplify Popularity Bias? An Analysis from the Spectral Perspective Siyi Lin Chongming Gao Jiawei Chen Sheng Zhou Binbin Hu Yan Feng Chun-Yen Chen Can Wang 81 8 0 18 Apr 2024
Fundamental Limits of Deep Learning-Based Binary Classifiers Trained with Hinge Loss T. Getu Georges Kaddoum M. Bennis 62 1 0 13 Sep 2023
The Implicit Regularization of Stochastic Gradient Flow for Least Squares Alnur Ali Yan Sun Robert Tibshirani 56 77 0 17 Mar 2020
Exponential expressivity in deep neural networks through transient chaos Ben Poole Subhaneil Lahiri M. Raghu Jascha Narain Sohl-Dickstein Surya Ganguli 83 587 0 16 Jun 2016