Neural networks trained with SGD learn distributions of increasing complexity
Maria Refinetti, Alessandro Ingrosso, Sebastian Goldt
arXiv:2211.11567, 21 November 2022
UQCV

Papers citing "Neural networks trained with SGD learn distributions of increasing complexity"
14 / 14 papers shown

Scaling Laws and Representation Learning in Simple Hierarchical Languages: Transformers vs. Convolutional Architectures
Francesco Cagnetta, Alessandro Favero, Antonio Sclocchi, M. Wyart
11 May 2025

A distributional simplicity bias in the learning dynamics of transformers
Riccardo Rende, Federica Gerace, A. Laio, Sebastian Goldt
17 Feb 2025

The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander B. Atanasov, Alexandru Meterez, James B. Simon, Cengiz Pehlevan
06 Oct 2024

How Feature Learning Can Improve Neural Scaling Laws
Blake Bordelon, Alexander B. Atanasov, Cengiz Pehlevan
26 Sep 2024

Dynamics of Supervised and Reinforcement Learning in the Non-Linear Perceptron
Christian Schmid, James M. Murray
05 Sep 2024

Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences
Nikolaos Dimitriadis, Pascal Frossard, F. Fleuret
MoE
10 Jul 2024

Context-Aware Machine Translation with Source Coreference Explanation
Huy Hien Vu, Hidetaka Kamigaito, Taro Watanabe
LRM
30 Apr 2024

Do deep neural networks have an inbuilt Occam's razor?
Chris Mingard, Henry Rees, Guillermo Valle Pérez, A. Louis
UQCV, BDL
13 Apr 2023

A Mathematical Model for Curriculum Learning for Parities
Elisabetta Cornacchia, Elchanan Mossel
31 Jan 2023

Data-driven emergence of convolutional structure in neural networks
Alessandro Ingrosso, Sebastian Goldt
01 Feb 2022

The Intrinsic Dimension of Images and Its Impact on Learning
Phillip E. Pope, Chen Zhu, Ahmed Abdelkader, Micah Goldblum, Tom Goldstein
18 Apr 2021

Pre-training without Natural Images
Hirokatsu Kataoka, Kazushige Okayasu, Asato Matsumoto, Eisuke Yamagata, Ryosuke Yamada, Nakamasa Inoue, Akio Nakamura, Y. Satoh
21 Jan 2021

Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks
Blake Bordelon, Abdulkadir Canatar, Cengiz Pehlevan
07 Feb 2020

Densely Connected Convolutional Networks
Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger
PINN, 3DV
25 Aug 2016