Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss
Lénaïc Chizat, Francis R. Bach
arXiv:2002.04486, 11 February 2020
Papers citing "Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss" (50 of 252 papers shown)
Minimum norm interpolation by perceptra: Explicit regularization and implicit bias. Jiyoung Park, Ian Pelakh, Stephan Wojtowytsch. 10 Nov 2023.
A Quadratic Synchronization Rule for Distributed Deep Learning. Xinran Gu, Kaifeng Lyu, Sanjeev Arora, Jingzhao Zhang, Longbo Huang. 22 Oct 2023.
Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit. Blake Bordelon, Lorenzo Noci, Mufan Li, Boris Hanin, Cengiz Pehlevan. 28 Sep 2023.
SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem. Margalit Glasgow. 26 Sep 2023.
Globally Convergent Accelerated Algorithms for Multilinear Sparse Logistic Regression with ℓ₀-constraints. Weifeng Yang, Wenwen Min. 17 Sep 2023.
Connecting NTK and NNGP: A Unified Theoretical Framework for Wide Neural Network Learning Dynamics. Yehonatan Avidan, Qianyi Li, H. Sompolinsky. 08 Sep 2023.
Six Lectures on Linearized Neural Networks. Theodor Misiakiewicz, Andrea Montanari. 25 Aug 2023.
An Exact Kernel Equivalence for Finite Classification Models. Brian Bell, Michaela Geyer, David Glickenstein, Amanda Fernandez, Juston Moore. 01 Aug 2023.
Noisy Interpolation Learning with Shallow Univariate ReLU Networks. Nirmit Joshi, Gal Vardi, Nathan Srebro. 28 Jul 2023.
Trajectory Alignment: Understanding the Edge of Stability Phenomenon via Bifurcation Theory. Minhak Song, Chulhee Yun. 09 Jul 2023.
Abide by the Law and Follow the Flow: Conservation Laws for Gradient Flows. Sibylle Marcotte, Rémi Gribonval, Gabriel Peyré. 30 Jun 2023.
Max-Margin Token Selection in Attention Mechanism. Davoud Ataee Tarzanagh, Yingcong Li, Xuechen Zhang, Samet Oymak. 23 Jun 2023.
Scaling MLPs: A Tale of Inductive Bias. Gregor Bachmann, Sotiris Anagnostidis, Thomas Hofmann. 23 Jun 2023.
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks. Yuan Cao, Difan Zou, Yuan-Fang Li, Quanquan Gu. 20 Jun 2023.
The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions. Nishil Patel, Sebastian Lee, Stefano Sarao Mannelli, Sebastian Goldt, Andrew Saxe. 17 Jun 2023.
Exact Count of Boundary Pieces of ReLU Classifiers: Towards the Proper Complexity Measure for Classification. Paweł Piwek, Adam Klukowski, Tianyang Hu. 15 Jun 2023.
A Mathematical Abstraction for Balancing the Trade-off Between Creativity and Reality in Large Language Models. Ritwik Sinha, Zhao-quan Song, Dinesh Manocha. 04 Jun 2023.
Bottleneck Structure in Learned Features: Low-Dimension vs Regularity Tradeoff. Arthur Jacot. 30 May 2023.
A Rainbow in Deep Network Black Boxes. Florentin Guth, Brice Ménard, G. Rochette, S. Mallat. 29 May 2023.
On the Role of Noise in the Sample Complexity of Learning Recurrent Neural Networks: Exponential Gaps for Long Sequences. A. F. Pour, H. Ashtiani. 28 May 2023.
Fast Convergence in Learning Two-Layer Neural Networks with Separable Data. Hossein Taheri, Christos Thrampoulidis. 22 May 2023.
The Training Process of Many Deep Networks Explores the Same Low-Dimensional Manifold. Jialin Mao, Itay Griniasty, H. Teoh, Rahul Ramesh, Rubing Yang, Mark K. Transtrum, James P. Sethna, Pratik Chaudhari. 02 May 2023.
Saddle-to-Saddle Dynamics in Diagonal Linear Networks. Scott Pesme, Nicolas Flammarion. 02 Apr 2023.
On the Effect of Initialization: The Scaling Path of 2-Layer Neural Networks. Sebastian Neumayer, Lénaïc Chizat, M. Unser. 31 Mar 2023.
Solving Regularized Exp, Cosh and Sinh Regression Problems. Zhihang Li, Zhao-quan Song, Dinesh Manocha. 28 Mar 2023.
Global Optimality of Elman-type RNN in the Mean-Field Regime. Andrea Agazzi, Jian-Xiong Lu, Sayan Mukherjee. 12 Mar 2023.
Benign Overfitting in Linear Classifiers and Leaky ReLU Networks from KKT Conditions for Margin Maximization. Spencer Frei, Gal Vardi, Peter L. Bartlett, Nathan Srebro. 02 Mar 2023.
The Double-Edged Sword of Implicit Bias: Generalization vs. Robustness in ReLU Networks. Spencer Frei, Gal Vardi, Peter L. Bartlett, Nathan Srebro. 02 Mar 2023.
Penalising the biases in norm regularisation enforces sparsity. Etienne Boursier, Nicolas Flammarion. 02 Mar 2023.
On the existence of minimizers in shallow residual ReLU neural network optimization landscapes. Steffen Dereich, Arnulf Jentzen, Sebastian Kassing. 28 Feb 2023.
SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics. Emmanuel Abbe, Enric Boix-Adserà, Theodor Misiakiewicz. 21 Feb 2023.
Generalization and Stability of Interpolating Neural Networks with Minimal Width. Hossein Taheri, Christos Thrampoulidis. 18 Feb 2023.
Stochastic Modified Flows, Mean-Field Limits and Dynamics of Stochastic Gradient Descent. Benjamin Gess, Sebastian Kassing, Vitalii Konarovskyi. 14 Feb 2023.
Efficient displacement convex optimization with particle gradient descent. Hadi Daneshmand, J. Lee, Chi Jin. 09 Feb 2023.
Simplicity Bias in 1-Hidden Layer Neural Networks. Depen Morwani, Jatin Batra, Prateek Jain, Praneeth Netrapalli. 01 Feb 2023.
Naive imputation implicitly regularizes high-dimensional linear models. Alexis Ayme, Claire Boyer, Aymeric Dieuleveut, Erwan Scornet. 31 Jan 2023.
Generalization on the Unseen, Logic Reasoning and Degree Curriculum. Emmanuel Abbe, Samy Bengio, Aryo Lotfi, Kevin Rizk. 30 Jan 2023.
Deep Learning Meets Sparse Regularization: A Signal Processing Perspective. Rahul Parhi, Robert D. Nowak. 23 Jan 2023.
Understanding Difficulty-based Sample Weighting with a Universal Difficulty Measure. Xiaoling Zhou, Ou Wu, Weiyao Zhu, Ziyang Liang. 12 Jan 2023.
Iterative regularization in classification via hinge loss diagonal descent. Vassilis Apidopoulos, T. Poggio, Lorenzo Rosasco, S. Villa. 24 Dec 2022.
Graph Neural Networks are Inherently Good Generalizers: Insights by Bridging GNNs and MLPs. Chenxiao Yang, Qitian Wu, Jiahua Wang, Junchi Yan. 18 Dec 2022.
On the symmetries in the dynamics of wide two-layer neural networks. Karl Hajjar, Lénaïc Chizat. 16 Nov 2022.
Regression as Classification: Influence of Task Formulation on Neural Network Features. Lawrence Stewart, Francis R. Bach, Quentin Berthet, Jean-Philippe Vert. 10 Nov 2022.
Duality for Neural Networks through Reproducing Kernel Banach Spaces. L. Spek, T. J. Heeringa, Felix L. Schwenninger, C. Brune. 09 Nov 2022.
A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks. Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna. 28 Oct 2022.
Learning Single-Index Models with Shallow Neural Networks. A. Bietti, Joan Bruna, Clayton Sanford, M. Song. 27 Oct 2022.
Vision Transformers provably learn spatial structure. Samy Jelassi, Michael E. Sander, Yuan-Fang Li. 13 Oct 2022.
Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data. Spencer Frei, Gal Vardi, Peter L. Bartlett, Nathan Srebro, Wei Hu. 13 Oct 2022.
Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence. Diyuan Wu, Vyacheslav Kungurtsev, Marco Mondelli. 13 Oct 2022.
SGD with Large Step Sizes Learns Sparse Features. Maksym Andriushchenko, Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion. 11 Oct 2022.