ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2110.13905
  4. Cited By
Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity
  Bias

Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias

26 October 2021
Kaifeng Lyu
Zhiyuan Li
Runzhe Wang
Sanjeev Arora
    MLT
ArXivPDFHTML

Papers citing "Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias"

21 / 21 papers shown
Title
Learning Guarantee of Reward Modeling Using Deep Neural Networks
Learning Guarantee of Reward Modeling Using Deep Neural Networks
Yuanhang Luo
Yeheng Ge
Ruijian Han
Guohao Shen
34
0
0
10 May 2025
Gradient Descent Robustly Learns the Intrinsic Dimension of Data in Training Convolutional Neural Networks
Gradient Descent Robustly Learns the Intrinsic Dimension of Data in Training Convolutional Neural Networks
Chenyang Zhang
Peifeng Gao
Difan Zou
Yuan Cao
OOD
MLT
62
0
0
11 Apr 2025
Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization
Implicit Bias of AdamW: ℓ∞\ell_\inftyℓ∞​ Norm Constrained Optimization
Shuo Xie
Zhiyuan Li
OffRL
50
13
0
05 Apr 2024
Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations
Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations
Akshay Kumar
Jarvis Haupt
ODL
44
3
0
12 Mar 2024
Neural Redshift: Random Networks are not Random Functions
Neural Redshift: Random Networks are not Random Functions
Damien Teney
A. Nicolicioiu
Valentin Hartmann
Ehsan Abbasnejad
103
18
0
04 Mar 2024
An extended asymmetric sigmoid with Perceptron (SIGTRON) for imbalanced
  linear classification
An extended asymmetric sigmoid with Perceptron (SIGTRON) for imbalanced linear classification
Hyenkyun Woo
20
0
0
26 Dec 2023
Early Neuron Alignment in Two-layer ReLU Networks with Small
  Initialization
Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization
Hancheng Min
Enrique Mallada
René Vidal
MLT
34
19
0
24 Jul 2023
Phase Diagram of Initial Condensation for Two-layer Neural Networks
Phase Diagram of Initial Condensation for Two-layer Neural Networks
Zheng Chen
Yuqing Li
Tao Luo
Zhaoguang Zhou
Z. Xu
MLT
AI4CE
49
8
0
12 Mar 2023
Understanding Incremental Learning of Gradient Descent: A Fine-grained
  Analysis of Matrix Sensing
Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing
Jikai Jin
Zhiyuan Li
Kaifeng Lyu
S. Du
Jason D. Lee
MLT
54
34
0
27 Jan 2023
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for
  Language Models
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu
Sang Michael Xie
Zhiyuan Li
Tengyu Ma
AI4CE
40
49
0
25 Oct 2022
On the Implicit Bias in Deep-Learning Algorithms
On the Implicit Bias in Deep-Learning Algorithms
Gal Vardi
FedML
AI4CE
34
72
0
26 Aug 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On
  Equivalence to Mirror Descent
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li
Tianhao Wang
Jason D. Lee
Sanjeev Arora
42
27
0
08 Jul 2022
Understanding the Generalization Benefit of Normalization Layers:
  Sharpness Reduction
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu
Zhiyuan Li
Sanjeev Arora
FAtt
40
70
0
14 Jun 2022
Adversarial Reprogramming Revisited
Adversarial Reprogramming Revisited
Matthias Englert
R. Lazic
AAML
29
8
0
07 Jun 2022
Gradient flow dynamics of shallow ReLU networks for square loss and
  orthogonal inputs
Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
Etienne Boursier
Loucas Pillaud-Vivien
Nicolas Flammarion
ODL
24
58
0
02 Jun 2022
Empirical Phase Diagram for Three-layer Neural Networks with Infinite
  Width
Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width
Hanxu Zhou
Qixuan Zhou
Zhenyuan Jin
Tao Luo
Yaoyu Zhang
Zhi-Qin John Xu
25
20
0
24 May 2022
Random Feature Amplification: Feature Learning and Generalization in
  Neural Networks
Random Feature Amplification: Feature Learning and Generalization in Neural Networks
Spencer Frei
Niladri S. Chatterji
Peter L. Bartlett
MLT
30
29
0
15 Feb 2022
Implicit Regularization in Hierarchical Tensor Factorization and Deep
  Convolutional Neural Networks
Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks
Noam Razin
Asaf Maman
Nadav Cohen
46
29
0
27 Jan 2022
On Margin Maximization in Linear and ReLU Networks
On Margin Maximization in Linear and ReLU Networks
Gal Vardi
Ohad Shamir
Nathan Srebro
50
28
0
06 Oct 2021
Towards Understanding Learning in Neural Networks with Linear Teachers
Towards Understanding Learning in Neural Networks with Linear Teachers
Roei Sarussi
Alon Brutzkus
Amir Globerson
FedML
MLT
55
20
0
07 Jan 2021
Provable Generalization of SGD-trained Neural Networks of Any Width in
  the Presence of Adversarial Label Noise
Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise
Spencer Frei
Yuan Cao
Quanquan Gu
FedML
MLT
64
19
0
04 Jan 2021
1