On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
Lénaïc Chizat, Francis R. Bach · OT · 24 May 2018 · arXiv:1805.09545

Papers citing "On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport"

50 / 483 papers shown
• Gradient flows on graphons: existence, convergence, continuity equations
  Sewoong Oh, Soumik Pal, Raghav Somani, Raghavendra Tripathi · 18 Nov 2021
• Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks
  A. Shevchenko, Vyacheslav Kungurtsev, Marco Mondelli · MLT · 03 Nov 2021
• Limiting fluctuation and trajectorial stability of multilayer neural networks with mean field training
  H. Pham, Phan-Minh Nguyen · 29 Oct 2021
• A Riemannian Mean Field Formulation for Two-layer Neural Networks with Batch Normalization
  Chao Ma, Lexing Ying · MLT · 17 Oct 2021
• Gradient Descent on Infinitely Wide Neural Networks: Global Convergence and Generalization
  Francis R. Bach, Lénaïc Chizat · MLT · 15 Oct 2021
• The Convex Geometry of Backpropagation: Neural Network Gradient Flows Converge to Extreme Points of the Dual Convex Program
  Yifei Wang, Mert Pilanci · MLT, MDE · 13 Oct 2021
• Parallel Deep Neural Networks Have Zero Duality Gap
  Yifei Wang, Tolga Ergen, Mert Pilanci · 13 Oct 2021
• Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective
  Junzhe Zhang, Haochuan Li, S. Sra, Ali Jadbabaie · 12 Oct 2021
• AIR-Net: Adaptive and Implicit Regularization Neural Network for Matrix Completion
  Zhemin Li, Tao Sun, Hongxia Wang, Bao Wang · 12 Oct 2021
• Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks
  Shuai Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong · UQCV, MLT · 12 Oct 2021
• Tighter Sparse Approximation Bounds for ReLU Neural Networks
  Carles Domingo-Enrich, Youssef Mroueh · 07 Oct 2021
• On the Global Convergence of Gradient Descent for multi-layer ResNets in the mean-field regime
  Zhiyan Ding, Shi Chen, Qin Li, S. Wright · MLT, AI4CE · 06 Oct 2021
• Sqrt(d) Dimension Dependence of Langevin Monte Carlo
  Ruilin Li, H. Zha, Molei Tao · 08 Sep 2021
• Existence, uniqueness, and convergence rates for gradient flows in the training of artificial neural networks with ReLU activation
  Simon Eberle, Arnulf Jentzen, Adrian Riekert, G. Weiss · 18 Aug 2021
• Optimizing full 3D SPARKLING trajectories for high-resolution T2*-weighted Magnetic Resonance Imaging
  R. ChaithyaG., P. Weiss, Guillaume Daval-Frérot, Aurélien Massire, A. Vignaud, P. Ciuciu · 06 Aug 2021
• Interpolation can hurt robust generalization even when there is no noise
  Konstantin Donhauser, Alexandru Țifrea, Michael Aerni, Reinhard Heckel, Fanny Yang · 05 Aug 2021
• The loss landscape of deep linear neural networks: a second-order analysis
  E. M. Achour, François Malgouyres, Sébastien Gerchinovitz · ODL · 28 Jul 2021
• Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II
  Yossi Arjevani, M. Field · 21 Jul 2021
• Efficient Algorithms for Learning Depth-2 Neural Networks with General ReLU Activations
  Pranjal Awasthi, Alex K. Tang, Aravindan Vijayaraghavan · MLT · 21 Jul 2021
• The Limiting Dynamics of SGD: Modified Loss, Phase Space Oscillations, and Anomalous Diffusion
  D. Kunin, Javier Sagastuy-Breña, Lauren Gillespie, Eshed Margalit, Hidenori Tanaka, Surya Ganguli, Daniel L. K. Yamins · 19 Jul 2021
• Dual Training of Energy-Based Models with Overparametrized Shallow Neural Networks
  Carles Domingo-Enrich, A. Bietti, Marylou Gabrié, Joan Bruna, Eric Vanden-Eijnden · FedML · 11 Jul 2021
• Convergence analysis for gradient flows in the training of artificial neural networks with ReLU activation
  Arnulf Jentzen, Adrian Riekert · 09 Jul 2021
• Continual Learning in the Teacher-Student Setup: Impact of Task Similarity
  Sebastian Lee, Sebastian Goldt, Andrew M. Saxe · CLL · 09 Jul 2021
• Provable Convergence of Nesterov's Accelerated Gradient Method for Over-Parameterized Neural Networks
  Xin Liu, Zhisong Pan, Wei Tao · 05 Jul 2021
• Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity
  Arthur Jacot, François Ged, Berfin Şimşek, Clément Hongler, Franck Gabriel · 30 Jun 2021
• Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction
  Dominik Stöger, Mahdi Soltanolkotabi · ODL · 28 Jun 2021
• Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent
  Spencer Frei, Quanquan Gu · 25 Jun 2021
• Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity
  Scott Pesme, Loucas Pillaud-Vivien, Nicolas Flammarion · 17 Jun 2021
• KALE Flow: A Relaxed KL Gradient Flow for Probabilities with Disjoint Support
  Pierre Glaser, Michael Arbel, A. Gretton · 16 Jun 2021
• Extracting Global Dynamics of Loss Landscape in Deep Learning Models
  Mohammed Eslami, Hamed Eramian, Marcio Gameiro, W. Kalies, Konstantin Mischaikow · 14 Jun 2021
• Understanding Deflation Process in Over-parametrized Tensor Decomposition
  Rong Ge, Y. Ren, Xiang Wang, Mo Zhou · 11 Jun 2021
• The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective
  Geoff Pleiss, John P. Cunningham · 11 Jun 2021
• On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting
  Shunta Akiyama, Taiji Suzuki · MLT · 11 Jun 2021
• Ghosts in Neural Networks: Existence, Structure and Role of Infinite-Dimensional Null Space
  Sho Sonoda, Isao Ishikawa, Masahiro Ikeda · BDL · 09 Jun 2021
• LEADS: Learning Dynamical Systems that Generalize Across Environments
  Yuan Yin, Ibrahim Ayed, Emmanuel de Bézenac, Nicolas Baskiotis, Patrick Gallinari · OOD · 08 Jun 2021
• The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization
  Mufan Li, Mihai Nica, Daniel M. Roy · 07 Jun 2021
• Heavy Tails in SGD and Compressibility of Overparametrized Neural Networks
  Melih Barsbey, Milad Sefidgaran, Murat A. Erdogdu, Gaël Richard, Umut Simsekli · 07 Jun 2021
• Redundant representations help generalization in wide neural networks
  Diego Doimo, Aldo Glielmo, Sebastian Goldt, A. Laio · AI4CE · 07 Jun 2021
• Stochastic gradient descent with noise of machine learning type. Part II: Continuous time analysis
  Stephan Wojtowytsch · 04 Jun 2021
• Neural Collapse Under MSE Loss: Proximity to and Dynamics on the Central Path
  X. Y. Han, Vardan Papyan, D. Donoho · AAML · 03 Jun 2021
• Embedding Principle of Loss Landscape of Deep Neural Networks
  Yaoyu Zhang, Zhongwang Zhang, Tao Luo, Z. Xu · 30 May 2021
• Overparameterization of deep ResNet: zero loss and mean-field analysis
  Zhiyan Ding, Shi Chen, Qin Li, S. Wright · ODL · 30 May 2021
• Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances
  Berfin Şimşek, François Ged, Arthur Jacot, Francesco Spadaro, Clément Hongler, W. Gerstner, Johanni Brea · AI4CE · 25 May 2021
• Towards Understanding the Condensation of Neural Networks at Initial Training
  Hanxu Zhou, Qixuan Zhou, Tao Luo, Yaoyu Zhang, Z. Xu · MLT, AI4CE · 25 May 2021
• Frank-Wolfe Methods in Probability Space
  Carson Kent, Jose H. Blanchet, Peter Glynn · 11 May 2021
• Global Convergence of Three-layer Neural Networks in the Mean Field Regime
  H. Pham, Phan-Minh Nguyen · MLT, AI4CE · 11 May 2021
• Relative stability toward diffeomorphisms indicates performance in deep nets
  Leonardo Petrini, Alessandro Favero, Mario Geiger, M. Wyart · OOD · 06 May 2021
• Two-layer neural networks with values in a Banach space
  Yury Korolev · 05 May 2021
• Universal scaling laws in the gradient descent training of neural networks
  Maksim Velikanov, Dmitry Yarotsky · 02 May 2021
• One-pass Stochastic Gradient Descent in Overparametrized Two-layer Neural Networks
  Hanjing Zhu · MLT · 01 May 2021