On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport

arXiv:1805.09545 · 24 May 2018 · OT
Lénaïc Chizat, Francis R. Bach

Papers citing "On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport"

Showing 50 of 483 citing papers.

On the existence of minimizers in shallow residual ReLU neural network optimization landscapes
Steffen Dereich, Arnulf Jentzen, Sebastian Kassing
28 Feb 2023

Learning time-scales in two-layers neural networks
Raphael Berthier, Andrea Montanari, Kangjie Zhou
28 Feb 2023

SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics
Emmanuel Abbe, Enric Boix-Adserà, Theodor Misiakiewicz
21 Feb 2023 · FedML, MLT

Gradient Flows for Sampling: Mean-Field Models, Gaussian Approximations and Affine Invariance
Yifan Chen, Daniel Zhengyu Huang, Jiaoyang Huang, Sebastian Reich, Andrew M. Stuart
21 Feb 2023

ωPAP Spaces: Reasoning Denotationally About Higher-Order, Recursive Probabilistic and Differentiable Programs
Mathieu Huot, Alexander K. Lew, Vikash K. Mansinghka, S. Staton
21 Feb 2023

Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Weihang Xu, S. Du
20 Feb 2023

Spatially heterogeneous learning by a deep student machine
H. Yoshino
15 Feb 2023

The Geometry of Neural Nets' Parameter Spaces Under Reparametrization
Agustinus Kristiadi, Felix Dangel, Philipp Hennig
14 Feb 2023

Stochastic Modified Flows, Mean-Field Limits and Dynamics of Stochastic Gradient Descent
Benjamin Gess, Sebastian Kassing, Vitalii Konarovskyi
14 Feb 2023 · DiffM

Mean Field Optimization Problem Regularized by Fisher Information
J. Claisse, Giovanni Conforti, Zhenjie Ren, Song-bo Wang
12 Feb 2023

From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks
Luca Arnaboldi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro
12 Feb 2023 · MLT

Efficient displacement convex optimization with particle gradient descent
Hadi Daneshmand, J. Lee, Chi Jin
09 Feb 2023

Rethinking Gauss-Newton for learning over-parameterized models
Michael Arbel, Romain Menegaux, Pierre Wolinski
06 Feb 2023 · AI4CE

Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning
François Caron, Fadhel Ayed, Paul Jung, Hoileong Lee, Juho Lee, Hongseok Yang
02 Feb 2023

Simplicity Bias in 1-Hidden Layer Neural Networks
Depen Morwani, Jatin Batra, Prateek Jain, Praneeth Netrapalli
01 Feb 2023

Dynamic Flows on Curved Space Generated by Labeled Data
Xinru Hua, Truyen V. Nguyen, Tam Le, Jose H. Blanchet, Viet Anh Nguyen
31 Jan 2023

Deep networks for system identification: a Survey
G. Pillonetto, Aleksandr Aravkin, Daniel Gedon, L. Ljung, Antônio H. Ribeiro, Thomas B. Schön
30 Jan 2023 · OOD

On adversarial robustness and the use of Wasserstein ascent-descent dynamics to enforce it
Camilo A. Garcia Trillos, Nicolas García Trillos
09 Jan 2023

Learning Gaussian Mixtures Using the Wasserstein-Fisher-Rao Gradient Flow
Yuling Yan, Kaizheng Wang, Philippe Rigollet
04 Jan 2023

Pruning Before Training May Improve Generalization, Provably
Hongru Yang, Yingbin Liang, Xiaojie Guo, Lingfei Wu, Zhangyang Wang
01 Jan 2023 · MLT

Bayesian Interpolation with Deep Linear Networks
Boris Hanin, Alexander Zlokapa
29 Dec 2022

Selected aspects of complex, hypercomplex and fuzzy neural networks
A. Niemczynowicz, R. Kycia, Maciej Jaworski, A. Siemaszko, J. Calabuig, ..., Baruch Schneider, Diana Berseghyan, Irina Perfiljeva, V. Novák, Piotr Artiemjew
29 Dec 2022

A Mathematical Framework for Learning Probability Distributions
Hongkang Yang
22 Dec 2022

Two-Scale Gradient Descent Ascent Dynamics Finds Mixed Nash Equilibria of Continuous Games: A Mean-Field Perspective
Yulong Lu
17 Dec 2022 · MLT, AI4CE

Learning threshold neurons via the "edge of stability"
Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Y. Lee, Felipe Suarez, Yi Zhang
14 Dec 2022 · MLT

Uniform-in-time propagation of chaos for mean field Langevin dynamics
Fan Chen, Zhenjie Ren, Song-bo Wang
06 Dec 2022

Proximal methods for point source localisation
T. Valkonen
06 Dec 2022

Infinite-width limit of deep linear neural networks
Lénaïc Chizat, Maria Colombo, Xavier Fernández-Real, Alessio Figalli
29 Nov 2022

Unbalanced Optimal Transport, from Theory to Numerics
Thibault Séjourné, Gabriel Peyré, François-Xavier Vialard
16 Nov 2022 · OT

On the symmetries in the dynamics of wide two-layer neural networks
Karl Hajjar, Lénaïc Chizat
16 Nov 2022

Spectral Evolution and Invariance in Linear-width Neural Networks
Zhichao Wang, A. Engel, Anand D. Sarwate, Ioana Dumitriu, Tony Chiang
11 Nov 2022

Regression as Classification: Influence of Task Formulation on Neural Network Features
Lawrence Stewart, Francis R. Bach, Quentin Berthet, Jean-Philippe Vert
10 Nov 2022

A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks
Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna
28 Oct 2022 · MLT

Stochastic Mirror Descent in Average Ensemble Models
Taylan Kargin, Fariborz Salehi, B. Hassibi
27 Oct 2022

Proximal Mean Field Learning in Shallow Neural Networks
Alexis M. H. Teter, Iman Nodozi, A. Halder
25 Oct 2022 · FedML

Global Convergence of SGD On Two Layer Neural Nets
Pulkit Gopalani, Anirbit Mukherjee
20 Oct 2022

Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence
Diyuan Wu, Vyacheslav Kungurtsev, Marco Mondelli
13 Oct 2022

Annihilation of Spurious Minima in Two-Layer ReLU Networks
Yossi Arjevani, M. Field
12 Oct 2022

Meta-Principled Family of Hyperparameter Scaling Strategies
Sho Yaida
10 Oct 2022

Analysis of the rate of convergence of an over-parametrized deep neural network estimate learned by gradient descent
Michael Kohler, A. Krzyżak
04 Oct 2022

Plateau in Monotonic Linear Interpolation -- A "Biased" View of Loss Landscape for Deep Networks
Xiang Wang, Annie Wang, Mo Zhou, Rong Ge
03 Oct 2022 · MoMe

Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions
Arthur Jacot
29 Sep 2022

Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Alireza Mousavi-Hosseini, Sejun Park, M. Girotti, Ioannis Mitliagkas, Murat A. Erdogdu
29 Sep 2022 · MLT

Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty
Thomas George, Guillaume Lajoie, A. Baratin
19 Sep 2022

Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher
15 Sep 2022

Git Re-Basin: Merging Models modulo Permutation Symmetries
Samuel K. Ainsworth, J. Hayase, S. Srinivasa
11 Sep 2022 · MoMe

On the universal consistency of an over-parametrized deep neural network estimate learned by gradient descent
Selina Drews, Michael Kohler
30 Aug 2022

Agnostic Learning of General ReLU Activation Using Gradient Descent
Pranjal Awasthi, Alex K. Tang, Aravindan Vijayaraghavan
04 Aug 2022 · MLT

Gradient descent provably escapes saddle points in the training of shallow ReLU networks
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek
03 Aug 2022

Variational Inference of overparameterized Bayesian Neural Networks: a theoretical and empirical study
Tom Huix, Szymon Majewski, Alain Durmus, Eric Moulines, Anna Korba
08 Jul 2022 · BDL