On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport

24 May 2018
Lénaïc Chizat
Francis R. Bach
    OT

Papers citing "On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport"

50 / 483 papers shown
Convergence of Time-Averaged Mean Field Gradient Descent Dynamics for Continuous Multi-Player Zero-Sum Games
Yulong Lu
Pierre Monmarché
MLT
34
0
0
12 May 2025
Ergodic Generative Flows
Leo Maxime Brunswic
Mateo Clemente
Rui Heng Yang
Adam Sigal
Amir Rasouli
Yinchuan Li
42
0
0
06 May 2025
Information-theoretic reduction of deep neural networks to linear models in the overparametrized proportional regime
Francesco Camilli
D. Tieplova
Eleonora Bergamin
Jean Barbier
109
0
0
06 May 2025
Mirror Mean-Field Langevin Dynamics
Anming Gu
Juno Kim
31
0
0
05 May 2025
Don't be lazy: CompleteP enables compute-efficient deep transformers
Nolan Dey
Bin Claire Zhang
Lorenzo Noci
Mufan Bill Li
Blake Bordelon
Shane Bergsma
C. Pehlevan
Boris Hanin
Joel Hestness
39
0
0
02 May 2025
Ultra-fast feature learning for the training of two-layer neural networks in the two-timescale regime
Raphael Barboni
Gabriel Peyré
François-Xavier Vialard
MLT
34
0
0
25 Apr 2025
An overview of condensation phenomenon in deep learning
Zhi-Qin John Xu
Yaoyu Zhang
Zhangchen Zhou
AI4CE
29
0
0
13 Apr 2025
Statistically guided deep learning
Michael Kohler
A. Krzyżak
ODL
BDL
76
0
0
11 Apr 2025
Fractal and Regular Geometry of Deep Neural Networks
Simmaco Di Lillo
Domenico Marinucci
Michele Salvi
S. Vigogna
MDE
AI4CE
36
0
0
08 Apr 2025
Survey on Algorithms for multi-index models
Joan Bruna
Daniel Hsu
31
0
0
07 Apr 2025
Towards Understanding the Optimization Mechanisms in Deep Learning
Binchuan Qi
Wei Gong
Li Li
47
0
0
29 Mar 2025
Beyond Propagation of Chaos: A Stochastic Algorithm for Mean Field Optimization
Chandan Tankala
Dheeraj M. Nagaraj
Anant Raj
44
0
0
17 Mar 2025
The Spectral Bias of Shallow Neural Network Learning is Shaped by the Choice of Non-linearity
Justin Sahs
Ryan Pyle
Fabio Anselmi
Ankit B. Patel
60
0
0
13 Mar 2025
Global Convergence and Rich Feature Learning in L-Layer Infinite-Width Neural Networks under μP Parametrization
Zixiang Chen
Greg Yang
Qingyue Zhao
Q. Gu
MLT
50
0
0
12 Mar 2025
A Theory of Initialisation's Impact on Specialisation
Devon Jarvis
Sebastian Lee
Clémentine Dominé
Andrew M. Saxe
Stefano Sarao Mannelli
CLL
72
2
0
04 Mar 2025
DDEQs: Distributional Deep Equilibrium Models through Wasserstein Gradient Flows
Jonathan Geuter
Clément Bonet
Anna Korba
David Alvarez-Melis
61
0
0
03 Mar 2025
Convergence of Shallow ReLU Networks on Weakly Interacting Data
Léo Dana
Francis R. Bach
Loucas Pillaud-Vivien
MLT
52
1
0
24 Feb 2025
A Gap Between the Gaussian RKHS and Neural Networks: An Infinite-Center Asymptotic Analysis
Akash Kumar
Rahul Parhi
Mikhail Belkin
46
0
0
22 Feb 2025
Properties of Wasserstein Gradient Flows for the Sliced-Wasserstein Distance
Christophe Vauthier
Quentin Mérigot
Anna Korba
45
0
0
10 Feb 2025
Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble
Atsushi Nitanda
Anzelle Lee
Damian Tan Xing Kai
Mizuki Sakaguchi
Taiji Suzuki
AI4CE
61
1
0
09 Feb 2025
Curse of Dimensionality in Neural Network Optimization
Sanghoon Na
Haizhao Yang
56
0
0
07 Feb 2025
Geometry and Optimization of Shallow Polynomial Networks
Yossi Arjevani
Joan Bruna
Joe Kileel
Elzbieta Polak
Matthew Trager
36
1
0
10 Jan 2025
Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
Ziang Chen
Rong Ge
MLT
61
1
0
10 Jan 2025
Non-geodesically-convex optimization in the Wasserstein space
Hoang Phuc Hau Luu
Hanlin Yu
Bernardo Williams
Petrus Mikkola
Marcelo Hartmann
Kai Puolamaki
Arto Klami
53
2
0
08 Jan 2025
Linear convergence of proximal descent schemes on the Wasserstein space
Razvan-Andrei Lascu
Mateusz B. Majka
David Siska
Łukasz Szpruch
74
1
0
22 Nov 2024
Proportional infinite-width infinite-depth limit for deep linear neural networks
Federico Bassetti
Lucia Ladelli
P. Rotondo
75
1
0
22 Nov 2024
Emergence of meta-stable clustering in mean-field transformer models
Giuseppe Bruno
Federico Pasqualotto
Andrea Agazzi
45
6
0
30 Oct 2024
A Random Matrix Theory Perspective on the Spectrum of Learned Features and Asymptotic Generalization Capabilities
Yatin Dandi
Luca Pesce
Hugo Cui
Florent Krzakala
Yue M. Lu
Bruno Loureiro
MLT
37
1
0
24 Oct 2024
Robust Feature Learning for Multi-Index Models in High Dimensions
Alireza Mousavi-Hosseini
Adel Javanmard
Murat A. Erdogdu
OOD
AAML
44
1
0
21 Oct 2024
A Lipschitz spaces view of infinitely wide shallow neural networks
Francesca Bartolucci
Marcello Carioni
José A. Iglesias
Yury Korolev
Emanuele Naldi
S. Vigogna
23
0
0
18 Oct 2024
Loss Landscape Characterization of Neural Networks without Over-Parametrization
Rustem Islamov
Niccolò Ajroldi
Antonio Orvieto
Aurelien Lucchi
35
4
0
16 Oct 2024
Shallow diffusion networks provably learn hidden low-dimensional structure
Nicholas M. Boffi
Arthur Jacot
Stephen Tu
Ingvar M. Ziemann
DiffM
36
1
0
15 Oct 2024
Kinetic interacting particle system: parameter estimation from complete and partial discrete observations
Chiara Amorino
Vytautė Pilipauskaitė
21
1
0
14 Oct 2024
Extended convexity and smoothness and their applications in deep learning
Binchuan Qi
Wei Gong
Li Li
61
0
0
08 Oct 2024
The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander B. Atanasov
Alexandru Meterez
James B. Simon
C. Pehlevan
43
2
0
06 Oct 2024
LoGra-Med: Long Context Multi-Graph Alignment for Medical Vision-Language Model
Duy M. H. Nguyen
N. T. Diep
Trung Q. Nguyen
Hoang-Bao Le
Tai Nguyen
...
Pengtao Xie
Roger Wattenhofer
James Zhou
Daniel Sonntag
Mathias Niepert
VLM
55
3
0
03 Oct 2024
Simplicity bias and optimization threshold in two-layer ReLU networks
Etienne Boursier
Nicolas Flammarion
31
2
0
03 Oct 2024
Nonuniform random feature models using derivative information
Konstantin Pieper
Zezhong Zhang
Guannan Zhang
14
2
0
03 Oct 2024
Optimal Protocols for Continual Learning via Statistical Physics and Control Theory
Francesco Mori
Stefano Sarao Mannelli
Francesca Mignacco
36
3
0
26 Sep 2024
Optimal sequencing depth for single-cell RNA-sequencing in Wasserstein space
Jakwang Kim
Sharvaj Kubal
Geoffrey Schiebinger
24
1
0
22 Sep 2024
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Dominé
Nicolas Anguita
A. Proca
Lukas Braun
D. Kunin
P. Mediano
Andrew M. Saxe
38
3
0
22 Sep 2024
Graph Classification with GNNs: Optimisation, Representation and Inductive Bias
P. Krishna Kumar
H. G. Ramaswamy
29
0
0
17 Aug 2024
Absence of Closed-Form Descriptions for Gradient Flow in Two-Layer Narrow Networks
Yeachan Park
AI4CE
25
0
0
15 Aug 2024
Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics
Alireza Mousavi-Hosseini
Denny Wu
Murat A. Erdogdu
MLT
AI4CE
35
6
0
14 Aug 2024
A spring-block theory of feature learning in deep neural networks
Chengzhi Shi
Liming Pan
Ivan Dokmanić
AI4CE
40
1
0
28 Jul 2024
On the Complexity of Learning Sparse Functions with Statistical and Gradient Queries
Nirmit Joshi
Theodor Misiakiewicz
Nathan Srebro
26
6
0
08 Jul 2024
How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning
Arthur Jacot
Seok Hoan Choi
Yuxiao Wen
AI4CE
91
2
0
08 Jul 2024
Prospective Messaging: Learning in Networks with Communication Delays
Ryan Fayyazi
Christian Weilbach
Frank D. Wood
25
0
0
07 Jul 2024
Precise analysis of ridge interpolators under heavy correlations -- a Random Duality Theory view
Mihailo Stojnic
27
1
0
13 Jun 2024
Ridge interpolators in correlated factor regression models -- exact risk analysis
Mihailo Stojnic
20
1
0
13 Jun 2024