ResearchTrend.AI
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport

24 May 2018
Lénaïc Chizat
Francis R. Bach
    OT

Papers citing "On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport"

50 / 483 papers shown
Title
Central Limit Theorem for Bayesian Neural Network trained with Variational Inference
Arnaud Descours
Tom Huix
Arnaud Guillin
Manon Michel
Eric Moulines
Boris Nectoux
33
0
0
10 Jun 2024
Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
D. Kunin
Allan Raventós
Clémentine Dominé
Feng Chen
David Klindt
Andrew M. Saxe
Surya Ganguli
MLT
48
15
0
10 Jun 2024
Error Bounds of Supervised Classification from Information-Theoretic Perspective
Binchuan Qi
Wei Gong
Li Li
34
0
0
07 Jun 2024
Online Learning and Information Exponents: On The Importance of Batch size, and Time/Complexity Tradeoffs
Luca Arnaboldi
Yatin Dandi
Florent Krzakala
Bruno Loureiro
Luca Pesce
Ludovic Stephan
47
1
0
04 Jun 2024
Tilting the Odds at the Lottery: the Interplay of Overparameterisation and Curricula in Neural Networks
Stefano Sarao Mannelli
Yaraslau Ivashinka
Andrew M. Saxe
Luca Saglietti
42
2
0
03 Jun 2024
Wasserstein gradient flow for optimal probability measure decomposition
Jiangze Han
Chris Ryan
Xin T. Tong
23
1
0
03 Jun 2024
Symmetries in Overparametrized Neural Networks: A Mean-Field View
Javier Maass Martínez
Joaquin Fontbona
FedML
MLT
50
2
0
30 May 2024
Diffeomorphic interpolation for efficient persistence-based topological optimization
Mathieu Carrière
Marc Theveneau
Théo Lacombe
21
1
0
29 May 2024
Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes
Zhenfeng Tu
Santiago Aranguri
Arthur Jacot
31
8
0
27 May 2024
Simplicity Bias of Two-Layer Networks beyond Linearly Separable Data
Nikita Tsoy
Nikola Konstantinov
37
4
0
27 May 2024
Improved Particle Approximation Error for Mean Field Neural Networks
Atsushi Nitanda
21
6
0
24 May 2024
Infinite Limits of Multi-head Transformer Dynamics
Blake Bordelon
Hamza Tahir Chaudhry
C. Pehlevan
AI4CE
47
9
0
24 May 2024
Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
Luca Arnaboldi
Yatin Dandi
Florent Krzakala
Luca Pesce
Ludovic Stephan
70
12
0
24 May 2024
Unified Universality Theorem for Deep and Shallow Joint-Group-Equivariant Machines
Sho Sonoda
Yuka Hashimoto
Isao Ishikawa
Masahiro Ikeda
39
0
0
22 May 2024
Initialization is Critical to Whether Transformers Fit Composite Functions by Reasoning or Memorizing
Zhongwang Zhang
Pengxiao Lin
Zhiwei Wang
Yaoyu Zhang
Z. Xu
39
3
0
08 May 2024
Convergence analysis of controlled particle systems arising in deep learning: from finite to infinite sample size
Huafu Liao
Alpár R. Mészáros
Chenchen Mou
Chao Zhou
26
2
0
08 Apr 2024
Mean-field Analysis on Two-layer Neural Networks from a Kernel Perspective
Shokichi Takakura
Taiji Suzuki
MLT
22
5
0
22 Mar 2024
Understanding the training of infinitely deep and wide ResNets with Conditional Optimal Transport
Raphael Barboni
Gabriel Peyré
François-Xavier Vialard
37
3
0
19 Mar 2024
Generalization of Scaled Deep ResNets in the Mean-Field Regime
Yihang Chen
Fanghui Liu
Yiping Lu
Grigorios G. Chrysos
V. Cevher
41
2
0
14 Mar 2024
Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations
Akshay Kumar
Jarvis Haupt
ODL
44
3
0
12 Mar 2024
Analysis of Kernel Mirror Prox for Measure Optimization
Pavel Dvurechensky
Jia Jie Zhu
31
2
0
29 Feb 2024
Learning Associative Memories with Gradient Descent
Vivien A. Cabannes
Berfin Simsek
A. Bietti
38
6
0
28 Feb 2024
A unified Fourier slice method to derive ridgelet transform for a variety of depth-2 neural networks
Sho Sonoda
Isao Ishikawa
Masahiro Ikeda
49
4
0
25 Feb 2024
On the dynamics of three-layer neural networks: initial condensation
Zheng-an Chen
Tao Luo
MLT
AI4CE
22
3
0
25 Feb 2024
Directional Convergence Near Small Initializations and Saddles in Two-Homogeneous Neural Networks
Akshay Kumar
Jarvis Haupt
ODL
30
7
0
14 Feb 2024
Depth Separation in Norm-Bounded Infinite-Width Neural Networks
Suzanna Parkinson
Greg Ongie
Rebecca Willett
Ohad Shamir
Nathan Srebro
MDE
50
2
0
13 Feb 2024
Mirror Descent-Ascent for mean-field min-max problems
Razvan-Andrei Lascu
Mateusz B. Majka
Lukasz Szpruch
27
1
0
12 Feb 2024
Sampling from the Mean-Field Stationary Distribution
Yunbum Kook
Matthew Shunshi Zhang
Sinho Chewi
Murat A. Erdogdu
Mufan Bill Li
64
7
0
12 Feb 2024
Generalization Error of Graph Neural Networks in the Mean-field Regime
Gholamali Aminian
Yixuan He
Gesine Reinert
Lukasz Szpruch
Samuel N. Cohen
48
3
0
10 Feb 2024
Asymptotics of feature learning in two-layer networks after one gradient-step
Hugo Cui
Luca Pesce
Yatin Dandi
Florent Krzakala
Yue M. Lu
Lenka Zdeborová
Bruno Loureiro
MLT
58
16
0
07 Feb 2024
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Yatin Dandi
Emanuele Troiani
Luca Arnaboldi
Luca Pesce
Lenka Zdeborová
Florent Krzakala
MLT
66
26
0
05 Feb 2024
Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features
Simone Bombari
Marco Mondelli
39
3
0
05 Feb 2024
$C^*$-Algebraic Machine Learning: Moving in a New Direction
Yuka Hashimoto
Masahiro Ikeda
Hachem Kadri
35
2
0
04 Feb 2024
Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape
Juno Kim
Taiji Suzuki
18
18
0
02 Feb 2024
Privacy-preserving data release leveraging optimal transport and particle gradient descent
Konstantin Donhauser
Javier Abad
Neha Hulkund
Fanny Yang
41
4
0
31 Jan 2024
A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models
Namjoon Suh
Guang Cheng
MedIm
30
12
0
14 Jan 2024
Hidden Minima in Two-Layer ReLU Networks
Yossi Arjevani
32
3
0
28 Dec 2023
Mean-field underdamped Langevin dynamics and its spacetime discretization
Qiang Fu
Ashia Wilson
40
4
0
26 Dec 2023
A note on regularised NTK dynamics with an application to PAC-Bayesian training
Eugenio Clerico
Benjamin Guedj
33
0
0
20 Dec 2023
Enhancing Neural Training via a Correlated Dynamics Model
Jonathan Brokman
Roy Betser
Rotem Turjeman
Tom Berkov
I. Cohen
Guy Gilboa
24
3
0
20 Dec 2023
A mathematical perspective on Transformers
Borjan Geshkovski
Cyril Letrouit
Yury Polyanskiy
Philippe Rigollet
EDL
AI4CE
42
36
0
17 Dec 2023
FastPart: Over-Parameterized Stochastic Gradient Descent for Sparse optimisation on Measures
Yohann De Castro
S. Gadat
C. Marteau
13
0
0
10 Dec 2023
Learning a Sparse Representation of Barron Functions with the Inverse Scale Space Flow
T. J. Heeringa
Tim Roith
Christoph Brune
Martin Burger
18
0
0
05 Dec 2023
Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems
Juno Kim
Kakei Yamamoto
Kazusato Oko
Zhuoran Yang
Taiji Suzuki
34
9
0
02 Dec 2023
The Feature Speed Formula: a flexible approach to scale hyper-parameters of deep neural networks
Lénaïc Chizat
Praneeth Netrapalli
20
4
0
30 Nov 2023
A convergence result of a continuous model of deep learning via Łojasiewicz–Simon inequality
Noboru Isobe
16
2
0
26 Nov 2023
Eliminating Domain Bias for Federated Learning in Representation Space
Jianqing Zhang
Yang Hua
Jian Cao
Hao Wang
Tao Song
Zhengui Xue
Ruhui Ma
Haibing Guan
FedML
73
33
0
25 Nov 2023
Minimum norm interpolation by perceptra: Explicit regularization and implicit bias
Jiyoung Park
Ian Pelakh
Stephan Wojtowytsch
45
2
0
10 Nov 2023
On the Impact of Overparameterization on the Training of a Shallow Neural Network in High Dimensions
Simon Martin
Francis Bach
Giulio Biroli
23
9
0
07 Nov 2023
Minimizing Convex Functionals over Space of Probability Measures via KL Divergence Gradient Flow
Rentian Yao
Linjun Huang
Yun Yang
21
3
0
01 Nov 2023