ResearchTrend.AI
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport

24 May 2018
Lénaïc Chizat
Francis R. Bach
    OT

Papers citing "On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport"

50 / 483 papers shown
Title
Proving Linear Mode Connectivity of Neural Networks via Optimal Transport
Damien Ferbach
Baptiste Goujaud
Gauthier Gidel
Aymeric Dieuleveut
MoMe
21
16
0
29 Oct 2023
When can transformers reason with abstract symbols?
Enric Boix-Adserà
Omid Saremi
Emmanuel Abbe
Samy Bengio
Etai Littwin
Josh Susskind
LRM
NAI
31
12
0
15 Oct 2023
Accelerating optimization over the space of probability measures
Shi Chen
Wenxuan Wu
Yuhang Yao
Stephen J. Wright
29
4
0
06 Oct 2023
Sampling via Gradient Flows in the Space of Probability Measures
Yifan Chen
Daniel Zhengyu Huang
Jiaoyang Huang
Sebastian Reich
Andrew M. Stuart
30
13
0
05 Oct 2023
Joint Group Invariant Functions on Data-Parameter Domain Induce Universal Neural Networks
Sho Sonoda
Hideyuki Ishi
Isao Ishikawa
Masahiro Ikeda
16
4
0
05 Oct 2023
Deep Ridgelet Transform: Voice with Koopman Operator Proves Universality of Formal Deep Networks
Sho Sonoda
Yuka Hashimoto
Isao Ishikawa
Masahiro Ikeda
27
3
0
05 Oct 2023
Tensor Programs VI: Feature Learning in Infinite-Depth Neural Networks
Greg Yang
Dingli Yu
Chen Zhu
Soufiane Hayou
MLT
8
27
0
03 Oct 2023
Spectral Neural Networks: Approximation Theory and Optimization Landscape
Chenghui Li
Rishi Sonthalia
Nicolas García Trillos
27
1
0
01 Oct 2023
JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention
Yuandong Tian
Yiping Wang
Zhenyu (Allen) Zhang
Beidi Chen
Simon S. Du
34
35
0
01 Oct 2023
Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit
Blake Bordelon
Lorenzo Noci
Mufan Bill Li
Boris Hanin
C. Pehlevan
32
23
0
28 Sep 2023
Beyond Log-Concavity: Theory and Algorithm for Sum-Log-Concave Optimization
Mastane Achab
33
1
0
26 Sep 2023
Global Convergence of SGD For Logistic Loss on Two Layer Neural Nets
Pulkit Gopalani
Samyak Jha
Anirbit Mukherjee
19
2
0
17 Sep 2023
Gradient-Based Feature Learning under Structured Data
Alireza Mousavi-Hosseini
Denny Wu
Taiji Suzuki
Murat A. Erdogdu
MLT
37
18
0
07 Sep 2023
Kernel Limit of Recurrent Neural Networks Trained on Ergodic Data Sequences
Samuel Chun-Hei Lam
Justin A. Sirignano
K. Spiliopoulos
30
2
0
28 Aug 2023
Six Lectures on Linearized Neural Networks
Theodor Misiakiewicz
Andrea Montanari
39
12
0
25 Aug 2023
Nonlinear Hamiltonian Monte Carlo & its Particle Approximation
Nawaf Bou-Rabee
Katharina Schuh
23
7
0
22 Aug 2023
Local Kernel Renormalization as a mechanism for feature learning in overparametrized Convolutional Neural Networks
R. Aiudi
R. Pacelli
A. Vezzani
R. Burioni
P. Rotondo
MLT
21
15
0
21 Jul 2023
What can a Single Attention Layer Learn? A Study Through the Random Features Lens
Hengyu Fu
Tianyu Guo
Yu Bai
Song Mei
MLT
35
22
0
21 Jul 2023
Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks
Liam Collins
Hamed Hassani
Mahdi Soltanolkotabi
Aryan Mokhtari
Sanjay Shakkottai
39
10
0
13 Jul 2023
Quantitative CLTs in Deep Neural Networks
Stefano Favaro
Boris Hanin
Domenico Marinucci
I. Nourdin
G. Peccati
BDL
28
11
0
12 Jul 2023
Fundamental limits of overparametrized shallow neural networks for supervised learning
Francesco Camilli
D. Tieplova
Jean Barbier
35
9
0
11 Jul 2023
Law of Large Numbers for Bayesian two-layer Neural Network trained with Variational Inference
Arnaud Descours
Tom Huix
Arnaud Guillin
Manon Michel
Eric Moulines
Boris Nectoux
BDL
32
1
0
10 Jul 2023
Neural Hilbert Ladders: Multi-Layer Neural Networks in Function Space
Zhengdao Chen
41
1
0
03 Jul 2023
The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit
Lorenzo Noci
Chuning Li
Mufan Bill Li
Bobby He
Thomas Hofmann
Chris J. Maddison
Daniel M. Roy
33
29
0
30 Jun 2023
The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions
Nishil Patel
Sebastian Lee
Stefano Sarao Mannelli
Sebastian Goldt
Andrew Saxe
OffRL
28
3
0
17 Jun 2023
Gradient is All You Need?
Konstantin Riedl
T. Klock
Carina Geldhauser
M. Fornasier
27
6
0
16 Jun 2023
Implicit Compressibility of Overparametrized Neural Networks Trained with Heavy-Tailed SGD
Yijun Wan
Melih Barsbey
A. Zaidi
Umut Simsekli
30
1
0
13 Jun 2023
Convergence of mean-field Langevin dynamics: Time and space discretization, stochastic gradient, and variance reduction
Taiji Suzuki
Denny Wu
Atsushi Nitanda
32
16
0
12 Jun 2023
How Two-Layer Neural Networks Learn, One (Giant) Step at a Time
Yatin Dandi
Florent Krzakala
Bruno Loureiro
Luca Pesce
Ludovic Stephan
MLT
34
26
0
29 May 2023
A Rainbow in Deep Network Black Boxes
Florentin Guth
Brice Ménard
G. Rochette
S. Mallat
24
10
0
29 May 2023
Escaping mediocrity: how two-layer networks learn hard generalized linear models with SGD
Luca Arnaboldi
Florent Krzakala
Bruno Loureiro
Ludovic Stephan
MLT
33
3
0
29 May 2023
Feature-Learning Networks Are Consistent Across Widths At Realistic Scales
Nikhil Vyas
Alexander B. Atanasov
Blake Bordelon
Depen Morwani
Sabarish Sainathan
C. Pehlevan
26
22
0
28 May 2023
Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks
Puyu Wang
Yunwen Lei
Di Wang
Yiming Ying
Ding-Xuan Zhou
MLT
29
3
0
26 May 2023
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Yuandong Tian
Yiping Wang
Beidi Chen
S. Du
MLT
26
70
0
25 May 2023
Tight conditions for when the NTK approximation is valid
Enric Boix-Adserà
Etai Littwin
30
0
0
22 May 2023
Understanding the Initial Condensation of Convolutional Neural Networks
Zhangchen Zhou
Hanxu Zhou
Yuqing Li
Zhi-Qin John Xu
MLT
AI4CE
26
5
0
17 May 2023
Scalable Optimal Transport Methods in Machine Learning: A Contemporary Survey
Abdelwahed Khamis
Russell Tsuchida
Mohamed Tarek
V. Rolland
Lars Petersson
OT
45
12
0
08 May 2023
Expand-and-Cluster: Parameter Recovery of Neural Networks
Flavio Martinelli
Berfin Simsek
W. Gerstner
Johanni Brea
26
4
0
25 Apr 2023
Leveraging the two timescale regime to demonstrate convergence of neural networks
P. Marion
Raphael Berthier
36
5
0
19 Apr 2023
Convergence of stochastic gradient descent under a local Lojasiewicz condition for deep neural networks
Jing An
Jianfeng Lu
16
4
0
18 Apr 2023
Performative Prediction with Neural Networks
Mehrnaz Mofakhami
Ioannis Mitliagkas
Gauthier Gidel
40
16
0
14 Apr 2023
Full Gradient Deep Reinforcement Learning for Average-Reward Criterion
Tejas Pagare
Vivek Borkar
Konstantin Avrachenkov
24
4
0
07 Apr 2023
Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks
Blake Bordelon
C. Pehlevan
MLT
38
29
0
06 Apr 2023
Depth Separation with Multilayer Mean-Field Networks
Y. Ren
Mo Zhou
Rong Ge
OOD
19
3
0
03 Apr 2023
High-dimensional scaling limits and fluctuations of online least-squares SGD with smooth covariance
Krishnakumar Balasubramanian
Promit Ghosal
Ye He
38
5
0
03 Apr 2023
Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality
François Ged
M. H. Veiga
28
0
0
22 Mar 2023
Global Optimality of Elman-type RNN in the Mean-Field Regime
Andrea Agazzi
Jian-Xiong Lu
Sayan Mukherjee
MLT
34
1
0
12 Mar 2023
Phase Diagram of Initial Condensation for Two-layer Neural Networks
Zheng Chen
Yuqing Li
Tao Luo
Zhaoguang Zhou
Z. Xu
MLT
AI4CE
49
8
0
12 Mar 2023
On the Implicit Bias of Linear Equivariant Steerable Networks
Ziyu Chen
Wei-wei Zhu
29
3
0
07 Mar 2023
Primal and Dual Analysis of Entropic Fictitious Play for Finite-sum Problems
Atsushi Nitanda
Kazusato Oko
Denny Wu
Nobuhito Takenouchi
Taiji Suzuki
32
3
0
06 Mar 2023