On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
Lénaïc Chizat, Francis R. Bach
arXiv:1805.09545 · OT · 24 May 2018
Papers citing "On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport" (50 of 483 papers shown)
Central Limit Theorem for Bayesian Neural Network trained with Variational Inference
Arnaud Descours, Tom Huix, Arnaud Guillin, Manon Michel, Eric Moulines, Boris Nectoux · 10 Jun 2024

Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
D. Kunin, Allan Raventós, Clémentine Dominé, Feng Chen, David Klindt, Andrew M. Saxe, Surya Ganguli · MLT · 10 Jun 2024

Error Bounds of Supervised Classification from Information-Theoretic Perspective
Binchuan Qi, Wei Gong, Li Li · 07 Jun 2024

Online Learning and Information Exponents: On The Importance of Batch size, and Time/Complexity Tradeoffs
Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan · 04 Jun 2024

Tilting the Odds at the Lottery: the Interplay of Overparameterisation and Curricula in Neural Networks
Stefano Sarao Mannelli, Yaraslau Ivashinka, Andrew M. Saxe, Luca Saglietti · 03 Jun 2024

Wasserstein gradient flow for optimal probability measure decomposition
Jiangze Han, Chris Ryan, Xin T. Tong · 03 Jun 2024

Symmetries in Overparametrized Neural Networks: A Mean-Field View
Javier Maass Martínez, Joaquin Fontbona · FedML, MLT · 30 May 2024

Diffeomorphic interpolation for efficient persistence-based topological optimization
Mathieu Carrière, Marc Theveneau, Théo Lacombe · 29 May 2024

Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes
Zhenfeng Tu, Santiago Aranguri, Arthur Jacot · 27 May 2024

Simplicity Bias of Two-Layer Networks beyond Linearly Separable Data
Nikita Tsoy, Nikola Konstantinov · 27 May 2024
Improved Particle Approximation Error for Mean Field Neural Networks
Atsushi Nitanda · 24 May 2024

Infinite Limits of Multi-head Transformer Dynamics
Blake Bordelon, Hamza Tahir Chaudhry, C. Pehlevan · AI4CE · 24 May 2024

Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Luca Pesce, Ludovic Stephan · 24 May 2024

Unified Universality Theorem for Deep and Shallow Joint-Group-Equivariant Machines
Sho Sonoda, Yuka Hashimoto, Isao Ishikawa, Masahiro Ikeda · 22 May 2024

Initialization is Critical to Whether Transformers Fit Composite Functions by Reasoning or Memorizing
Zhongwang Zhang, Pengxiao Lin, Zhiwei Wang, Yaoyu Zhang, Z. Xu · 08 May 2024

Convergence analysis of controlled particle systems arising in deep learning: from finite to infinite sample size
Huafu Liao, Alpár R. Mészáros, Chenchen Mou, Chao Zhou · 08 Apr 2024

Mean-field Analysis on Two-layer Neural Networks from a Kernel Perspective
Shokichi Takakura, Taiji Suzuki · MLT · 22 Mar 2024

Understanding the training of infinitely deep and wide ResNets with Conditional Optimal Transport
Raphael Barboni, Gabriel Peyré, François-Xavier Vialard · 19 Mar 2024

Generalization of Scaled Deep ResNets in the Mean-Field Regime
Yihang Chen, Fanghui Liu, Yiping Lu, Grigorios G. Chrysos, V. Cevher · 14 Mar 2024

Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations
Akshay Kumar, Jarvis Haupt · ODL · 12 Mar 2024
Analysis of Kernel Mirror Prox for Measure Optimization
Pavel Dvurechensky, Jia Jie Zhu · 29 Feb 2024

Learning Associative Memories with Gradient Descent
Vivien A. Cabannes, Berfin Simsek, A. Bietti · 28 Feb 2024

A unified Fourier slice method to derive ridgelet transform for a variety of depth-2 neural networks
Sho Sonoda, Isao Ishikawa, Masahiro Ikeda · 25 Feb 2024

On the dynamics of three-layer neural networks: initial condensation
Zheng-an Chen, Tao Luo · MLT, AI4CE · 25 Feb 2024

Directional Convergence Near Small Initializations and Saddles in Two-Homogeneous Neural Networks
Akshay Kumar, Jarvis Haupt · ODL · 14 Feb 2024

Depth Separation in Norm-Bounded Infinite-Width Neural Networks
Suzanna Parkinson, Greg Ongie, Rebecca Willett, Ohad Shamir, Nathan Srebro · MDE · 13 Feb 2024

Mirror Descent-Ascent for mean-field min-max problems
Razvan-Andrei Lascu, Mateusz B. Majka, Lukasz Szpruch · 12 Feb 2024

Sampling from the Mean-Field Stationary Distribution
Yunbum Kook, Matthew Shunshi Zhang, Sinho Chewi, Murat A. Erdogdu, Mufan Bill Li · 12 Feb 2024

Generalization Error of Graph Neural Networks in the Mean-field Regime
Gholamali Aminian, Yixuan He, Gesine Reinert, Lukasz Szpruch, Samuel N. Cohen · 10 Feb 2024

Asymptotics of feature learning in two-layer networks after one gradient-step
Hugo Cui, Luca Pesce, Yatin Dandi, Florent Krzakala, Yue M. Lu, Lenka Zdeborová, Bruno Loureiro · MLT · 07 Feb 2024
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Yatin Dandi, Emanuele Troiani, Luca Arnaboldi, Luca Pesce, Lenka Zdeborová, Florent Krzakala · MLT · 05 Feb 2024

Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features
Simone Bombari, Marco Mondelli · 05 Feb 2024

C*-Algebraic Machine Learning: Moving in a New Direction
Yuka Hashimoto, Masahiro Ikeda, Hachem Kadri · 04 Feb 2024

Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape
Juno Kim, Taiji Suzuki · 02 Feb 2024

Privacy-preserving data release leveraging optimal transport and particle gradient descent
Konstantin Donhauser, Javier Abad, Neha Hulkund, Fanny Yang · 31 Jan 2024

A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models
Namjoon Suh, Guang Cheng · MedIm · 14 Jan 2024

Hidden Minima in Two-Layer ReLU Networks
Yossi Arjevani · 28 Dec 2023

Mean-field underdamped Langevin dynamics and its spacetime discretization
Qiang Fu, Ashia Wilson · 26 Dec 2023

A note on regularised NTK dynamics with an application to PAC-Bayesian training
Eugenio Clerico, Benjamin Guedj · 20 Dec 2023

Enhancing Neural Training via a Correlated Dynamics Model
Jonathan Brokman, Roy Betser, Rotem Turjeman, Tom Berkov, I. Cohen, Guy Gilboa · 20 Dec 2023
A mathematical perspective on Transformers
Borjan Geshkovski, Cyril Letrouit, Yury Polyanskiy, Philippe Rigollet · EDL, AI4CE · 17 Dec 2023

FastPart: Over-Parameterized Stochastic Gradient Descent for Sparse optimisation on Measures
Yohann De Castro, S. Gadat, C. Marteau · 10 Dec 2023

Learning a Sparse Representation of Barron Functions with the Inverse Scale Space Flow
T. J. Heeringa, Tim Roith, Christoph Brune, Martin Burger · 05 Dec 2023

Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems
Juno Kim, Kakei Yamamoto, Kazusato Oko, Zhuoran Yang, Taiji Suzuki · 02 Dec 2023

The Feature Speed Formula: a flexible approach to scale hyper-parameters of deep neural networks
Lénaïc Chizat, Praneeth Netrapalli · 30 Nov 2023

A convergence result of a continuous model of deep learning via Łojasiewicz–Simon inequality
Noboru Isobe · 26 Nov 2023

Eliminating Domain Bias for Federated Learning in Representation Space
Jianqing Zhang, Yang Hua, Jian Cao, Hao Wang, Tao Song, Zhengui Xue, Ruhui Ma, Haibing Guan · FedML · 25 Nov 2023

Minimum norm interpolation by perceptra: Explicit regularization and implicit bias
Jiyoung Park, Ian Pelakh, Stephan Wojtowytsch · 10 Nov 2023

On the Impact of Overparameterization on the Training of a Shallow Neural Network in High Dimensions
Simon Martin, Francis Bach, Giulio Biroli · 07 Nov 2023

Minimizing Convex Functionals over Space of Probability Measures via KL Divergence Gradient Flow
Rentian Yao, Linjun Huang, Yun Yang · 01 Nov 2023