Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced

4 June 2018 (arXiv 1806.00900)
Simon S. Du, Wei Hu, Jason D. Lee
MLT
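
For context on the citation list below, the balancing result named in the paper's title can be stated as a conservation law for gradient flow (a sketch in standard notation, assuming layer weights W_1, ..., W_N and positively homogeneous activations such as ReLU; not quoted verbatim from the paper):

    % Under gradient flow \dot{W}_l = -\nabla_{W_l} L, differences of
    % squared layer norms are conserved:
    \frac{d}{dt}\Big( \lVert W_{l+1}(t) \rVert_F^2 - \lVert W_l(t) \rVert_F^2 \Big) = 0,
    \qquad l = 1, \dots, N-1,

so adjacent layers that start with equal squared Frobenius norms remain balanced throughout training.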

Papers citing "Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced"

50 of 74 citing papers shown.

The late-stage training dynamics of (stochastic) subgradient descent on homogeneous neural networks
Sholom Schechtman, Nicolas Schreuder
08 Feb 2025

An Invitation to Neuroalgebraic Geometry
Giovanni Luca Marchetti, Vahid Shahverdi, Stefano Mereta, Matthew Trager, Kathlén Kohn
31 Jan 2025

Geometry and Optimization of Shallow Polynomial Networks
Yossi Arjevani, Joan Bruna, Joe Kileel, Elzbieta Polak, Matthew Trager
10 Jan 2025

How Feature Learning Can Improve Neural Scaling Laws
Blake Bordelon, Alexander B. Atanasov, Cengiz Pehlevan
26 Sep 2024

Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks
Hristo Papazov, Scott Pesme, Nicolas Flammarion
08 Mar 2024

Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled Gradient Descent, Even with Overparameterization
Cong Ma, Xingyu Xu, Tian Tong, Yuejie Chi
09 Oct 2023

How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization
Nuoya Xiong, Lijun Ding, Simon S. Du
03 Oct 2023

Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization
Hancheng Min, Enrique Mallada, René Vidal
MLT
24 Jul 2023

Addressing caveats of neural persistence with deep graph persistence
Leander Girrbach, Anders Christensen, Ole Winther, Zeynep Akata, A. Sophia Koepke
GNN
20 Jul 2023

FedBug: A Bottom-Up Gradual Unfreezing Framework for Federated Learning
Chia-Hsiang Kao, Yu-Chiang Frank Wang
FedML
19 Jul 2023

Improving Convergence and Generalization Using Parameter Symmetries
Bo Zhao, Robert Mansel Gower, Robin Walters, Rose Yu
MoMe
22 May 2023

Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond
Itai Kreisler, Mor Shpigel Nacson, Daniel Soudry, Y. Carmon
22 May 2023

SFP: Spurious Feature-targeted Pruning for Out-of-Distribution Generalization
Yingchun Wang, Jingcai Guo, Yi Liu, Song Guo, Weizhan Zhang, Xiangyong Cao, Qinghua Zheng
AAML, OODD
19 May 2023

Convergence of Alternating Gradient Descent for Matrix Factorization
R. Ward, T. Kolda
11 May 2023

On the Stepwise Nature of Self-Supervised Learning
James B. Simon, Maksis Knutins, Liu Ziyin, Daniel Geisz, Abraham J. Fetterman, Joshua Albrecht
SSL
27 Mar 2023

Active Self-Supervised Learning: A Few Low-Cost Relationships Are All You Need
Vivien A. Cabannes, Léon Bottou, Yann LeCun, Randall Balestriero
27 Mar 2023

Critical Points and Convergence Analysis of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss
Pierre Bréchet, Katerina Papagiannouli, Jing An, Guido Montúfar
06 Mar 2023

Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Weihang Xu, S. Du
20 Feb 2023

The Geometry of Neural Nets' Parameter Spaces Under Reparametrization
Agustinus Kristiadi, Felix Dangel, Philipp Hennig
14 Feb 2023

How to prepare your task head for finetuning
Yi Ren, Shangmin Guo, Wonho Bae, Danica J. Sutherland
11 Feb 2023

On a continuous time model of gradient descent dynamics and instability in deep learning
Mihaela Rosca, Yan Wu, Chongli Qin, Benoit Dherin
03 Feb 2023

Effects of Data Geometry in Early Deep Learning
Saket Tiwari, George Konidaris
29 Dec 2022

Infinite-width limit of deep linear neural networks
Lénaïc Chizat, Maria Colombo, Xavier Fernández-Real, Alessio Figalli
29 Nov 2022

Mechanistic Mode Connectivity
Ekdeep Singh Lubana, Eric J. Bigelow, Robert P. Dick, David M. Krueger, Hidenori Tanaka
15 Nov 2022

Symmetries, flat minima, and the conserved quantities of gradient flow
Bo Zhao, I. Ganev, Robin Walters, Rose Yu, Nima Dehmamy
31 Oct 2022

Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu, Sang Michael Xie, Zhiyuan Li, Tengyu Ma
AI4CE
25 Oct 2022

Surgical Fine-Tuning Improves Adaptation to Distribution Shifts
Yoonho Lee, Annie S. Chen, Fahim Tajwar, Ananya Kumar, Huaxiu Yao, Percy Liang, Chelsea Finn
OOD
20 Oct 2022

Wasserstein Barycenter-based Model Fusion and Linear Mode Connectivity of Neural Networks
A. K. Akash, Sixu Li, Nicolas García Trillos
13 Oct 2022

Boosting Adversarial Robustness From The Perspective of Effective Margin Regularization
Ziquan Liu, Antoni B. Chan
AAML
11 Oct 2022

Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions
Arthur Jacot
29 Sep 2022

Magnitude and Angle Dynamics in Training Single ReLU Neurons
Sangmin Lee, Byeongsu Sim, Jong Chul Ye
MLT
27 Sep 2022

A Validation Approach to Over-parameterized Matrix and Image Recovery
Lijun Ding, Zhen Qin, Liwei Jiang, Jinxin Zhou, Zhihui Zhu
21 Sep 2022

Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher
15 Sep 2022

On the Implicit Bias in Deep-Learning Algorithms
Gal Vardi
FedML, AI4CE
26 Aug 2022

Implicit Regularization with Polynomial Growth in Deep Tensor Factorization
Kais Hariz, Hachem Kadri, Stéphane Ayache, Maher Moakher, Thierry Artières
18 Jul 2022

Blessing of Nonconvexity in Deep Linear Models: Depth Flattens the Optimization Landscape Around the True Solution
Jianhao Ma, S. Fattahi
15 Jul 2022

Symmetry Teleportation for Accelerated Optimization
B. Zhao, Nima Dehmamy, Robin Walters, Rose Yu
ODL
21 May 2022

Algorithmic Regularization in Model-free Overparametrized Asymmetric Matrix Factorization
Liwei Jiang, Yudong Chen, Lijun Ding
06 Mar 2022

Understanding Deep Contrastive Learning via Coordinate-wise Optimization
Yuandong Tian
29 Jan 2022

Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks
Noam Razin, Asaf Maman, Nadav Cohen
27 Jan 2022

Regularization by Misclassification in ReLU Neural Networks
Elisabetta Cornacchia, Jan Hązła, Ido Nachum, Amir Yehudayoff
NoLa
03 Nov 2021

Neural Networks as Kernel Learners: The Silent Alignment Effect
Alexander B. Atanasov, Blake Bordelon, Cengiz Pehlevan
MLT
29 Oct 2021

On the Regularization of Autoencoders
Harald Steck, Dario Garcia-Garcia
SSL, AI4CE
21 Oct 2021

Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect
Yuqing Wang, Minshuo Chen, T. Zhao, Molei Tao
AI4CE
07 Oct 2021

On Margin Maximization in Linear and ReLU Networks
Gal Vardi, Ohad Shamir, Nathan Srebro
06 Oct 2021

Nonconvex Factorization and Manifold Formulations are Almost Equivalent in Low-rank Matrix Optimization
Yuetian Luo, Xudong Li, Anru R. Zhang
03 Aug 2021

Convergence analysis for gradient flows in the training of artificial neural networks with ReLU activation
Arnulf Jentzen, Adrian Riekert
09 Jul 2021

A Mechanism for Producing Aligned Latent Spaces with Autoencoders
Saachi Jain, Adityanarayanan Radhakrishnan, Caroline Uhler
29 Jun 2021

Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization
Tian-Chun Ye, S. Du
27 Jun 2021

Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization
Tianyi Liu, Yan Li, S. Wei, Enlu Zhou, T. Zhao
24 Feb 2021