
Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape

20 January 2022 · Devansh Bisla, Jing Wang, A. Choromańska

Papers citing "Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape"

50 of 86 citing papers shown:
Enhancing Domain Adaptation through Prompt Gradient Alignment
  Hoang Phan, Lam C. Tran, Quyen Tran, Trung Le · 70 / 0 / 0 · 13 Jun 2024
ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks
  Jungmin Kwon, Jeongseop Kim, Hyunseong Park, I. Choi · 73 / 288 / 0 · 23 Feb 2021
SWAD: Domain Generalization by Seeking Flat Minima
  Junbum Cha, Sanghyuk Chun, Kyungjae Lee, Han-Cheol Cho, Seunghyun Park, Yunsung Lee, Sungrae Park · MoMe · 267 / 438 / 0 · 17 Feb 2021
Sharpness-Aware Minimization for Efficiently Improving Generalization
  Pierre Foret, Ariel Kleiner, H. Mobahi, Behnam Neyshabur · AAML · 157 / 1,314 / 0 · 03 Oct 2020
Stability of Stochastic Gradient Descent on Nonsmooth Convex Losses
  Raef Bassily, Vitaly Feldman, Cristóbal Guzmán, Kunal Talwar · MLT · 41 / 193 / 0 · 12 Jun 2020
Extrapolation for Large-batch Training in Deep Learning
  Tao R. Lin, Lingjing Kong, Sebastian U. Stich, Martin Jaggi · 54 / 36 / 0 · 10 Jun 2020
How neural networks find generalizable solutions: Self-tuned annealing in deep learning
  Yu Feng, Y. Tu · MLT · 14 / 9 / 0 · 06 Jan 2020
Deep Double Descent: Where Bigger Models and More Data Hurt
  Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, Ilya Sutskever · 97 / 925 / 0 · 04 Dec 2019
Fantastic Generalization Measures and Where to Find Them
  Yiding Jiang, Behnam Neyshabur, H. Mobahi, Dilip Krishnan, Samy Bengio · AI4CE · 56 / 599 / 0 · 04 Dec 2019
Pure and Spurious Critical Points: a Geometric Study of Linear Networks
  Matthew Trager, Kathlén Kohn, Joan Bruna · 27 / 27 / 0 · 03 Oct 2019
Gradient Noise Convolution (GNC): Smoothing Loss Function for Distributed Large-Batch SGD
  Kosuke Haruki, Taiji Suzuki, Yohei Hamakawa, Takeshi Toda, Ryuji Sakai, M. Ozawa, Mitsuhiro Kimura · ODL · 25 / 17 / 0 · 26 Jun 2019
Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks
  Boris Ginsburg, P. Castonguay, Oleksii Hrinchuk, Oleksii Kuchaiev, Vitaly Lavrukhin, Ryan Leary, Jason Chun Lok Li, Huyen Nguyen, Yang Zhang, Jonathan M. Cohen · ODL · 42 / 13 / 0 · 27 May 2019
Adaptive Gradient Methods with Dynamic Bound of Learning Rate
  Liangchen Luo, Yuanhao Xiong, Yan Liu, Xu Sun · ODL · 46 / 600 / 0 · 26 Feb 2019
An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
  Behrooz Ghorbani, Shankar Krishnan, Ying Xiao · ODL · 36 / 320 / 0 · 29 Jan 2019
A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks
  Umut Simsekli, Levent Sagun, Mert Gurbuzbalaban · 75 / 241 / 0 · 18 Jan 2019
Reconciling modern machine learning practice and the bias-variance trade-off
  M. Belkin, Daniel J. Hsu, Siyuan Ma, Soumik Mandal · 168 / 1,628 / 0 · 28 Dec 2018
A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
  Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu · 89 / 284 / 0 · 04 Oct 2018
DeepCloak: Adversarial Crafting As a Defensive Measure to Cloak Processes
  Mehmet Sinan Inci, T. Eisenbarth, B. Sunar · AAML · 37 / 8 / 0 · 03 Aug 2018
Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures
  Sergey Bartunov, Adam Santoro, Blake A. Richards, Luke Marris, Geoffrey E. Hinton, Timothy Lillicrap · 82 / 242 / 0 · 12 Jul 2018
Beyond Backprop: Online Alternating Minimization with Auxiliary Variables
  A. Choromańska, Benjamin Cowen, Yara Rizk, Ronny Luss, Mattia Rigotti, ..., Brian Kingsbury, Paolo Diachille, V. Gurev, Ravi Tejwani, Djallel Bouneffouf · 39 / 53 / 0 · 24 Jun 2018
Lifted Neural Networks
  Armin Askari, Geoffrey Negiar, Rajiv Sambharya, L. Ghaoui · 81 / 37 / 0 · 03 May 2018
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost
  Noam M. Shazeer, Mitchell Stern · ODL · 45 / 1,032 / 0 · 11 Apr 2018
A Proximal Block Coordinate Descent Algorithm for Deep Neural Network Training
  Tim Tsz-Kit Lau, Jinshan Zeng, Baoyuan Wu, Yuan Yao · ODL · 28 / 33 / 0 · 24 Mar 2018
Averaging Weights Leads to Wider Optima and Better Generalization
  Pavel Izmailov, Dmitrii Podoprikhin, T. Garipov, Dmitry Vetrov, A. Wilson · FedML, MoMe · 91 / 1,643 / 0 · 14 Mar 2018
On the Power of Over-parametrization in Neural Networks with Quadratic Activation
  S. Du, Jason D. Lee · 95 / 268 / 0 · 03 Mar 2018
Essentially No Barriers in Neural Network Energy Landscape
  Felix Dräxler, K. Veschgini, M. Salmhofer, Fred Hamprecht · MoMe · 97 / 430 / 0 · 02 Mar 2018
LSALSA: Accelerated Source Separation via Learned Sparse Coding
  Benjamin Cowen, Apoorva Nandini Saridena, A. Choromańska · 19 / 11 / 0 · 13 Feb 2018
Spurious Local Minima are Common in Two-Layer ReLU Neural Networks
  Itay Safran, Ohad Shamir · 112 / 263 / 0 · 24 Dec 2017
Mathematics of Deep Learning
  René Vidal, Joan Bruna, Raja Giryes, Stefano Soatto · OOD · 46 / 120 / 0 · 13 Dec 2017
Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks
  Ziming Zhang, M. Brand · 35 / 70 / 0 · 20 Nov 2017
Fisher-Rao Metric, Geometry, and Complexity of Neural Networks
  Tengyuan Liang, T. Poggio, Alexander Rakhlin, J. Stokes · 51 / 225 / 0 · 05 Nov 2017
SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data
  Alon Brutzkus, Amir Globerson, Eran Malach, Shai Shalev-Shwartz · MLT · 137 / 277 / 0 · 27 Oct 2017
Improved Regularization of Convolutional Neural Networks with Cutout
  Terrance Devries, Graham W. Taylor · 81 / 3,739 / 0 · 15 Aug 2017
Global optimality conditions for deep neural networks
  Chulhee Yun, S. Sra, Ali Jadbabaie · 134 / 118 / 0 · 08 Jul 2017
Exploring Generalization in Deep Learning
  Behnam Neyshabur, Srinadh Bhojanapalli, David A. McAllester, Nathan Srebro · FAtt · 132 / 1,245 / 0 · 27 Jun 2017
Towards Deep Learning Models Resistant to Adversarial Attacks
  Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, Adrian Vladu · SILM, OOD · 219 / 11,962 / 0 · 19 Jun 2017
Empirical Analysis of the Hessian of Over-Parametrized Neural Networks
  Levent Sagun, Utku Evci, V. U. Güney, Yann N. Dauphin, Léon Bottou · 39 / 415 / 0 · 14 Jun 2017
Attention Is All You Need
  Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin · 3DV · 430 / 129,831 / 0 · 12 Jun 2017
Recovery Guarantees for One-hidden-layer Neural Networks
  Kai Zhong, Zhao Song, Prateek Jain, Peter L. Bartlett, Inderjit S. Dhillon · MLT · 98 / 336 / 0 · 10 Jun 2017
Convergence Analysis of Two-layer Neural Networks with ReLU Activation
  Yuanzhi Li, Yang Yuan · MLT · 77 / 649 / 0 · 28 May 2017
The Marginal Value of Adaptive Gradient Methods in Machine Learning
  Ashia Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, Benjamin Recht · ODL · 48 / 1,023 / 0 · 23 May 2017
The loss surface of deep and wide neural networks
  Quynh N. Nguyen, Matthias Hein · ODL · 77 / 284 / 0 · 26 Apr 2017
No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis
  Rong Ge, Chi Jin, Yi Zheng · 111 / 435 / 0 · 03 Apr 2017
Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data
  Gintare Karolina Dziugaite, Daniel M. Roy · 77 / 808 / 0 · 31 Mar 2017
Sharp Minima Can Generalize For Deep Nets
  Laurent Dinh, Razvan Pascanu, Samy Bengio, Yoshua Bengio · ODL · 98 / 763 / 0 · 15 Mar 2017
An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis
  Yuandong Tian · MLT · 109 / 216 / 0 · 02 Mar 2017
Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs
  Alon Brutzkus, Amir Globerson · MLT · 89 / 313 / 0 · 26 Feb 2017
Training Deep Neural Networks via Optimization Over Graphs
  Guoqiang Zhang, W. Kleijn · GNN · 28 / 7 / 0 · 11 Feb 2017
Regularizing Neural Networks by Penalizing Confident Output Distributions
  Gabriel Pereyra, George Tucker, J. Chorowski, Lukasz Kaiser, Geoffrey E. Hinton · NoLa · 107 / 1,133 / 0 · 23 Jan 2017
Identity Matters in Deep Learning
  Moritz Hardt, Tengyu Ma · OOD · 53 / 398 / 0 · 14 Nov 2016