ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1609.04836
  4. Cited By
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp
  Minima

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima

15 September 2016
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
    ODL
ArXivPDFHTML

Papers citing "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima"

50 / 514 papers shown
Title
NG+ : A Multi-Step Matrix-Product Natural Gradient Method for Deep
  Learning
NG+ : A Multi-Step Matrix-Product Natural Gradient Method for Deep Learning
Minghan Yang
Dong Xu
Qiwen Cui
Zaiwen Wen
Pengxiang Xu
13
4
0
14 Jun 2021
RDA: Robust Domain Adaptation via Fourier Adversarial Attacking
RDA: Robust Domain Adaptation via Fourier Adversarial Attacking
Jiaxing Huang
Dayan Guan
Aoran Xiao
Shijian Lu
AAML
37
76
0
05 Jun 2021
Post-mortem on a deep learning contest: a Simpson's paradox and the
  complementary roles of scale metrics versus shape metrics
Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics
Charles H. Martin
Michael W. Mahoney
18
19
0
01 Jun 2021
Concurrent Adversarial Learning for Large-Batch Training
Concurrent Adversarial Learning for Large-Batch Training
Yong Liu
Xiangning Chen
Minhao Cheng
Cho-Jui Hsieh
Yang You
ODL
28
13
0
01 Jun 2021
Variational Quantum Classifiers Through the Lens of the Hessian
Variational Quantum Classifiers Through the Lens of the Hessian
Pinaki Sen
Amandeep Singh Bhatia
A. Bhatia
Ahmed Elbeltagi
15
24
0
21 May 2021
ResMLP: Feedforward networks for image classification with
  data-efficient training
ResMLP: Feedforward networks for image classification with data-efficient training
Hugo Touvron
Piotr Bojanowski
Mathilde Caron
Matthieu Cord
Alaaeldin El-Nouby
...
Gautier Izacard
Armand Joulin
Gabriel Synnaeve
Jakob Verbeek
Hervé Jégou
VLM
21
656
0
07 May 2021
Poisoning the Unlabeled Dataset of Semi-Supervised Learning
Poisoning the Unlabeled Dataset of Semi-Supervised Learning
Nicholas Carlini
AAML
149
68
0
04 May 2021
InfoNEAT: Information Theory-based NeuroEvolution of Augmenting
  Topologies for Side-channel Analysis
InfoNEAT: Information Theory-based NeuroEvolution of Augmenting Topologies for Side-channel Analysis
R. Acharya
F. Ganji
Domenic Forte
AAML
38
24
0
30 Apr 2021
Relating Adversarially Robust Generalization to Flat Minima
Relating Adversarially Robust Generalization to Flat Minima
David Stutz
Matthias Hein
Bernt Schiele
OOD
32
65
0
09 Apr 2021
Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to
  Improve Generalization
Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization
Zeke Xie
Li-xin Yuan
Zhanxing Zhu
Masashi Sugiyama
21
29
0
31 Mar 2021
Efficient Deep Learning Pipelines for Accurate Cost Estimations Over
  Large Scale Query Workload
Efficient Deep Learning Pipelines for Accurate Cost Estimations Over Large Scale Query Workload
Johan Kok
Zhi Kang
S. Tan
Feng Cheng
Shixuan Sun
Bingsheng He
24
26
0
23 Mar 2021
Student Network Learning via Evolutionary Knowledge Distillation
Student Network Learning via Evolutionary Knowledge Distillation
Kangkai Zhang
Chunhui Zhang
Shikun Li
Dan Zeng
Shiming Ge
22
83
0
23 Mar 2021
Interpretable Machine Learning: Fundamental Principles and 10 Grand
  Challenges
Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges
Cynthia Rudin
Chaofan Chen
Zhi Chen
Haiyang Huang
Lesia Semenova
Chudi Zhong
FaML
AI4CE
LRM
59
653
0
20 Mar 2021
Large Batch Simulation for Deep Reinforcement Learning
Large Batch Simulation for Deep Reinforcement Learning
Brennan Shacklett
Erik Wijmans
Aleksei Petrenko
Manolis Savva
Dhruv Batra
V. Koltun
Kayvon Fatahalian
3DV
OffRL
AI4CE
27
26
0
12 Mar 2021
Intraclass clustering: an implicit learning ability that regularizes
  DNNs
Intraclass clustering: an implicit learning ability that regularizes DNNs
Simon Carbonnelle
Christophe De Vleeschouwer
57
8
0
11 Mar 2021
Siamese Labels Auxiliary Learning
Siamese Labels Auxiliary Learning
Wenrui Gan
Zhulin Liu
Cheng Chen
Tong Zhang
17
1
0
27 Feb 2021
On the Validity of Modeling SGD with Stochastic Differential Equations
  (SDEs)
On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs)
Zhiyuan Li
Sadhika Malladi
Sanjeev Arora
41
78
0
24 Feb 2021
Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix
  Factorization
Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization
Tianyi Liu
Yan Li
S. Wei
Enlu Zhou
T. Zhao
21
13
0
24 Feb 2021
The Promises and Pitfalls of Deep Kernel Learning
The Promises and Pitfalls of Deep Kernel Learning
Sebastian W. Ober
C. Rasmussen
Mark van der Wilk
UQCV
BDL
21
107
0
24 Feb 2021
ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning
  of Deep Neural Networks
ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks
Jungmin Kwon
Jeongseop Kim
Hyunseong Park
I. Choi
33
281
0
23 Feb 2021
Formal Language Theory Meets Modern NLP
Formal Language Theory Meets Modern NLP
William Merrill
AI4CE
NAI
16
12
0
19 Feb 2021
Consistent Lock-free Parallel Stochastic Gradient Descent for Fast and
  Stable Convergence
Consistent Lock-free Parallel Stochastic Gradient Descent for Fast and Stable Convergence
Karl Bäckström
Ivan Walulya
Marina Papatriantafilou
P. Tsigas
23
5
0
17 Feb 2021
Consensus Control for Decentralized Deep Learning
Consensus Control for Decentralized Deep Learning
Lingjing Kong
Tao R. Lin
Anastasia Koloskova
Martin Jaggi
Sebastian U. Stich
19
75
0
09 Feb 2021
Adversarial Training Makes Weight Loss Landscape Sharper in Logistic
  Regression
Adversarial Training Makes Weight Loss Landscape Sharper in Logistic Regression
Masanori Yamada
Sekitoshi Kanai
Tomoharu Iwata
Tomokatsu Takahashi
Yuki Yamanaka
Hiroshi Takahashi
Atsutoshi Kumagai
AAML
16
9
0
05 Feb 2021
Predicting the Mechanical Properties of Biopolymer Gels Using Neural
  Networks Trained on Discrete Fiber Network Data
Predicting the Mechanical Properties of Biopolymer Gels Using Neural Networks Trained on Discrete Fiber Network Data
Yue Leng
Vahidullah Tac
S. Calve
A. B. Tepole
20
32
0
23 Jan 2021
Optimizing Deeper Transformers on Small Datasets
Optimizing Deeper Transformers on Small Datasets
Peng-Tao Xu
Dhruv Kumar
Wei Yang
Wenjie Zi
Keyi Tang
Chenyang Huang
Jackie C.K. Cheung
S. Prince
Yanshuai Cao
AI4CE
24
68
0
30 Dec 2020
Combating Mode Collapse in GAN training: An Empirical Analysis using
  Hessian Eigenvalues
Combating Mode Collapse in GAN training: An Empirical Analysis using Hessian Eigenvalues
Ricard Durall
Avraam Chatzimichailidis
P. Labus
J. Keuper
GAN
25
57
0
17 Dec 2020
DeepLesionBrain: Towards a broader deep-learning generalization for
  multiple sclerosis lesion segmentation
DeepLesionBrain: Towards a broader deep-learning generalization for multiple sclerosis lesion segmentation
R. A. Kamraoui
Vinh-Thong Ta
T. Tourdias
Boris Mansencal
J. V. Manjón
Pierrick Coupé
OOD
31
50
0
14 Dec 2020
A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and
  its Applications to Regularization
A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization
Adepu Ravi Sankar
Yash Khasbage
Rahul Vigneswaran
V. Balasubramanian
25
42
0
07 Dec 2020
Noise and Fluctuation of Finite Learning Rate Stochastic Gradient
  Descent
Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent
Kangqiao Liu
Liu Ziyin
Masakuni Ueda
MLT
61
37
0
07 Dec 2020
Parallel Blockwise Knowledge Distillation for Deep Neural Network
  Compression
Parallel Blockwise Knowledge Distillation for Deep Neural Network Compression
Cody Blakeney
Xiaomin Li
Yan Yan
Ziliang Zong
46
39
0
05 Dec 2020
Dynamic Curriculum Learning for Low-Resource Neural Machine Translation
Dynamic Curriculum Learning for Low-Resource Neural Machine Translation
Chen Xu
Bojie Hu
Yufan Jiang
Kai Feng
Zeyang Wang
Shen Huang
Qi Ju
Tong Xiao
Jingbo Zhu
15
22
0
30 Nov 2020
EvoPose2D: Pushing the Boundaries of 2D Human Pose Estimation using
  Accelerated Neuroevolution with Weight Transfer
EvoPose2D: Pushing the Boundaries of 2D Human Pose Estimation using Accelerated Neuroevolution with Weight Transfer
William J. McNally
Kanav Vats
Alexander Wong
J. McPhee
3DH
30
16
0
17 Nov 2020
A Random Matrix Theory Approach to Damping in Deep Learning
A Random Matrix Theory Approach to Damping in Deep Learning
Diego Granziol
Nicholas P. Baskerville
AI4CE
ODL
29
2
0
15 Nov 2020
Artificial Neural Variability for Deep Learning: On Overfitting, Noise
  Memorization, and Catastrophic Forgetting
Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting
Zeke Xie
Fengxiang He
Shaopeng Fu
Issei Sato
Dacheng Tao
Masashi Sugiyama
21
59
0
12 Nov 2020
Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign
  Dropout
Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout
Zhao Chen
Jiquan Ngiam
Yanping Huang
Thang Luong
Henrik Kretzschmar
Yuning Chai
Dragomir Anguelov
41
206
0
14 Oct 2020
Regularizing Neural Networks via Adversarial Model Perturbation
Regularizing Neural Networks via Adversarial Model Perturbation
Yaowei Zheng
Richong Zhang
Yongyi Mao
AAML
30
95
0
10 Oct 2020
Towards a Scalable and Distributed Infrastructure for Deep Learning
  Applications
Towards a Scalable and Distributed Infrastructure for Deep Learning Applications
Bita Hasheminezhad
S. Shirzad
Nanmiao Wu
Patrick Diehl
Hannes Schulz
Hartmut Kaiser
GNN
AI4CE
27
4
0
06 Oct 2020
Regularizing Dialogue Generation by Imitating Implicit Scenarios
Regularizing Dialogue Generation by Imitating Implicit Scenarios
Shaoxiong Feng
Xuancheng Ren
Hongshen Chen
Bin Sun
Kan Li
Xu Sun
18
20
0
05 Oct 2020
Sharpness-Aware Minimization for Efficiently Improving Generalization
Sharpness-Aware Minimization for Efficiently Improving Generalization
Pierre Foret
Ariel Kleiner
H. Mobahi
Behnam Neyshabur
AAML
101
1,278
0
03 Oct 2020
Effective Regularization Through Loss-Function Metalearning
Effective Regularization Through Loss-Function Metalearning
Santiago Gonzalez
Risto Miikkulainen
24
5
0
02 Oct 2020
Improved generalization by noise enhancement
Improved generalization by noise enhancement
Takashi Mori
Masahito Ueda
18
3
0
28 Sep 2020
Towards a Mathematical Understanding of Neural Network-Based Machine
  Learning: what we know and what we don't
Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't
E. Weinan
Chao Ma
Stephan Wojtowytsch
Lei Wu
AI4CE
22
133
0
22 Sep 2020
VirtualFlow: Decoupling Deep Learning Models from the Underlying
  Hardware
VirtualFlow: Decoupling Deep Learning Models from the Underlying Hardware
Andrew Or
Haoyu Zhang
M. Freedman
12
9
0
20 Sep 2020
Review: Deep Learning in Electron Microscopy
Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede
34
79
0
17 Sep 2020
Analysis of Generalizability of Deep Neural Networks Based on the
  Complexity of Decision Boundary
Analysis of Generalizability of Deep Neural Networks Based on the Complexity of Decision Boundary
Shuyue Guan
Murray H. Loew
22
25
0
16 Sep 2020
Self-Adaptive Physics-Informed Neural Networks using a Soft Attention
  Mechanism
Self-Adaptive Physics-Informed Neural Networks using a Soft Attention Mechanism
L. McClenny
U. Braga-Neto
PINN
28
443
0
07 Sep 2020
Predicting Training Time Without Training
Predicting Training Time Without Training
L. Zancato
Alessandro Achille
Avinash Ravichandran
Rahul Bhotika
Stefano Soatto
20
24
0
28 Aug 2020
Optimizing Information Loss Towards Robust Neural Networks
Optimizing Information Loss Towards Robust Neural Networks
Philip Sperl
Konstantin Böttinger
AAML
13
3
0
07 Aug 2020
Communication-Efficient and Distributed Learning Over Wireless Networks:
  Principles and Applications
Communication-Efficient and Distributed Learning Over Wireless Networks: Principles and Applications
Jihong Park
S. Samarakoon
Anis Elgabli
Joongheon Kim
M. Bennis
Seong-Lyun Kim
Mérouane Debbah
34
161
0
06 Aug 2020
Previous
123...10116789
Next