1609.04836
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
15 September 2016
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
Papers citing
"On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima"
50 / 514 papers shown
Title
Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt
Sören Mindermann
J. Brauner
Muhammed Razzak
Mrinank Sharma
Andreas Kirsch
...
Benedikt Höltgen
Aidan N. Gomez
Adrien Morisot
Sebastian Farquhar
Y. Gal
60
148
0
14 Jun 2022
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu
Zhiyuan Li
Sanjeev Arora
FAtt
40
69
0
14 Jun 2022
Distributed Adversarial Training to Robustify Deep Neural Networks at Scale
Gaoyuan Zhang
Songtao Lu
Yihua Zhang
Xiangyi Chen
Pin-Yu Chen
Quanfu Fan
Lee Martie
L. Horesh
Min-Fong Hong
Sijia Liu
OOD
27
12
0
13 Jun 2022
Towards Understanding Sharpness-Aware Minimization
Maksym Andriushchenko
Nicolas Flammarion
AAML
32
133
0
13 Jun 2022
Trajectory-dependent Generalization Bounds for Deep Neural Networks via Fractional Brownian Motion
Chengli Tan
Jiang Zhang
Junmin Liu
35
1
0
09 Jun 2022
Generalized Federated Learning via Sharpness Aware Minimization
Zhe Qu
Xingyu Li
Rui Duan
Yaojiang Liu
Bo Tang
Zhuo Lu
FedML
31
131
0
06 Jun 2022
Two Decades of Bengali Handwritten Digit Recognition: A Survey
A. A. Ashikur Rahman
Md. Bakhtiar Hasan
Sabbir Ahmed
Tasnim Ahmed
Md. Hamjajul Ashmafee
Mohammad Ridwan Kabir
M. H. Kabir
30
24
0
05 Jun 2022
Embedding Principle in Depth for the Loss Landscape Analysis of Deep Neural Networks
Zhiwei Bai
Tao Luo
Z. Xu
Yaoyu Zhang
31
4
0
26 May 2022
Linear Connectivity Reveals Generalization Strategies
Jeevesh Juneja
Rachit Bansal
Kyunghyun Cho
João Sedoc
Naomi Saphra
242
45
0
24 May 2022
Use of Transformer-Based Models for Word-Level Transliteration of the Book of the Dean of Lismore
Edward Gow-Smith
Mark McConville
W. Gillies
Jade Scott
R. Maolalaigh
AI4CE
21
2
0
23 May 2022
Scalable algorithms for physics-informed neural and graph networks
K. Shukla
Mengjia Xu
N. Trask
George Karniadakis
PINN
AI4CE
72
40
0
16 May 2022
Investigating Generalization by Controlling Normalized Margin
Alexander R. Farhang
Jeremy Bernstein
Kushal Tirumala
Yang Liu
Yisong Yue
31
6
0
08 May 2022
Federated Geometric Monte Carlo Clustering to Counter Non-IID Datasets
Federico Lucchetti
Jérémie Decouchant
Maria Fernandes
L. Chen
Marcus Volp
FedML
17
0
0
23 Apr 2022
CowClip: Reducing CTR Prediction Model Training Time from 12 hours to 10 minutes on 1 GPU
Zangwei Zheng
Peng Xu
Xuan Zou
Da Tang
Zhen Li
...
Xiangzhuo Ding
Fuzhao Xue
Ziheng Qing
Youlong Cheng
Yang You
VLM
44
7
0
13 Apr 2022
The Group Loss++: A deeper look into group loss for deep metric learning
Ismail Elezi
Jenny Seidenschwarz
Laurin Wagner
Sebastiano Vascon
Alessandro Torcinovich
Marcello Pelillo
Laura Leal-Taixe
24
12
0
04 Apr 2022
Concept Evolution in Deep Learning Training: A Unified Interpretation Framework and Discoveries
Haekyu Park
Seongmin Lee
Benjamin Hoover
Austin P. Wright
Omar Shaikh
Rahul Duggal
Nilaksh Das
Kevin Li
Judy Hoffman
Duen Horng Chau
24
2
0
30 Mar 2022
A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved Neural Network Calibration
R. Hebbalaguppe
Jatin Prakash
Neelabh Madan
Chetan Arora
UQCV
25
42
0
25 Mar 2022
A Comparative Survey of Deep Active Learning
Xueying Zhan
Qingzhong Wang
Kuan-Hao Huang
Haoyi Xiong
Dejing Dou
Antoni B. Chan
FedML
HAI
24
105
0
25 Mar 2022
Small Batch Sizes Improve Training of Low-Resource Neural MT
Àlex R. Atrio
Andrei Popescu-Belis
32
6
0
20 Mar 2022
PACE: A Parallelizable Computation Encoder for Directed Acyclic Graphs
Zehao Dong
Muhan Zhang
Fuhai Li
Yixin Chen
CML
GNN
33
17
0
19 Mar 2022
Incremental Few-Shot Learning via Implanting and Compressing
Yiting Li
H. Zhu
Xijia Feng
Zilong Cheng
Jun Ma
Cheng Xiang
P. Vadakkepat
T. Lee
CLL
VLM
21
2
0
19 Mar 2022
Enhancing Adversarial Training with Second-Order Statistics of Weights
Gao Jin
Xinping Yi
Wei Huang
S. Schewe
Xiaowei Huang
AAML
26
47
0
11 Mar 2022
QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization
Xiuying Wei
Ruihao Gong
Yuhang Li
Xianglong Liu
F. Yu
MQ
VLM
19
166
0
11 Mar 2022
Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
Weixin Liang
Yuhui Zhang
Yongchan Kwon
Serena Yeung
James Zou
VLM
40
388
0
03 Mar 2022
Adversarial robustness of sparse local Lipschitz predictors
Ramchandran Muthukumar
Jeremias Sulam
AAML
32
13
0
26 Feb 2022
On PAC-Bayesian reconstruction guarantees for VAEs
Badr-Eddine Chérief-Abdellatif
Yuyang Shi
Arnaud Doucet
Benjamin Guedj
DRL
47
17
0
23 Feb 2022
Privacy Leakage of Adversarial Training Models in Federated Learning Systems
Jingyang Zhang
Yiran Chen
Hai Helen Li
FedML
PICV
29
15
0
21 Feb 2022
Survey on Large Scale Neural Network Training
Julia Gusak
Daria Cherniuk
Alena Shilova
A. Katrutsa
Daniel Bershatsky
...
Lionel Eyraud-Dubois
Oleg Shlyazhko
Denis Dimitrov
Ivan V. Oseledets
Olivier Beaumont
22
10
0
21 Feb 2022
Tackling benign nonconvexity with smoothing and stochastic gradients
Harsh Vardhan
Sebastian U. Stich
26
8
0
18 Feb 2022
How Do Vision Transformers Work?
Namuk Park
Songkuk Kim
ViT
44
465
0
14 Feb 2022
PFGE: Parsimonious Fast Geometric Ensembling of DNNs
Hao Guo
Jiyong Jin
B. Liu
FedML
29
1
0
14 Feb 2022
Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning
Yang Zhao
Hao Zhang
Xiuyuan Hu
32
116
0
08 Feb 2022
Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably
Tianyi Liu
Yan Li
Enlu Zhou
Tuo Zhao
38
1
0
07 Feb 2022
Anticorrelated Noise Injection for Improved Generalization
Antonio Orvieto
Hans Kersting
F. Proske
Francis R. Bach
Aurelien Lucchi
53
44
0
06 Feb 2022
Comparative assessment of federated and centralized machine learning
Ibrahim Abdul Majeed
Sagar Kaushik
Aniruddha Bardhan
Venkata Siva Kumar Tadi
Hwang-Ki Min
K. Kumaraguru
Rajasekhara Reddy Duvvuru Muni
FedML
22
6
0
03 Feb 2022
Improving Sample Efficiency of Value Based Models Using Attention and Vision Transformers
Amir Ardalan Kalantari
Mohammad Amini
Sarath Chandar
Doina Precup
49
4
0
01 Feb 2022
When Do Flat Minima Optimizers Work?
Jean Kaddour
Linqing Liu
Ricardo M. A. Silva
Matt J. Kusner
ODL
24
58
0
01 Feb 2022
Memory-Efficient Backpropagation through Large Linear Layers
Daniel Bershatsky
A. Mikhalev
A. Katrutsa
Julia Gusak
D. Merkulov
Ivan V. Oseledets
16
4
0
31 Jan 2022
On the Power-Law Hessian Spectrums in Deep Learning
Zeke Xie
Qian-Yuan Tang
Yunfeng Cai
Mingming Sun
P. Li
ODL
42
9
0
31 Jan 2022
Hyperparameter Optimization for COVID-19 Chest X-Ray Classification
I. Hamdi
Muhammad Ridzuan
Mohammad Yaqub
LM&MA
116
0
0
26 Jan 2022
Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
Devansh Bisla
Jing Wang
A. Choromańska
25
34
0
20 Jan 2022
Neighborhood Region Smoothing Regularization for Finding Flat Minima In Deep Neural Networks
Yang Zhao
Hao Zhang
22
1
0
16 Jan 2022
In Defense of the Unitary Scalarization for Deep Multi-Task Learning
Vitaly Kurin
Alessandro De Palma
Ilya Kostrikov
Shimon Whiteson
M. P. Kumar
39
73
0
11 Jan 2022
Class-Incremental Continual Learning into the eXtended DER-verse
Matteo Boschini
Lorenzo Bonicelli
Pietro Buzzega
Angelo Porrello
Simone Calderara
CLL
BDL
29
128
0
03 Jan 2022
Stochastic Weight Averaging Revisited
Hao Guo
Jiyong Jin
B. Liu
29
29
0
03 Jan 2022
Distributed Hybrid CPU and GPU training for Graph Neural Networks on Billion-Scale Graphs
Da Zheng
Xiang Song
Chengrun Yang
Dominique LaSalle
George Karypis
3DH
GNN
32
56
0
31 Dec 2021
DRF Codes: Deep SNR-Robust Feedback Codes
Mahdi Boloursaz Mashhadi
Deniz Gunduz
A. Perotti
B. Popović
19
10
0
22 Dec 2021
HarmoFL: Harmonizing Local and Global Drifts in Federated Learning on Heterogeneous Medical Images
Meirui Jiang
Zirui Wang
Qi Dou
FedML
30
123
0
20 Dec 2021
Sharpness-Aware Minimization with Dynamic Reweighting
Wenxuan Zhou
Fangyu Liu
Huan Zhang
Muhao Chen
AAML
19
8
0
16 Dec 2021
Visualizing the Loss Landscape of Winning Lottery Tickets
Robert Bain
UQCV
27
3
0
16 Dec 2021