Inconsistency, Instability, and Generalization Gap of Deep Neural Network Training
arXiv: 2306.00169
31 May 2023
Rie Johnson, Tong Zhang
Papers citing
"Inconsistency, Instability, and Generalization Gap of Deep Neural Network Training"
5 of 5 citing papers shown.

| Title | Authors | Date |
|---|---|---|
| NeuralGrok: Accelerate Grokking by Neural Gradient Transformation | Xinyu Zhou, Simin Fan, Martin Jaggi, Jie Fu | 24 Apr 2025 |
| MLP-Mixer: An all-MLP Architecture for Vision | Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, ..., Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy | 04 May 2021 |
| GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding | Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman | 20 Apr 2018 |
| Large scale distributed neural network training through online distillation | Rohan Anil, Gabriel Pereyra, Alexandre Passos, Róbert Ormándi, George E. Dahl, Geoffrey E. Hinton | 09 Apr 2018 |
| On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang | 15 Sep 2016 |