1901.01672
Generalization in Deep Networks: The Role of Distance from Initialization
7 January 2019
Vaishnavh Nagarajan
J. Zico Kolter
ODL
Papers citing
"Generalization in Deep Networks: The Role of Distance from Initialization"
50 / 75 papers shown
Just One Layer Norm Guarantees Stable Extrapolation
Juliusz Ziomek
George Whittle
Michael A. Osborne
9
0
0
20 May 2025
A Near Complete Nonasymptotic Generalization Theory For Multilayer Neural Networks: Beyond the Bias-Variance Tradeoff
Hao Yu
Xiangyang Ji
AI4CE
60
0
0
03 Mar 2025
Is network fragmentation a useful complexity measure?
Coenraad Mouton
Randle Rabe
Daniël G. Haasbroek
Marthinus W. Theunissen
Hermanus L. Potgieter
Marelie Hattingh Davel
209
0
0
07 Nov 2024
What Does Softmax Probability Tell Us about Classifiers Ranking Across Diverse Test Conditions?
Weijie Tu
Weijian Deng
Liang Zheng
Tom Gedeon
44
0
0
14 Jun 2024
How many samples are needed to train a deep neural network?
Pegah Golestaneh
Mahsa Taheri
Johannes Lederer
34
4
0
26 May 2024
Generalization Measures for Zero-Shot Cross-Lingual Transfer
Saksham Bassi
Duygu Ataman
Kyunghyun Cho
29
0
0
24 Apr 2024
LNPT: Label-free Network Pruning and Training
Jinying Xiao
Ping Li
Zhe Tang
Jie Nie
38
2
0
19 Mar 2024
On the Diminishing Returns of Width for Continual Learning
E. Guha
V. Lakshman
CLL
39
4
0
11 Mar 2024
Stable Distillation: Regularizing Continued Pre-training for Low-Resource Automatic Speech Recognition
Ashish Seth
Sreyan Ghosh
S. Umesh
Dinesh Manocha
CLL
40
1
0
20 Dec 2023
The Pursuit of Human Labeling: A New Perspective on Unsupervised Learning
Artyom Gadetsky
Maria Brbić
22
6
0
06 Nov 2023
Understanding prompt engineering may not require rethinking generalization
Victor Akinwande
Yiding Jiang
Dylan Sam
J. Zico Kolter
VLM
VPVLM
123
7
0
06 Oct 2023
Fantastic Generalization Measures are Nowhere to be Found
Michael C. Gastpar
Ido Nachum
Jonathan Shafer
T. Weinberger
34
13
0
24 Sep 2023
Zero-Shot Neural Architecture Search: Challenges, Solutions, and Opportunities
Guihong Li
Duc-Tuong Hoang
Kartikeya Bhardwaj
Ming Lin
Zhangyang Wang
R. Marculescu
46
11
0
05 Jul 2023
Cramming: Training a Language Model on a Single GPU in One Day
Jonas Geiping
Tom Goldstein
MoE
32
86
0
28 Dec 2022
Task Discovery: Finding the Tasks that Neural Networks Generalize on
Andrei Atanov
Andrei Filatov
Teresa Yeo
Ajay Sohmshetty
Amir Zamir
OOD
50
10
0
01 Dec 2022
Accelerated Riemannian Optimization: Handling Constraints with a Prox to Bound Geometric Penalties
David Martínez-Rubio
Sebastian Pokutta
20
9
0
26 Nov 2022
Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States
Ziqiao Wang
Yongyi Mao
30
10
0
19 Nov 2022
On the Sample Complexity of Two-Layer Networks: Lipschitz vs. Element-Wise Lipschitz Activation
Amit Daniely
Elad Granot
MLT
32
1
0
17 Nov 2022
Self-Distillation for Further Pre-training of Transformers
Seanie Lee
Minki Kang
Juho Lee
Sung Ju Hwang
Kenji Kawaguchi
49
8
0
30 Sep 2022
Why neural networks find simple solutions: the many regularizers of geometric complexity
Benoit Dherin
Michael Munn
M. Rosca
David Barrett
57
31
0
27 Sep 2022
Intersection of Parallels as an Early Stopping Criterion
Ali Vardasbi
Maarten de Rijke
Mostafa Dehghani
MoMe
41
5
0
19 Aug 2022
Predicting Out-of-Domain Generalization with Neighborhood Invariance
Nathan Ng
Neha Hulkund
Kyunghyun Cho
Marzyeh Ghassemi
OOD
27
4
0
05 Jul 2022
Sparse Double Descent: Where Network Pruning Aggravates Overfitting
Zhengqi He
Zeke Xie
Quanzhi Zhu
Zengchang Qin
83
27
0
17 Jun 2022
Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees
Haotian Ju
Dongyue Li
Hongyang R. Zhang
45
28
0
06 Jun 2022
Generalization Bounds for Gradient Methods via Discrete and Continuous Prior
Jun Yu Li
Xu Luo
Jian Li
27
4
0
27 May 2022
Analyzing Lottery Ticket Hypothesis from PAC-Bayesian Theory Perspective
Keitaro Sakamoto
Issei Sato
41
9
0
15 May 2022
Investigating Generalization by Controlling Normalized Margin
Alexander R. Farhang
Jeremy Bernstein
Kushal Tirumala
Yang Liu
Yisong Yue
33
6
0
08 May 2022
Predicting the generalization gap in neural networks using topological data analysis
Rubén Ballester
Xavier Arnal Clemente
Carles Casacuberta
Meysam Madadi
C. Corneanu
Sergio Escalera
41
3
0
23 Mar 2022
The activity-weight duality in feed forward neural networks: The geometric determinants of generalization
Yu Feng
Yuhai Tu
MLT
85
14
0
21 Mar 2022
On Measuring Excess Capacity in Neural Networks
Florian Graf
Sebastian Zeng
Bastian Alexander Rieck
Marc Niethammer
Roland Kwitt
27
10
0
16 Feb 2022
Weight Expansion: A New Perspective on Dropout and Generalization
Gao Jin
Xinping Yi
Pengfei Yang
Lijun Zhang
S. Schewe
Xiaowei Huang
29
5
0
23 Jan 2022
In Search of Probeable Generalization Measures
Jonathan Jaegerman
Khalil Damouni
M. M. Ankaralı
Konstantinos N. Plataniotis
27
2
0
23 Oct 2021
Cascaded Compressed Sensing Networks: A Reversible Architecture for Layerwise Learning
Weizhi Lu
Mingrui Chen
Kai Guo
Weiyu Li
21
0
0
20 Oct 2021
Exploring the Common Principal Subspace of Deep Features in Neural Networks
Haoran Liu
Haoyi Xiong
Yaqing Wang
Haozhe An
Dongrui Wu
Dejing Dou
13
1
0
06 Oct 2021
AdjointNet: Constraining machine learning models with physics-based codes
S. Karra
B. Ahmmed
M. Mudunuru
AI4CE
PINN
OOD
24
4
0
08 Sep 2021
Training Algorithm Matters for the Performance of Neural Network Potential: A Case Study of Adam and the Kalman Filter Optimizers
Yunqi Shao
Florian M. Dietrich
Carl Nettelblad
Chao Zhang
14
10
0
08 Sep 2021
Learning an Explicit Hyperparameter Prediction Function Conditioned on Tasks
Jun Shu
Deyu Meng
Zongben Xu
37
6
0
06 Jul 2021
A Theoretical Analysis of Fine-tuning with Linear Teachers
Gal Shachaf
Alon Brutzkus
Amir Globerson
34
17
0
04 Jul 2021
Assessing Generalization of SGD via Disagreement
Yiding Jiang
Vaishnavh Nagarajan
Christina Baek
J. Zico Kolter
67
109
0
25 Jun 2021
Practical Assessment of Generalization Performance Robustness for Deep Networks via Contrastive Examples
Xuanyu Wu
Xuhong Li
Haoyi Xiong
Xiao Zhang
Siyu Huang
Dejing Dou
21
1
0
20 Jun 2021
Measuring Generalization with Optimal Transport
Ching-Yao Chuang
Youssef Mroueh
Kristjan Greenewald
Antonio Torralba
Stefanie Jegelka
OT
46
25
0
07 Jun 2021
Robustness to Pruning Predicts Generalization in Deep Neural Networks
Lorenz Kuhn
Clare Lyle
Aidan Gomez
Jonas Rothfuss
Y. Gal
43
14
0
10 Mar 2021
Cockpit: A Practical Debugging Tool for the Training of Deep Neural Networks
Frank Schneider
Felix Dangel
Philipp Hennig
45
10
0
12 Feb 2021
Physics-informed neural networks with hard constraints for inverse design
Lu Lu
R. Pestourie
Wenjie Yao
Zhicheng Wang
F. Verdugo
Steven G. Johnson
PINN
50
495
0
09 Feb 2021
Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent
Kangqiao Liu
Liu Ziyin
Masahito Ueda
MLT
61
37
0
07 Dec 2020
Global Riemannian Acceleration in Hyperbolic and Spherical Spaces
David Martínez-Rubio
40
19
0
07 Dec 2020
Understanding the Failure Modes of Out-of-Distribution Generalization
Vaishnavh Nagarajan
Anders Andreassen
Behnam Neyshabur
OOD
OODD
19
176
0
29 Oct 2020
How does Weight Correlation Affect the Generalisation Ability of Deep Neural Networks
Gao Jin
Xinping Yi
Liang Zhang
Lijun Zhang
S. Schewe
Xiaowei Huang
11
40
0
12 Oct 2020
Understanding the Role of Adversarial Regularization in Supervised Learning
Litu Rout
18
3
0
01 Oct 2020
Why Adversarial Interaction Creates Non-Homogeneous Patterns: A Pseudo-Reaction-Diffusion Model for Turing Instability
Litu Rout
AAML
11
1
0
01 Oct 2020