Stiffness: A New Perspective on Generalization in Neural Networks

28 January 2019

Stanislav Fort

Pawel Krzysztof Nowak

Stanislaw Jastrzebski

S. Narayanan

Papers citing "Stiffness: A New Perspective on Generalization in Neural Networks"

Title
Importance Analysis for Dynamic Control of Balancing Parameter in a Simple Knowledge Distillation Setting Seongmin Kim Kwanho Kim Minseung Kim Kanghyun Jo 26 0 0 06 May 2025
Directions of Curvature as an Explanation for Loss of Plasticity Alex Lewandowski Haruto Tanaka Dale Schuurmans Marlos C. Machado 24 5 0 30 Nov 2023
Understanding plasticity in neural networks Clare Lyle Zeyu Zheng Evgenii Nikishin Bernardo Avila-Pires Razvan Pascanu Will Dabney AI4CE 40 98 0 02 Mar 2023
Understanding the Spectral Bias of Coordinate Based MLPs Via Training Dynamics J. Lazzari Xiuwen Liu 26 3 0 14 Jan 2023
Discovering and Explaining the Representation Bottleneck of Graph Neural Networks from Multi-order Interactions Fang Wu Siyuan Li Lirong Wu Dragomir R. Radev Stan Z. Li 27 2 0 15 May 2022
Discovering and Explaining the Representation Bottleneck of DNNs Huiqi Deng Qihan Ren Hao Zhang Quanshi Zhang 39 59 0 11 Nov 2021
Visualizing the Emergence of Intermediate Visual Patterns in DNNs Mingjie Li Shaobo Wang Quanshi Zhang 44 11 0 05 Nov 2021
Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization Alexandre Ramé Corentin Dancette Matthieu Cord OOD 47 205 0 07 Sep 2021
Implicit Gradient Alignment in Distributed and Federated Learning Yatin Dandi Luis Barba Martin Jaggi FedML 26 31 0 25 Jun 2021
Intraclass clustering: an implicit learning ability that regularizes DNNs Simon Carbonnelle Christophe De Vleeschouwer 60 8 0 11 Mar 2021
A Random Matrix Theory Approach to Damping in Deep Learning Diego Granziol Nicholas P. Baskerville AI4CE ODL 29 2 0 15 Nov 2020
A Bayesian Perspective on Training Speed and Model Selection Clare Lyle Lisa Schut Binxin Ru Y. Gal Mark van der Wilk 44 24 0 27 Oct 2020
On Robustness and Bias Analysis of BERT-based Relation Extraction Luoqiu Li Xiang Chen Hongbin Ye Zhen Bi Shumin Deng Ningyu Zhang Huajun Chen 32 18 0 14 Sep 2020
Learning explanations that are hard to vary Giambattista Parascandolo Alexander Neitz Antonio Orvieto Luigi Gresele Bernhard Schölkopf FAtt 27 178 0 01 Sep 2020
Interpreting and Disentangling Feature Components of Various Complexity from DNNs Jie Ren Mingjie Li Zexu Liu Quanshi Zhang CoGe 19 18 0 29 Jun 2020
Speedy Performance Estimation for Neural Architecture Search Binxin Ru Clare Lyle Lisa Schut M. Fil Mark van der Wilk Y. Gal 18 36 0 08 Jun 2020
Coherent Gradients: An Approach to Understanding Generalization in Gradient Descent-based Optimization S. Chatterjee ODL OOD 11 49 0 25 Feb 2020
The Break-Even Point on Optimization Trajectories of Deep Neural Networks Stanislaw Jastrzebski Maciej Szymczak Stanislav Fort Devansh Arpit Jacek Tabor Kyunghyun Cho Krzysztof J. Geras 50 155 0 21 Feb 2020
Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study Jinlan Fu Pengfei Liu Qi Zhang Xuanjing Huang AI4CE 33 73 0 12 Jan 2020
The Local Elasticity of Neural Networks Hangfeng He Weijie J. Su 40 44 0 15 Oct 2019