Are All Layers Created Equal? (arXiv:1902.01996)
Chiyuan Zhang, Samy Bengio, Y. Singer
6 February 2019
Papers citing "Are All Layers Created Equal?" (34 of 34 papers shown)
Adapting Newton's Method to Neural Networks through a Summary of Higher-Order Derivatives
Pierre Wolinski · ODL · 06 Dec 2023

Reset It and Forget It: Relearning Last-Layer Weights Improves Continual and Transfer Learning
Lapo Frati, Neil Traft, Jeff Clune, Nick Cheney · CLL · 12 Oct 2023

Iterative Magnitude Pruning as a Renormalisation Group: A Study in The Context of The Lottery Ticket Hypothesis
Abu-Al Hassan · 06 Aug 2023

Layer-wise Linear Mode Connectivity
Linara Adilova, Maksym Andriushchenko, Michael Kamp, Asja Fischer, Martin Jaggi · FedML, FAtt, MoMe · 13 Jul 2023

No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Jean Kaddour, Oscar Key, Piotr Nawrot, Pasquale Minervini, Matt J. Kusner · 12 Jul 2023

On the Lipschitz Constant of Deep Networks and Double Descent
Matteo Gamba, Hossein Azizpour, Marten Bjorkman · 28 Jan 2023

Leveraging Unlabeled Data to Track Memorization
Mahsa Forouzesh, Hanie Sedghi, Patrick Thiran · NoLa, TDI · 08 Dec 2022

Learning to Generate Image Embeddings with User-level Differential Privacy
Zheng Xu, Maxwell D. Collins, Yuxiao Wang, Liviu Panait, Sewoong Oh, S. Augenstein, Ting Liu, Florian Schroff, H. B. McMahan · FedML · 20 Nov 2022

A Law of Data Separation in Deep Learning
Hangfeng He, Weijie J. Su · OOD · 31 Oct 2022

Surgical Fine-Tuning Improves Adaptation to Distribution Shifts
Yoonho Lee, Annie S. Chen, Fahim Tajwar, Ananya Kumar, Huaxiu Yao, Percy Liang, Chelsea Finn · OOD · 20 Oct 2022

Git Re-Basin: Merging Models modulo Permutation Symmetries
Samuel K. Ainsworth, J. Hayase, S. Srinivasa · MoMe · 11 Sep 2022

A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning
Da-Wei Zhou, Qiwen Wang, Han-Jia Ye, De-Chuan Zhan · 26 May 2022

The Primacy Bias in Deep Reinforcement Learning
Evgenii Nikishin, Max Schwarzer, P. D'Oro, Pierre-Luc Bacon, Aaron C. Courville · OnRL · 16 May 2022

Token Dropping for Efficient BERT Pretraining
Le Hou, Richard Yuanzhe Pang, Tianyi Zhou, Yuexin Wu, Xinying Song, Xiaodan Song, Denny Zhou · 24 Mar 2022

What Makes Transfer Learning Work For Medical Images: Feature Reuse & Other Factors
Christos Matsoukas, Johan Fredin Haslum, Moein Sorkhei, Magnus P Soderberg, Kevin Smith · VLM, OOD, MedIm · 02 Mar 2022

DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization
Aviral Kumar, Rishabh Agarwal, Tengyu Ma, Aaron Courville, George Tucker, Sergey Levine · OffRL · 09 Dec 2021

Compare Where It Matters: Using Layer-Wise Regularization To Improve Federated Learning on Heterogeneous Data
Ha Min Son, M. Kim, T. Chung · FedML · 01 Dec 2021

Visualizing the Emergence of Intermediate Visual Patterns in DNNs
Mingjie Li, Shaobo Wang, Quanshi Zhang · 05 Nov 2021

Partial Variable Training for Efficient On-Device Federated Learning
Tien-Ju Yang, Dhruv Guliani, F. Beaufays, Giovanni Motta · FedML · 11 Oct 2021

Enabling On-Device Training of Speech Recognition Models with Federated Dropout
Dhruv Guliani, Lillian Zhou, Changwan Ryu, Tien-Ju Yang, Harry Zhang, Yong Xiao, F. Beaufays, Giovanni Motta · FedML · 07 Oct 2021

Efficient and Private Federated Learning with Partially Trainable Networks
Hakim Sidahmed, Zheng Xu, Ankush Garg, Yuan Cao, Mingqing Chen · FedML · 06 Oct 2021

What can linear interpolation of neural network loss landscapes tell us?
Tiffany J. Vlaar, Jonathan Frankle · MoMe · 30 Jun 2021

Randomness In Neural Network Training: Characterizing The Impact of Tooling
Donglin Zhuang, Xingyao Zhang, S. Song, Sara Hooker · 22 Jun 2021

MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, ..., Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy · 04 May 2021

Experiments with Rich Regime Training for Deep Learning
Xinyan Li, A. Banerjee · 26 Feb 2021

Reservoir Transformers
Sheng Shen, Alexei Baevski, Ari S. Morcos, Kurt Keutzer, Michael Auli, Douwe Kiela · 30 Dec 2020

FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training
Y. Fu, Haoran You, Yang Katie Zhao, Yue Wang, Chaojian Li, K. Gopalakrishnan, Zhangyang Wang, Yingyan Lin · MQ · 24 Dec 2020

A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization
Adepu Ravi Sankar, Yash Khasbage, Rahul Vigneswaran, V. Balasubramanian · 07 Dec 2020

Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics
V. Ramasesh, Ethan Dyer, M. Raghu · CLL · 14 Jul 2020

Predicting Neural Network Accuracy from Weights
Thomas Unterthiner, Daniel Keysers, Sylvain Gelly, Olivier Bousquet, Ilya O. Tolstikhin · 26 Feb 2020

Layerwise Noise Maximisation to Train Low-Energy Deep Neural Networks
Sébastien Henwood, François Leduc-Primeau, Yvon Savaria · 23 Dec 2019

Fast Hardware-Aware Neural Architecture Search
Li Lyna Zhang, Yuqing Yang, Yuhang Jiang, Wenwu Zhu, Yunxin Liu · 3DV · 25 Oct 2019

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang · ODL · 15 Sep 2016

Benefits of depth in neural networks
Matus Telgarsky · 14 Feb 2016