Are All Layers Created Equal? (arXiv:1902.01996)
Chiyuan Zhang, Samy Bengio, Y. Singer
6 February 2019
Papers citing "Are All Layers Created Equal?" (34 of 34 papers shown)
Adapting Newton's Method to Neural Networks through a Summary of Higher-Order Derivatives
Pierre Wolinski · ODL · 06 Dec 2023

Reset It and Forget It: Relearning Last-Layer Weights Improves Continual and Transfer Learning
Lapo Frati, Neil Traft, Jeff Clune, Nick Cheney · CLL · 12 Oct 2023

Iterative Magnitude Pruning as a Renormalisation Group: A Study in The Context of The Lottery Ticket Hypothesis
Abu-Al Hassan · 06 Aug 2023

Layer-wise Linear Mode Connectivity
Linara Adilova, Maksym Andriushchenko, Michael Kamp, Asja Fischer, Martin Jaggi · FedML, FAtt, MoMe · 13 Jul 2023

No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Jean Kaddour, Oscar Key, Piotr Nawrot, Pasquale Minervini, Matt J. Kusner · 12 Jul 2023

On the Lipschitz Constant of Deep Networks and Double Descent
Matteo Gamba, Hossein Azizpour, Marten Bjorkman · 28 Jan 2023

Leveraging Unlabeled Data to Track Memorization
Mahsa Forouzesh, Hanie Sedghi, Patrick Thiran · NoLa, TDI · 08 Dec 2022

Learning to Generate Image Embeddings with User-level Differential Privacy
Zheng Xu, Maxwell D. Collins, Yuxiao Wang, Liviu Panait, Sewoong Oh, S. Augenstein, Ting Liu, Florian Schroff, H. B. McMahan · FedML · 20 Nov 2022

A Law of Data Separation in Deep Learning
Hangfeng He, Weijie J. Su · OOD · 31 Oct 2022

Surgical Fine-Tuning Improves Adaptation to Distribution Shifts
Yoonho Lee, Annie S. Chen, Fahim Tajwar, Ananya Kumar, Huaxiu Yao, Percy Liang, Chelsea Finn · OOD · 20 Oct 2022

Git Re-Basin: Merging Models modulo Permutation Symmetries
Samuel K. Ainsworth, J. Hayase, S. Srinivasa · MoMe · 11 Sep 2022

A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning
Da-Wei Zhou, Qiwen Wang, Han-Jia Ye, De-Chuan Zhan · 26 May 2022

The Primacy Bias in Deep Reinforcement Learning
Evgenii Nikishin, Max Schwarzer, P. D'Oro, Pierre-Luc Bacon, Aaron C. Courville · OnRL · 16 May 2022

Token Dropping for Efficient BERT Pretraining
Le Hou, Richard Yuanzhe Pang, Tianyi Zhou, Yuexin Wu, Xinying Song, Xiaodan Song, Denny Zhou · 24 Mar 2022

What Makes Transfer Learning Work For Medical Images: Feature Reuse & Other Factors
Christos Matsoukas, Johan Fredin Haslum, Moein Sorkhei, Magnus P Soderberg, Kevin Smith · VLM, OOD, MedIm · 02 Mar 2022

DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization
Aviral Kumar, Rishabh Agarwal, Tengyu Ma, Aaron Courville, George Tucker, Sergey Levine · OffRL · 09 Dec 2021

Compare Where It Matters: Using Layer-Wise Regularization To Improve Federated Learning on Heterogeneous Data
Ha Min Son, M. Kim, T. Chung · FedML · 01 Dec 2021

Visualizing the Emergence of Intermediate Visual Patterns in DNNs
Mingjie Li, Shaobo Wang, Quanshi Zhang · 05 Nov 2021

Partial Variable Training for Efficient On-Device Federated Learning
Tien-Ju Yang, Dhruv Guliani, F. Beaufays, Giovanni Motta · FedML · 11 Oct 2021

Enabling On-Device Training of Speech Recognition Models with Federated Dropout
Dhruv Guliani, Lillian Zhou, Changwan Ryu, Tien-Ju Yang, Harry Zhang, Yong Xiao, F. Beaufays, Giovanni Motta · FedML · 07 Oct 2021

Efficient and Private Federated Learning with Partially Trainable Networks
Hakim Sidahmed, Zheng Xu, Ankush Garg, Yuan Cao, Mingqing Chen · FedML · 06 Oct 2021

What can linear interpolation of neural network loss landscapes tell us?
Tiffany J. Vlaar, Jonathan Frankle · MoMe · 30 Jun 2021

Randomness In Neural Network Training: Characterizing The Impact of Tooling
Donglin Zhuang, Xingyao Zhang, S. Song, Sara Hooker · 22 Jun 2021

MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, ..., Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy · 04 May 2021

Experiments with Rich Regime Training for Deep Learning
Xinyan Li, A. Banerjee · 26 Feb 2021

Reservoir Transformers
Sheng Shen, Alexei Baevski, Ari S. Morcos, Kurt Keutzer, Michael Auli, Douwe Kiela · 30 Dec 2020

FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training
Y. Fu, Haoran You, Yang Katie Zhao, Yue Wang, Chaojian Li, K. Gopalakrishnan, Zhangyang Wang, Yingyan Lin · MQ · 24 Dec 2020

A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization
Adepu Ravi Sankar, Yash Khasbage, Rahul Vigneswaran, V. Balasubramanian · 07 Dec 2020

Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics
V. Ramasesh, Ethan Dyer, M. Raghu · CLL · 14 Jul 2020

Predicting Neural Network Accuracy from Weights
Thomas Unterthiner, Daniel Keysers, Sylvain Gelly, Olivier Bousquet, Ilya O. Tolstikhin · 26 Feb 2020

Layerwise Noise Maximisation to Train Low-Energy Deep Neural Networks
Sébastien Henwood, François Leduc-Primeau, Yvon Savaria · 23 Dec 2019

Fast Hardware-Aware Neural Architecture Search
Li Lyna Zhang, Yuqing Yang, Yuhang Jiang, Wenwu Zhu, Yunxin Liu · 3DV · 25 Oct 2019

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang · ODL · 15 Sep 2016

Benefits of depth in neural networks
Matus Telgarsky · 14 Feb 2016