arXiv: 2107.05802
How many degrees of freedom do we need to train deep networks: a loss landscape perspective
Brett W. Larsen, Stanislav Fort, Nico Becker, Surya Ganguli. 13 July 2021. [UQCV]
Papers citing "How many degrees of freedom do we need to train deep networks: a loss landscape perspective" (19 papers):
1. Parameter-Efficient Fine-Tuning of Large Language Models using Semantic Knowledge Tuning. Nusrat Jahan Prottasha, Asif Mahmud, Md. Shohanur Islam Sobuj, Prakash Bhat, Md. Kowsher, Niloofar Yousefi, O. Garibay. 11 Oct 2024.
2. Propulsion: Steering LLM with Tiny Fine-Tuning. Md. Kowsher, Nusrat Jahan Prottasha, Prakash Bhat. 17 Sep 2024.
3. Memory-Efficient LLM Training with Online Subspace Descent. Kaizhao Liang, Bo Liu, Lizhang Chen, Qiang Liu. 23 Aug 2024.
4. VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections. Roy Miles, Pradyumna Reddy, Ismail Elezi, Jiankang Deng. 28 May 2024. [VLM]
5. LoQT: Low Rank Adapters for Quantized Training. Sebastian Loeschcke, M. Toftrup, M. Kastoryano, Serge Belongie, Vésteinn Snæbjarnarson. 26 May 2024. [MQ]
6. Insights into the Lottery Ticket Hypothesis and Iterative Magnitude Pruning. Tausifa Jan Saleem, Ramanjit Ahuja, Surendra Prasad, Brejesh Lall. 22 Mar 2024.
7. ECToNAS: Evolutionary Cross-Topology Neural Architecture Search. Elisabeth J. Schiessler, R. Aydin, C. Cyron. 08 Mar 2024.
8. GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection. Jiawei Zhao, Zhenyu Zhang, Beidi Chen, Zhangyang Wang, A. Anandkumar, Yuandong Tian. 06 Mar 2024.
9. Identifying Policy Gradient Subspaces. Jan Schneider-Barnes, Pierre Schumacher, Simon Guist, Tianyu Cui, Daniel Haeufle, Bernhard Scholkopf, Le Chen. 12 Jan 2024.
10. Detecting Toxic Flow. Álvaro Cartea, Gerardo Duran-Martin, Leandro Sánchez-Betancourt. 10 Dec 2023.
11. How Sparse Can We Prune A Deep Network: A Fundamental Limit Viewpoint. Qiaozhe Zhang, Rui-qi Zhang, Jun Sun, Yingzhuang Liu. 09 Jun 2023.
12. Sparse Weight Averaging with Multiple Particles for Iterative Magnitude Pruning. Moonseok Choi, Hyungi Lee, G. Nam, Juho Lee. 24 May 2023.
13. Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States. Ziqiao Wang, Yongyi Mao. 19 Nov 2022.
14. Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models. Cheng Ma, Yang Liu, Jiankang Deng, Lingxi Xie, Weiming Dong, Changsheng Xu. 04 Nov 2022. [VLM, VPVLM]
15. What does a deep neural network confidently perceive? The effective dimension of high certainty class manifolds and their low confidence boundaries. Stanislav Fort, E. D. Cubuk, Surya Ganguli, S. Schoenholz. 11 Oct 2022.
16. Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask? Mansheej Paul, F. Chen, Brett W. Larsen, Jonathan Frankle, Surya Ganguli, Gintare Karolina Dziugaite. 06 Oct 2022. [UQCV]
17. Few-Shot Learning by Dimensionality Reduction in Gradient Space. M. Gauch, M. Beck, Thomas Adler, D. Kotsur, Stefan Fiel, ..., Markus Holzleitner, Werner Zellinger, D. Klotz, Sepp Hochreiter, Sebastian Lehner. 07 Jun 2022.
18. Efficient Online Bayesian Inference for Neural Bandits. Gerardo Duran-Martín, Aleyna Kara, Kevin Patrick Murphy. 01 Dec 2021. [BDL]
19. The large learning rate phase of deep learning: the catapult mechanism. Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari. 04 Mar 2020. [ODL]