arXiv: 2107.05802
How many degrees of freedom do we need to train deep networks: a loss landscape perspective
Brett W. Larsen, Stanislav Fort, Nico Becker, Surya Ganguli. 13 July 2021. [UQCV]
Papers citing "How many degrees of freedom do we need to train deep networks: a loss landscape perspective" (19 papers):
1. Parameter-Efficient Fine-Tuning of Large Language Models using Semantic Knowledge Tuning. Nusrat Jahan Prottasha, Asif Mahmud, Md. Shohanur Islam Sobuj, Prakash Bhat, Md. Kowsher, Niloofar Yousefi, O. Garibay. 11 Oct 2024.
2. Propulsion: Steering LLM with Tiny Fine-Tuning. Md. Kowsher, Nusrat Jahan Prottasha, Prakash Bhat. 17 Sep 2024.
3. Memory-Efficient LLM Training with Online Subspace Descent. Kaizhao Liang, Bo Liu, Lizhang Chen, Qiang Liu. 23 Aug 2024.
4. VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections. Roy Miles, Pradyumna Reddy, Ismail Elezi, Jiankang Deng. 28 May 2024. [VLM]
5. LoQT: Low Rank Adapters for Quantized Training. Sebastian Loeschcke, M. Toftrup, M. Kastoryano, Serge Belongie, Vésteinn Snæbjarnarson. 26 May 2024. [MQ]
6. Insights into the Lottery Ticket Hypothesis and Iterative Magnitude Pruning. Tausifa Jan Saleem, Ramanjit Ahuja, Surendra Prasad, Brejesh Lall. 22 Mar 2024.
7. ECToNAS: Evolutionary Cross-Topology Neural Architecture Search. Elisabeth J. Schiessler, R. Aydin, C. Cyron. 08 Mar 2024.
8. GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection. Jiawei Zhao, Zhenyu Zhang, Beidi Chen, Zhangyang Wang, A. Anandkumar, Yuandong Tian. 06 Mar 2024.
9. Identifying Policy Gradient Subspaces. Jan Schneider-Barnes, Pierre Schumacher, Simon Guist, Tianyu Cui, Daniel Haeufle, Bernhard Scholkopf, Le Chen. 12 Jan 2024.
10. Detecting Toxic Flow. Álvaro Cartea, Gerardo Duran-Martin, Leandro Sánchez-Betancourt. 10 Dec 2023.
11. How Sparse Can We Prune A Deep Network: A Fundamental Limit Viewpoint. Qiaozhe Zhang, Rui-qi Zhang, Jun Sun, Yingzhuang Liu. 09 Jun 2023.
12. Sparse Weight Averaging with Multiple Particles for Iterative Magnitude Pruning. Moonseok Choi, Hyungi Lee, G. Nam, Juho Lee. 24 May 2023.
13. Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States. Ziqiao Wang, Yongyi Mao. 19 Nov 2022.
14. Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models. Cheng Ma, Yang Liu, Jiankang Deng, Lingxi Xie, Weiming Dong, Changsheng Xu. 04 Nov 2022. [VLM, VPVLM]
15. What does a deep neural network confidently perceive? The effective dimension of high certainty class manifolds and their low confidence boundaries. Stanislav Fort, E. D. Cubuk, Surya Ganguli, S. Schoenholz. 11 Oct 2022.
16. Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask? Mansheej Paul, F. Chen, Brett W. Larsen, Jonathan Frankle, Surya Ganguli, Gintare Karolina Dziugaite. 06 Oct 2022. [UQCV]
17. Few-Shot Learning by Dimensionality Reduction in Gradient Space. M. Gauch, M. Beck, Thomas Adler, D. Kotsur, Stefan Fiel, ..., Markus Holzleitner, Werner Zellinger, D. Klotz, Sepp Hochreiter, Sebastian Lehner. 07 Jun 2022.
18. Efficient Online Bayesian Inference for Neural Bandits. Gerardo Duran-Martín, Aleyna Kara, Kevin Patrick Murphy. 01 Dec 2021. [BDL]
19. The large learning rate phase of deep learning: the catapult mechanism. Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari. 04 Mar 2020. [ODL]