2108.00259
Cited By
How much pre-training is enough to discover a good subnetwork?
31 July 2021
Cameron R. Wolfe
Fangshuo Liao
Qihan Wang
Junhyung Lyle Kim
Anastasios Kyrillidis
Papers citing
"How much pre-training is enough to discover a good subnetwork?"
50 / 64 papers shown
Title
PERP: Rethinking the Prune-Retrain Paradigm in the Era of LLMs
Max Zimmer
Megi Andoni
Christoph Spiegel
Sebastian Pokutta
VLM
143
10
0
23 Dec 2023
Aiming towards the minimizers: fast convergence of SGD for overparametrized problems
Chaoyue Liu
Dmitriy Drusvyatskiy
M. Belkin
Damek Davis
Yi-An Ma
ODL
77
18
0
05 Jun 2023
Strong Lottery Ticket Hypothesis with ε-perturbation
Zheyang Xiong
Fangshuo Liao
Anastasios Kyrillidis
56
2
0
29 Oct 2022
Subquadratic Overparameterization for Shallow Neural Networks
Chaehwan Song
Ali Ramezani-Kebrya
Thomas Pethick
Armin Eftekhari
Volkan Cevher
76
31
0
02 Nov 2021
Pruning and Quantization for Deep Neural Network Acceleration: A Survey
Tailin Liang
C. Glossner
Lei Wang
Shaobo Shi
Xiaotong Zhang
MQ
231
701
0
24 Jan 2021
On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths
Quynh N. Nguyen
122
49
0
24 Jan 2021
EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets
Xiaohan Chen
Yu Cheng
Shuohang Wang
Zhe Gan
Zhangyang Wang
Jingjing Liu
110
100
0
31 Dec 2020
Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks
Quynh N. Nguyen
Marco Mondelli
Guido Montúfar
78
83
0
21 Dec 2020
Provable Benefits of Overparameterization in Model Compression: From Double Descent to Pruning Neural Networks
Xiangyu Chang
Yingcong Li
Samet Oymak
Christos Thrampoulidis
68
51
0
16 Dec 2020
The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models
Tianlong Chen
Jonathan Frankle
Shiyu Chang
Sijia Liu
Yang Zhang
Michael Carbin
Zhangyang Wang
68
123
0
12 Dec 2020
The Lottery Ticket Hypothesis for Object Recognition
Sharath Girish
Shishira R. Maiya
Kamal Gupta
Hao Chen
L. Davis
Abhinav Shrivastava
138
61
0
08 Dec 2020
Greedy Optimization Provably Wins the Lottery: Logarithmic Number of Winning Tickets is Enough
Mao Ye
Lemeng Wu
Qiang Liu
61
17
0
29 Oct 2020
Deep Neural Network Training with Frank-Wolfe
Sebastian Pokutta
Christoph Spiegel
Max Zimmer
68
27
0
14 Oct 2020
Gradient Flow in Sparse Neural Networks and How Lottery Tickets Win
Utku Evci
Yani Andrew Ioannou
Cem Keskin
Yann N. Dauphin
56
94
0
07 Oct 2020
Pruning Neural Networks at Initialization: Why are We Missing the Mark?
Jonathan Frankle
Gintare Karolina Dziugaite
Daniel M. Roy
Michael Carbin
67
240
0
18 Sep 2020
Logarithmic Pruning is All You Need
Laurent Orseau
Marcus Hutter
Omar Rivasplata
87
89
0
22 Jun 2020
Optimal Lottery Tickets via SubsetSum: Logarithmic Over-Parameterization is Sufficient
Ankit Pensia
Shashank Rajput
Alliot Nagle
Harit Vishwakarma
Dimitris Papailiopoulos
60
104
0
14 Jun 2020
A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization Via Overparameterization From Depth
Yiping Lu
Chao Ma
Yulong Lu
Jianfeng Lu
Lexing Ying
MLT
153
79
0
11 Mar 2020
What is the State of Neural Network Pruning?
Davis W. Blalock
Jose Javier Gonzalez Ortiz
Jonathan Frankle
John Guttag
280
1,054
0
06 Mar 2020
Comparing Rewinding and Fine-tuning in Neural Network Pruning
Alex Renda
Jonathan Frankle
Michael Carbin
304
388
0
05 Mar 2020
Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection
Mao Ye
Chengyue Gong
Lizhen Nie
Denny Zhou
Adam R. Klivans
Qiang Liu
84
111
0
03 Mar 2020
Loss landscapes and optimization in over-parameterized non-linear systems and neural networks
Chaoyue Liu
Libin Zhu
M. Belkin
ODL
96
265
0
29 Feb 2020
On Layer Normalization in the Transformer Architecture
Ruibin Xiong
Yunchang Yang
Di He
Kai Zheng
Shuxin Zheng
Chen Xing
Huishuai Zhang
Yanyan Lan
Liwei Wang
Tie-Yan Liu
AI4CE
153
998
0
12 Feb 2020
Proving the Lottery Ticket Hypothesis: Pruning is All You Need
Eran Malach
Gilad Yehudai
Shai Shalev-Shwartz
Ohad Shamir
112
276
0
03 Feb 2020
What's Hidden in a Randomly Weighted Neural Network?
Vivek Ramanujan
Mitchell Wortsman
Aniruddha Kembhavi
Ali Farhadi
Mohammad Rastegari
66
361
0
29 Nov 2019
Rigging the Lottery: Making All Tickets Winners
Utku Evci
Trevor Gale
Jacob Menick
Pablo Samuel Castro
Erich Elsen
199
607
0
25 Nov 2019
SiPPing Neural Networks: Sensitivity-informed Provable Pruning of Neural Networks
Cenk Baykal
Lucas Liebenwein
Igor Gilitschenski
Dan Feldman
Daniela Rus
70
18
0
11 Oct 2019
Finite Depth and Width Corrections to the Neural Tangent Kernel
Boris Hanin
Mihai Nica
MDE
79
152
0
13 Sep 2019
One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers
Ari S. Morcos
Haonan Yu
Michela Paganini
Yuandong Tian
79
229
0
06 Jun 2019
Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask
Hattie Zhou
Janice Lan
Rosanne Liu
J. Yosinski
UQCV
71
389
0
03 May 2019
The State of Sparsity in Deep Neural Networks
Trevor Gale
Erich Elsen
Sara Hooker
165
763
0
25 Feb 2019
Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit
Song Mei
Theodor Misiakiewicz
Andrea Montanari
MLT
84
279
0
16 Feb 2019
Towards moderate overparameterization: global convergence guarantees for training shallow neural networks
Samet Oymak
Mahdi Soltanolkotabi
61
323
0
12 Feb 2019
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Sanjeev Arora
S. Du
Wei Hu
Zhiyuan Li
Ruosong Wang
MLT
223
974
0
24 Jan 2019
Training Neural Networks with Local Error Signals
Arild Nøkland
L. Eidnes
105
228
0
20 Jan 2019
Greedy Layerwise Learning Can Scale to ImageNet
Eugene Belilovsky
Michael Eickenberg
Edouard Oyallon
130
181
0
29 Dec 2018
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
Zeyuan Allen-Zhu
Yuanzhi Li
Yingyu Liang
MLT
201
775
0
12 Nov 2018
Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du
Jason D. Lee
Haochuan Li
Liwei Wang
Masayoshi Tomizuka
ODL
240
1,136
0
09 Nov 2018
Discrimination-aware Channel Pruning for Deep Neural Networks
Zhuangwei Zhuang
Mingkui Tan
Bohan Zhuang
Jing Liu
Yong Guo
Qingyao Wu
Junzhou Huang
Jin-Hui Zhu
134
601
0
28 Oct 2018
Rethinking the Value of Network Pruning
Zhuang Liu
Mingjie Sun
Tinghui Zhou
Gao Huang
Trevor Darrell
42
1,477
0
11 Oct 2018
Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data
Yuanzhi Li
Yingyu Liang
MLT
222
653
0
03 Aug 2018
Learning ReLU Networks via Alternating Minimization
Gauri Jagatap
Chinmay Hegde
40
11
0
20 Jun 2018
Learning One-hidden-layer ReLU Networks via Gradient Descent
Xiao Zhang
Yaodong Yu
Lingxiao Wang
Quanquan Gu
MLT
129
135
0
20 Jun 2018
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot
Franck Gabriel
Clément Hongler
277
3,225
0
20 Jun 2018
On Tighter Generalization Bound for Deep Neural Networks: CNNs, ResNets, and Beyond
Xingguo Li
Junwei Lu
Zhaoran Wang
Jarvis Haupt
T. Zhao
57
80
0
13 Jun 2018
A Mean Field View of the Landscape of Two-Layers Neural Networks
Song Mei
Andrea Montanari
Phan-Minh Nguyen
MLT
109
863
0
18 Apr 2018
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
Jonathan Frankle
Michael Carbin
277
3,489
0
09 Mar 2018
Learning One Convolutional Layer with Overlapping Patches
Surbhi Goel
Adam R. Klivans
Raghu Meka
MLT
80
81
0
07 Feb 2018
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Mark Sandler
Andrew G. Howard
Menglong Zhu
A. Zhmoginov
Liang-Chieh Chen
218
19,353
0
13 Jan 2018
NISP: Pruning Networks using Neuron Importance Score Propagation
Ruichi Yu
Ang Li
Chun-Fu Chen
Jui-Hsin Lai
Vlad I. Morariu
Xintong Han
M. Gao
Ching-Yung Lin
L. Davis
74
800
0
16 Nov 2017