How much pre-training is enough to discover a good subnetwork?

31 July 2021
Cameron R. Wolfe
Fangshuo Liao
Qihan Wang
Junhyung Lyle Kim
Anastasios Kyrillidis

Papers citing "How much pre-training is enough to discover a good subnetwork?"

50 / 64 papers shown
PERP: Rethinking the Prune-Retrain Paradigm in the Era of LLMs
Max Zimmer
Megi Andoni
Christoph Spiegel
Sebastian Pokutta
VLM
143
10
0
23 Dec 2023
Aiming towards the minimizers: fast convergence of SGD for overparametrized problems
Chaoyue Liu
Dmitriy Drusvyatskiy
M. Belkin
Damek Davis
Yi-An Ma
ODL
77
18
0
05 Jun 2023
Strong Lottery Ticket Hypothesis with $\varepsilon$--perturbation
Zheyang Xiong
Fangshuo Liao
Anastasios Kyrillidis
56
2
0
29 Oct 2022
Subquadratic Overparameterization for Shallow Neural Networks
Chaehwan Song
Ali Ramezani-Kebrya
Thomas Pethick
Armin Eftekhari
Volkan Cevher
76
31
0
02 Nov 2021
Pruning and Quantization for Deep Neural Network Acceleration: A Survey
Tailin Liang
C. Glossner
Lei Wang
Shaobo Shi
Xiaotong Zhang
MQ
231
701
0
24 Jan 2021
On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths
Quynh N. Nguyen
122
49
0
24 Jan 2021
EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets
Xiaohan Chen
Yu Cheng
Shuohang Wang
Zhe Gan
Zhangyang Wang
Jingjing Liu
110
100
0
31 Dec 2020
Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks
Quynh N. Nguyen
Marco Mondelli
Guido Montúfar
78
83
0
21 Dec 2020
Provable Benefits of Overparameterization in Model Compression: From Double Descent to Pruning Neural Networks
Xiangyu Chang
Yingcong Li
Samet Oymak
Christos Thrampoulidis
68
51
0
16 Dec 2020
The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models
Tianlong Chen
Jonathan Frankle
Shiyu Chang
Sijia Liu
Yang Zhang
Michael Carbin
Zhangyang Wang
68
123
0
12 Dec 2020
The Lottery Ticket Hypothesis for Object Recognition
Sharath Girish
Shishira R. Maiya
Kamal Gupta
Hao Chen
L. Davis
Abhinav Shrivastava
138
61
0
08 Dec 2020
Greedy Optimization Provably Wins the Lottery: Logarithmic Number of Winning Tickets is Enough
Mao Ye
Lemeng Wu
Qiang Liu
61
17
0
29 Oct 2020
Deep Neural Network Training with Frank-Wolfe
Sebastian Pokutta
Christoph Spiegel
Max Zimmer
68
27
0
14 Oct 2020
Gradient Flow in Sparse Neural Networks and How Lottery Tickets Win
Utku Evci
Yani Andrew Ioannou
Cem Keskin
Yann N. Dauphin
56
94
0
07 Oct 2020
Pruning Neural Networks at Initialization: Why are We Missing the Mark?
Jonathan Frankle
Gintare Karolina Dziugaite
Daniel M. Roy
Michael Carbin
67
240
0
18 Sep 2020
Logarithmic Pruning is All You Need
Laurent Orseau
Marcus Hutter
Omar Rivasplata
87
89
0
22 Jun 2020
Optimal Lottery Tickets via SubsetSum: Logarithmic Over-Parameterization is Sufficient
Ankit Pensia
Shashank Rajput
Alliot Nagle
Harit Vishwakarma
Dimitris Papailiopoulos
60
104
0
14 Jun 2020
A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization Via Overparameterization From Depth
Yiping Lu
Chao Ma
Yulong Lu
Jianfeng Lu
Lexing Ying
MLT
153
79
0
11 Mar 2020
What is the State of Neural Network Pruning?
Davis W. Blalock
Jose Javier Gonzalez Ortiz
Jonathan Frankle
John Guttag
280
1,054
0
06 Mar 2020
Comparing Rewinding and Fine-tuning in Neural Network Pruning
Alex Renda
Jonathan Frankle
Michael Carbin
304
388
0
05 Mar 2020
Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection
Mao Ye
Chengyue Gong
Lizhen Nie
Denny Zhou
Adam R. Klivans
Qiang Liu
84
111
0
03 Mar 2020
Loss landscapes and optimization in over-parameterized non-linear systems and neural networks
Chaoyue Liu
Libin Zhu
M. Belkin
ODL
96
265
0
29 Feb 2020
On Layer Normalization in the Transformer Architecture
Ruibin Xiong
Yunchang Yang
Di He
Kai Zheng
Shuxin Zheng
Chen Xing
Huishuai Zhang
Yanyan Lan
Liwei Wang
Tie-Yan Liu
AI4CE
153
998
0
12 Feb 2020
Proving the Lottery Ticket Hypothesis: Pruning is All You Need
Eran Malach
Gilad Yehudai
Shai Shalev-Shwartz
Ohad Shamir
112
276
0
03 Feb 2020
What's Hidden in a Randomly Weighted Neural Network?
Vivek Ramanujan
Mitchell Wortsman
Aniruddha Kembhavi
Ali Farhadi
Mohammad Rastegari
66
361
0
29 Nov 2019
Rigging the Lottery: Making All Tickets Winners
Utku Evci
Trevor Gale
Jacob Menick
Pablo Samuel Castro
Erich Elsen
199
607
0
25 Nov 2019
SiPPing Neural Networks: Sensitivity-informed Provable Pruning of Neural Networks
Cenk Baykal
Lucas Liebenwein
Igor Gilitschenski
Dan Feldman
Daniela Rus
70
18
0
11 Oct 2019
Finite Depth and Width Corrections to the Neural Tangent Kernel
Boris Hanin
Mihai Nica
MDE
79
152
0
13 Sep 2019
One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers
Ari S. Morcos
Haonan Yu
Michela Paganini
Yuandong Tian
79
229
0
06 Jun 2019
Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask
Hattie Zhou
Janice Lan
Rosanne Liu
J. Yosinski
UQCV
71
389
0
03 May 2019
The State of Sparsity in Deep Neural Networks
Trevor Gale
Erich Elsen
Sara Hooker
165
763
0
25 Feb 2019
Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit
Song Mei
Theodor Misiakiewicz
Andrea Montanari
MLT
84
279
0
16 Feb 2019
Towards moderate overparameterization: global convergence guarantees for training shallow neural networks
Samet Oymak
Mahdi Soltanolkotabi
61
323
0
12 Feb 2019
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Sanjeev Arora
S. Du
Wei Hu
Zhiyuan Li
Ruosong Wang
MLT
223
974
0
24 Jan 2019
Training Neural Networks with Local Error Signals
Arild Nøkland
L. Eidnes
105
228
0
20 Jan 2019
Greedy Layerwise Learning Can Scale to ImageNet
Eugene Belilovsky
Michael Eickenberg
Edouard Oyallon
130
181
0
29 Dec 2018
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
Zeyuan Allen-Zhu
Yuanzhi Li
Yingyu Liang
MLT
201
775
0
12 Nov 2018
Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du
Jason D. Lee
Haochuan Li
Liwei Wang
Masayoshi Tomizuka
ODL
240
1,136
0
09 Nov 2018
Discrimination-aware Channel Pruning for Deep Neural Networks
Zhuangwei Zhuang
Mingkui Tan
Bohan Zhuang
Jing Liu
Yong Guo
Qingyao Wu
Junzhou Huang
Jin-Hui Zhu
134
601
0
28 Oct 2018
Rethinking the Value of Network Pruning
Zhuang Liu
Mingjie Sun
Tinghui Zhou
Gao Huang
Trevor Darrell
42
1,477
0
11 Oct 2018
Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data
Yuanzhi Li
Yingyu Liang
MLT
222
653
0
03 Aug 2018
Learning ReLU Networks via Alternating Minimization
Gauri Jagatap
Chinmay Hegde
40
11
0
20 Jun 2018
Learning One-hidden-layer ReLU Networks via Gradient Descent
Xiao Zhang
Yaodong Yu
Lingxiao Wang
Quanquan Gu
MLT
129
135
0
20 Jun 2018
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot
Franck Gabriel
Clément Hongler
277
3,225
0
20 Jun 2018
On Tighter Generalization Bound for Deep Neural Networks: CNNs, ResNets, and Beyond
Xingguo Li
Junwei Lu
Zhaoran Wang
Jarvis Haupt
T. Zhao
57
80
0
13 Jun 2018
A Mean Field View of the Landscape of Two-Layers Neural Networks
Song Mei
Andrea Montanari
Phan-Minh Nguyen
MLT
109
863
0
18 Apr 2018
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
Jonathan Frankle
Michael Carbin
277
3,489
0
09 Mar 2018
Learning One Convolutional Layer with Overlapping Patches
Surbhi Goel
Adam R. Klivans
Raghu Meka
MLT
80
81
0
07 Feb 2018
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Mark Sandler
Andrew G. Howard
Menglong Zhu
A. Zhmoginov
Liang-Chieh Chen
218
19,353
0
13 Jan 2018
NISP: Pruning Networks using Neuron Importance Score Propagation
Ruichi Yu
Ang Li
Chun-Fu Chen
Jui-Hsin Lai
Vlad I. Morariu
Xintong Han
M. Gao
Ching-Yung Lin
L. Davis
74
800
0
16 Nov 2017