Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models

25 May 2022
Clara Na
Sanket Vaibhav Mehta
Emma Strubell
arXiv: 2205.12694 (abs / PDF / HTML)

Papers citing "Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models"

16 / 66 papers shown
 1. Non-Vacuous Generalization Bounds at the ImageNet Scale: A PAC-Bayesian Compression Approach
    Wenda Zhou, Victor Veitch, Morgane Austern, Ryan P. Adams, Peter Orbanz (16 Apr 2018)
 2. Averaging Weights Leads to Wider Optima and Better Generalization
    Pavel Izmailov, Dmitrii Podoprikhin, T. Garipov, Dmitry Vetrov, A. Wilson (14 Mar 2018)
 3. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
    Jonathan Frankle, Michael Carbin (09 Mar 2018)
 4. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
    Benoit Jacob, S. Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew G. Howard, Hartwig Adam, Dmitry Kalenichenko (15 Dec 2017)
 5. Learning Sparse Neural Networks through $L_0$ Regularization
    Christos Louizos, Max Welling, Diederik P. Kingma (04 Dec 2017)
 6. SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation
    Daniel Cer, Mona T. Diab, Eneko Agirre, I. Lopez-Gazpio, Lucia Specia (31 Jul 2017)
 7. Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes
    Lei Wu, Zhanxing Zhu, E. Weinan (30 Jun 2017)
 8. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference
    Adina Williams, Nikita Nangia, Samuel R. Bowman (18 Apr 2017)
 9. Sharp Minima Can Generalize For Deep Nets
    Laurent Dinh, Razvan Pascanu, Samy Bengio, Yoshua Bengio (15 Mar 2017)
10. Entropy-SGD: Biasing Gradient Descent Into Wide Valleys
    Pratik Chaudhari, A. Choromańska, Stefano Soatto, Yann LeCun, Carlo Baldassi, C. Borgs, J. Chayes, Levent Sagun, R. Zecchina (06 Nov 2016)
11. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
    N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang (15 Sep 2016)
12. SQuAD: 100,000+ Questions for Machine Comprehension of Text
    Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, Percy Liang (16 Jun 2016)
13. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
    Song Han, Huizi Mao, W. Dally (01 Oct 2015)
14. Learning both Weights and Connections for Efficient Neural Networks
    Song Han, Jeff Pool, J. Tran, W. Dally (08 Jun 2015)
15. Distilling the Knowledge in a Neural Network
    Geoffrey E. Hinton, Oriol Vinyals, J. Dean (09 Mar 2015)
16. Adam: A Method for Stochastic Optimization
    Diederik P. Kingma, Jimmy Ba (22 Dec 2014)