ResearchTrend.AI

Learning Sparse Neural Networks through $L_0$ Regularization
arXiv: 1712.01312 · 4 December 2017
Christos Louizos, Max Welling, Diederik P. Kingma
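The cited paper sparsifies a network by attaching a stochastic "hard concrete" gate to each weight and penalizing the expected number of non-zero gates, which is differentiable via the reparameterization trick. A minimal NumPy sketch of that gate, assuming the paper's suggested defaults (temperature 2/3, stretch interval (-0.1, 1.1)); the helper names `sample_hard_concrete` and `expected_l0` are illustrative, not the authors' code:

```python
import numpy as np

def sample_hard_concrete(log_alpha, beta=2/3, gamma=-0.1, zeta=1.1, rng=None):
    """Sample a hard-concrete gate z in [0, 1] per weight.

    A binary-concrete sample is stretched to (gamma, zeta) and
    clamped to [0, 1], so gates can be exactly 0 or exactly 1.
    """
    if rng is None:
        rng = np.random.default_rng()
    u = rng.uniform(1e-6, 1 - 1e-6, size=np.shape(log_alpha))
    # Binary concrete via the logistic reparameterization trick
    s = 1.0 / (1.0 + np.exp(-(np.log(u) - np.log(1 - u) + log_alpha) / beta))
    # Stretch, then clamp ("hard" concrete)
    return np.clip(s * (zeta - gamma) + gamma, 0.0, 1.0)

def expected_l0(log_alpha, beta=2/3, gamma=-0.1, zeta=1.1):
    """Expected L0 penalty: sum over gates of P(z != 0)."""
    return np.sum(1.0 / (1.0 + np.exp(-(log_alpha - beta * np.log(-gamma / zeta)))))
```

At train time the sampled gates multiply the weights and `expected_l0` is added to the loss with a sparsity coefficient; at test time the gates are replaced by their clamped expectation.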

Papers citing "Learning Sparse Neural Networks through $L_0$ Regularization"

Showing 50 of 221 citing papers.
  • Super-fast rates of convergence for Neural Networks Classifiers under the Hard Margin Condition (13 May 2025). Nathanael Tepakbong, Ding-Xuan Zhou, Xiang Zhou.
  • A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications (10 Mar 2025). Siyuan Mu, Sen Lin. [MoE]
  • MaskPrune: Mask-based LLM Pruning for Layer-wise Uniform Structures (19 Feb 2025). Jiayu Qin, Jianchao Tan, Kaipeng Zhang, Xunliang Cai, Wei Wang.
  • Advancing Weight and Channel Sparsification with Enhanced Saliency (05 Feb 2025). Xinglong Sun, Maying Shen, Hongxu Yin, Lei Mao, Pavlo Molchanov, Jose M. Alvarez.
  • Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries (04 Feb 2025). Chris Kolb, T. Weber, Bernd Bischl, David Rügamer.
  • Playing the Lottery With Concave Regularizers for Sparse Trainable Neural Networks (19 Jan 2025). Giulia Fracastoro, Sophie M. Fosson, Andrea Migliorati, G. Calafiore.
  • Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models (02 Oct 2024). Philipp Mondorf, Sondre Wold, Yun Xue.
  • Evaluating Model Robustness Using Adaptive Sparse L0 Regularization (28 Aug 2024). Weiyou Liu, Zhenyang Li, Weitong Chen. [AAML]
  • Mask in the Mirror: Implicit Sparsification (19 Aug 2024). Tom Jacobs, R. Burkholz.
  • Isomorphic Pruning for Vision Models (05 Jul 2024). Gongfan Fang, Xinyin Ma, Michael Bi Mi, Xinchao Wang. [VLM, ViT]
  • Finding Transformer Circuits with Edge Pruning (24 Jun 2024). Adithya Bhaskar, Alexander Wettig, Dan Friedman, Danqi Chen.
  • Geometric sparsification in recurrent neural networks (10 Jun 2024). Wyatt Mackey, Ioannis Schizas, Jared Deighton, David L. Boothe, Jr., Vasileios Maroulas.
  • Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness (06 Jun 2024). Guangliang Liu, Milad Afshari, Xitong Zhang, Zhiyu Xue, Avrajit Ghosh, Bidhan Bashyal, Rongrong Wang, K. Johnson.
  • S3D: A Simple and Cost-Effective Self-Speculative Decoding Scheme for Low-Memory GPUs (30 May 2024). Wei Zhong, Manasa Bharadwaj.
  • A separability-based approach to quantifying generalization: which layer is best? (02 May 2024). Luciano Dyballa, Evan Gerritz, Steven W. Zucker. [OOD]
  • AdaFSNet: Time Series Classification Based on Convolutional Network with a Adaptive and Effective Kernel Size Configuration (28 Apr 2024). Haoxiao Wang, Bo Peng, Jianhua Zhang, Xu Cheng. [AI4TS]
  • The Simpler The Better: An Entropy-Based Importance Metric To Reduce Neural Networks' Depth (27 Apr 2024). Victor Quétu, Zhu Liao, Enzo Tartaglione.
  • Where does In-context Translation Happen in Large Language Models (07 Mar 2024). Suzanna Sia, David Mueller, Kevin Duh. [LRM]
  • Always-Sparse Training by Growing Connections with Guided Stochastic Exploration (12 Jan 2024). Mike Heddes, Narayan Srinivasa, T. Givargis, Alexandru Nicolau.
  • The LLM Surgeon (28 Dec 2023). Tycho F. A. van der Ouderaa, Markus Nagel, M. V. Baalen, Yuki Markus Asano, Tijmen Blankevoort.
  • Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs (21 Nov 2023). Shivam Aggarwal, Hans Jakob Damsgaard, Alessandro Pappalardo, Giuseppe Franco, Thomas B. Preußer, Michaela Blott, Tulika Mitra. [MQ]
  • End-to-end Feature Selection Approach for Learning Skinny Trees (28 Oct 2023). Shibal Ibrahim, Kayhan Behdin, Rahul Mazumder.
  • Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model (19 Oct 2023). Abhijith Chintam, Rahel Beloch, Willem H. Zuidema, Michael Hanna, Oskar van der Wal.
  • f-Divergence Minimization for Sequence-Level Knowledge Distillation (27 Jul 2023). Yuqiao Wen, Zichao Li, Wenyu Du, Lili Mou.
  • Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing (22 Jun 2023). Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort. [MQ]
  • A Simple and Effective Pruning Approach for Large Language Models (20 Jun 2023). Mingjie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter.
  • LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation (20 Jun 2023). Yixiao Li, Yifan Yu, Qingru Zhang, Chen Liang, Pengcheng He, Weizhu Chen, Tuo Zhao.
  • Spatial Re-parameterization for N:M Sparsity (09 Jun 2023). Yuxin Zhang, Mingbao Lin, Mingliang Xu, Yonghong Tian, Rongrong Ji.
  • Adaptive Sparsity Level during Training for Efficient Time Series Forecasting with Transformers (28 May 2023). Zahra Atashgahi, Mykola Pechenizkiy, Raymond N. J. Veldhuis, Decebal Constantin Mocanu. [AI4TS, AI4CE]
  • GC-Flow: A Graph-Based Flow Network for Effective Clustering (26 May 2023). Tianchun Wang, F. Mirzazadeh, Xinming Zhang, Jing Chen. [BDL]
  • How do languages influence each other? Studying cross-lingual data sharing during LM fine-tuning (22 May 2023). Rochelle Choenni, Dan Garrette, Ekaterina Shutova.
  • Discovering Causal Relations and Equations from Data (21 May 2023). Gustau Camps-Valls, Andreas Gerhardus, Urmi Ninad, Gherardo Varando, Georg Martius, E. Balaguer-Ballester, Ricardo Vinuesa, Emiliano Díaz, L. Zanna, Jakob Runge. [PINN, AI4Cl, AI4CE, CML]
  • Consistent Multi-Granular Rationale Extraction for Explainable Multi-hop Fact Verification (16 May 2023). Jiasheng Si, Yingjie Zhu, Deyu Zhou. [AAML]
  • SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving (12 May 2023). Minjae Lee, Seongmin Park, Hyung-Se Kim, Minyong Yoon, Jangwhan Lee, Junwon Choi, Nam Sung Kim, Mingu Kang, Jungwook Choi. [3DPC]
  • Evil from Within: Machine Learning Backdoors through Hardware Trojans (17 Apr 2023). Alexander Warnecke, Julian Speith, Janka Möller, Konrad Rieck, C. Paar. [AAML]
  • VISION DIFFMASK: Faithful Interpretation of Vision Transformers with Differentiable Patch Masking (13 Apr 2023). A. Nalmpantis, Apostolos Panagiotopoulos, John Gkountouras, Konstantinos Papakostas, Wilker Aziz.
  • Surrogate Lagrangian Relaxation: A Path To Retrain-free Deep Neural Network Pruning (08 Apr 2023). Shangli Zhou, Mikhail A. Bragin, Lynn Pepin, Deniz Gurevin, Fei Miao, Caiwen Ding.
  • NTK-SAP: Improving neural network pruning by aligning training dynamics (06 Apr 2023). Yite Wang, Dawei Li, Ruoyu Sun.
  • Learning Sparsity of Representations with Discrete Latent Variables (03 Apr 2023). Zhao Xu, Daniel Oñoro-Rubio, G. Serra, Mathias Niepert.
  • Illuminati: Towards Explaining Graph Neural Networks for Cybersecurity Analysis (26 Mar 2023). Haoyu He, Yuede Ji, H. H. Huang.
  • Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning (21 Mar 2023). Sung-Feng Huang, Chia-Ping Chen, Zhi-Sheng Chen, Yu-Pao Tsai, Hung-yi Lee.
  • Memorization Capacity of Neural Networks with Conditional Computation (20 Mar 2023). Erdem Koyuncu.
  • Induced Feature Selection by Structured Pruning (20 Mar 2023). Nathan Hubens, V. Delvigne, M. Mancas, B. Gosselin, Marius Preda, T. Zaharia.
  • Efficient Computation Sharing for Multi-Task Visual Scene Understanding (16 Mar 2023). Sara Shoouri, Mingyu Yang, Zichen Fan, Hun-Seok Kim. [MoE]
  • A High-Performance Accelerator for Super-Resolution Processing on Embedded GPU (16 Mar 2023). W. Zhao, Qi Sun, Yang Bai, Wenbo Li, Haisheng Zheng, Bei Yu, Martin D. F. Wong. [SupR]
  • On Model Compression for Neural Networks: Framework, Algorithm, and Convergence Guarantee (13 Mar 2023). Chenyang Li, Jihoon Chung, Mengnan Du, Haimin Wang, Xianlian Zhou, Bohao Shen.
  • Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together! (03 Mar 2023). Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen, Tianjin Huang, Ajay Jaiswal, Zhangyang Wang.
  • DSD$^2$: Can We Dodge Sparse Double Descent and Compress the Neural Network Worry-Free? (02 Mar 2023). Victor Quétu, Enzo Tartaglione.
  • Balanced Training for Sparse GANs (28 Feb 2023). Yite Wang, Jing Wu, N. Hovakimyan, Ruoyu Sun.
  • Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding (27 Feb 2023). Yifan Peng, Kwangyoun Kim, Felix Wu, Prashant Sridhar, Shinji Watanabe.