Top-KAST: Top-K Always Sparse Training

7 June 2021
Siddhant M. Jayakumar
Razvan Pascanu
Jack W. Rae
Simon Osindero
Erich Elsen
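
As a rough illustration of the idea named in the title, the sketch below keeps only the top-k largest-magnitude weights of a layer active on every forward pass, so the network is always sparse during training. The layer class, density value, and gradient handling are illustrative assumptions rather than the authors' reference implementation; in particular, the full Top-KAST method also backpropagates through a larger parameter set than the forward one, which this sketch omits.

```python
# Sketch of "top-k always sparse" training: the forward pass only ever uses
# the largest-magnitude fraction of weights. Illustrative only; not the
# paper's reference code.
import torch
import torch.nn.functional as F


def topk_mask(weight: torch.Tensor, density: float) -> torch.Tensor:
    """Return a 0/1 mask keeping the `density` fraction of largest-|w| entries."""
    k = max(1, int(density * weight.numel()))
    # The k-th largest magnitude is the (numel - k + 1)-th smallest.
    threshold = weight.abs().flatten().kthvalue(weight.numel() - k + 1).values
    return (weight.abs() >= threshold).to(weight.dtype)


class TopKLinear(torch.nn.Linear):
    """Linear layer whose forward pass uses only its top-k weights."""

    def __init__(self, in_features: int, out_features: int, density: float = 0.1):
        super().__init__(in_features, out_features)
        self.density = density

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Recompute the mask each step so the active set can change as
        # weight magnitudes evolve during training.
        mask = topk_mask(self.weight.detach(), self.density)
        return F.linear(x, self.weight * mask, self.bias)
```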

Papers citing "Top-KAST: Top-K Always Sparse Training"

32 / 32 papers shown
Differential Privacy Personalized Federated Learning Based on Dynamically Sparsified Client Updates
Chuanyin Wang
Yifei Zhang
Neng Gao
Qiang Luo
FedML
73
0
0
12 Mar 2025
E2ENet: Dynamic Sparse Feature Fusion for Accurate and Efficient 3D Medical Image Segmentation
Boqian Wu
Q. Xiao
Shiwei Liu
Lu Yin
Mykola Pechenizkiy
Decebal Constantin Mocanu
M. V. Keulen
Elena Mocanu
MedIm
65
4
0
20 Feb 2025
Advancing Weight and Channel Sparsification with Enhanced Saliency
Xinglong Sun
Maying Shen
Hongxu Yin
Lei Mao
Pavlo Molchanov
Jose M. Alvarez
58
1
0
05 Feb 2025
Brain-inspired sparse training enables Transformers and LLMs to perform as fully connected
Yingtao Zhang
Jialin Zhao
Wenjing Wu
Ziheng Liao
Umberto Michieli
C. Cannistraci
58
0
0
31 Jan 2025
Pushing the Limits of Sparsity: A Bag of Tricks for Extreme Pruning
Andy Li
A. Durrant
Milan Markovic
Lu Yin
Georgios Leontidis
Tianlong Chen
80
0
0
20 Nov 2024
Dynamic Sparse Training versus Dense Training: The Unexpected Winner in Image Corruption Robustness
Boqian Wu
Q. Xiao
Shunxin Wang
N. Strisciuglio
Mykola Pechenizkiy
M. V. Keulen
Decebal Constantin Mocanu
Elena Mocanu
OOD
3DH
63
0
0
03 Oct 2024
Embracing Unknown Step by Step: Towards Reliable Sparse Training in Real World
Bowen Lei
Dongkuan Xu
Ruqi Zhang
Bani Mallick
UQCV
49
0
0
29 Mar 2024
Always-Sparse Training by Growing Connections with Guided Stochastic Exploration
Mike Heddes
Narayan Srinivasa
T. Givargis
Alexandru Nicolau
91
0
0
12 Jan 2024
Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
Yuxin Zhang
Lirui Zhao
Mingbao Lin
Yunyun Sun
Yiwu Yao
Xingjia Han
Jared Tanner
Shiwei Liu
Rongrong Ji
SyDa
45
40
0
13 Oct 2023
Adaptive Sparsity Level during Training for Efficient Time Series Forecasting with Transformers
Zahra Atashgahi
Mykola Pechenizkiy
Raymond N. J. Veldhuis
Decebal Constantin Mocanu
AI4TS
AI4CE
34
1
0
28 May 2023
Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning and Coding with LLMs
Pranjal Aggarwal
Aman Madaan
Yiming Yang
Mausam
LRM
41
38
0
19 May 2023
Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency
Vithursan Thangarasa
Shreyas Saxena
Abhay Gupta
Sean Lie
41
3
0
21 Mar 2023
A Unified Framework for Soft Threshold Pruning
Yanqing Chen
Zhengyu Ma
Wei Fang
Xiawu Zheng
Zhaofei Yu
Yonghong Tian
88
19
0
25 Feb 2023
SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks
Mahdi Nikdan
Tommaso Pegolotti
Eugenia Iofinova
Eldar Kurtic
Dan Alistarh
26
11
0
09 Feb 2023
Balance is Essence: Accelerating Sparse Training via Adaptive Gradient Correction
Bowen Lei
Dongkuan Xu
Ruqi Zhang
Shuren He
Bani Mallick
42
6
0
09 Jan 2023
Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
Peng Mi
Li Shen
Tianhe Ren
Yiyi Zhou
Xiaoshuai Sun
Rongrong Ji
Dacheng Tao
AAML
43
69
0
11 Oct 2022
Towards Sparsification of Graph Neural Networks
Hongwu Peng
Deniz Gurevin
Shaoyi Huang
Tong Geng
Weiwen Jiang
O. Khan
Caiwen Ding
GNN
30
24
0
11 Sep 2022
Safety and Performance, Why not Both? Bi-Objective Optimized Model Compression toward AI Software Deployment
Jie Zhu
Leye Wang
Xiao Han
35
9
0
11 Aug 2022
Spartan: Differentiable Sparsity via Regularized Transportation
Kai Sheng Tai
Taipeng Tian
Ser-Nam Lim
34
11
0
27 May 2022
Monarch: Expressive Structured Matrices for Efficient and Accurate Training
Tri Dao
Beidi Chen
N. Sohoni
Arjun D Desai
Michael Poli
Jessica Grogan
Alexander Liu
Aniruddh Rao
Atri Rudra
Christopher Ré
32
87
0
01 Apr 2022
Sparsity Winning Twice: Better Robust Generalization from More Efficient Training
Tianlong Chen
Zhenyu Zhang
Pengju Wang
Santosh Balachandra
Haoyu Ma
Zehao Wang
Zhangyang Wang
OOD
AAML
100
47
0
20 Feb 2022
Achieving Personalized Federated Learning with Sparse Local Models
Tiansheng Huang
Shiwei Liu
Li Shen
Fengxiang He
Weiwei Lin
Dacheng Tao
FedML
38
43
0
27 Jan 2022
How Well Do Sparse Imagenet Models Transfer?
Eugenia Iofinova
Alexandra Peste
Mark Kurtz
Dan Alistarh
27
38
0
26 Nov 2021
ScaleCert: Scalable Certified Defense against Adversarial Patches with Sparse Superficial Layers
Husheng Han
Kaidi Xu
Xing Hu
Xiaobing Chen
Ling Liang
Zidong Du
Qi Guo
Yanzhi Wang
Yunji Chen
AAML
13
19
0
27 Oct 2021
Avoiding Forgetting and Allowing Forward Transfer in Continual Learning via Sparse Networks
Ghada Sokar
Decebal Constantin Mocanu
Mykola Pechenizkiy
CLL
35
8
0
11 Oct 2021
Powerpropagation: A sparsity inducing weight reparameterisation
Jonathan Richard Schwarz
Siddhant M. Jayakumar
Razvan Pascanu
P. Latham
Yee Whye Teh
98
54
0
01 Oct 2021
Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity
Shiwei Liu
Tianlong Chen
Zahra Atashgahi
Xiaohan Chen
Ghada Sokar
Elena Mocanu
Mykola Pechenizkiy
Zhangyang Wang
Decebal Constantin Mocanu
OOD
31
49
0
28 Jun 2021
Sparse Training via Boosting Pruning Plasticity with Neuroregeneration
Shiwei Liu
Tianlong Chen
Xiaohan Chen
Zahra Atashgahi
Lu Yin
Huanyu Kou
Li Shen
Mykola Pechenizkiy
Zhangyang Wang
Decebal Constantin Mocanu
45
112
0
19 Jun 2021
Sparse Training Theory for Scalable and Efficient Agents
Decebal Constantin Mocanu
Elena Mocanu
T. Pinto
Selima Curci
Phuong H. Nguyen
M. Gibescu
D. Ernst
Z. Vale
45
17
0
02 Mar 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
264
4,532
0
23 Jan 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,836
0
17 Sep 2019
A Brain-inspired Algorithm for Training Highly Sparse Neural Networks
Zahra Atashgahi
Joost Pieterse
Shiwei Liu
Decebal Constantin Mocanu
Raymond N. J. Veldhuis
Mykola Pechenizkiy
35
15
0
17 Mar 2019