ResearchTrend.AI

Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm (arXiv:2110.08190)
15 October 2021
Shaoyi Huang, Dongkuan Xu, Ian En-Hsu Yen, Yijue Wang, Sung-En Chang, Bingbing Li, Shiyang Chen, Mimi Xie, Sanguthevar Rajasekaran, Hang Liu, Caiwen Ding
CLL | VLM

Papers citing "Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm"

11 of 11 citing papers shown
Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution Networks
Xiaoru Xie, Hongwu Peng, Amit Hasan, Shaoyi Huang, Jiahui Zhao, Haowen Fang, Wei Zhang, Tong Geng, O. Khan, Caiwen Ding
GNN | 22 Aug 2023
AutoReP: Automatic ReLU Replacement for Fast Private Network Inference
Hongwu Peng, Shaoyi Huang, Tong Zhou, Yukui Luo, Chenghong Wang, ..., Tony Geng, Kaleel Mahmood, Wujie Wen, Xiaolin Xu, Caiwen Ding
OffRL | 20 Aug 2023
Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off
Shaoyi Huang, Bowen Lei, Dongkuan Xu, Hongwu Peng, Yue Sun, Mimi Xie, Caiwen Ding
30 Nov 2022
Efficient Traffic State Forecasting using Spatio-Temporal Network Dependencies: A Sparse Graph Neural Network Approach
Bin Lei, Shaoyi Huang, Caiwen Ding, Monika Filipovska
GNN | AI4TS | 06 Nov 2022
PROD: Progressive Distillation for Dense Retrieval
Zhenghao Lin, Yeyun Gong, Xiao Liu, Hang Zhang, Chen Lin, ..., Jian Jiao, Jing Lu, Daxin Jiang, Rangan Majumder, Nan Duan
27 Sep 2022
Towards Sparsification of Graph Neural Networks
Hongwu Peng, Deniz Gurevin, Shaoyi Huang, Tong Geng, Weiwen Jiang, O. Khan, Caiwen Ding
GNN | 11 Sep 2022
Structured Pruning Learns Compact and Accurate Models
Mengzhou Xia, Zexuan Zhong, Danqi Chen
VLM | 01 Apr 2022
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin
23 Jul 2020
What is the State of Neural Network Pruning?
Davis W. Blalock, Jose Javier Gonzalez Ortiz, Jonathan Frankle, John Guttag
06 Mar 2020
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
07 Feb 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM | 20 Apr 2018