FastFormers: Highly Efficient Transformer Models for Natural Language Understanding

Young Jin Kim, Hany Awadalla
26 October 2020 · AI4CE

Papers citing "FastFormers: Highly Efficient Transformer Models for Natural Language Understanding"

14 / 14 papers shown

On Importance of Pruning and Distillation for Efficient Low Resource NLP
Aishwarya Mirashi, Purva Lingayat, Srushti Sonavane, Tejas Padhiyar, Raviraj Joshi, Geetanjali Kale
21 Sep 2024

The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov, Kushal Tirumala, Hassan Shapourian, Paolo Glorioso, Daniel A. Roberts
26 Mar 2024

Confidence Preservation Property in Knowledge Distillation Abstractions
Dmitry Vengertsev, Elena Sherman
21 Jan 2024

AccelTran: A Sparsity-Aware Accelerator for Dynamic Inference with Transformers
Shikhar Tuli, N. Jha
28 Feb 2023

HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers
Chen Liang, Haoming Jiang, Zheng Li, Xianfeng Tang, Bin Yin, Tuo Zhao
19 Feb 2023 · VLM

SAMP: A Model Inference Toolkit of Post-Training Quantization for Text Processing via Self-Adaptive Mixed-Precision
Rong Tian, Zijing Zhao, Weijie Liu, Haoyan Liu, Weiquan Mao, Zhe Zhao, Kimmo Yan
19 Sep 2022 · MQ

Building an Efficiency Pipeline: Commutativity and Cumulativeness of Efficiency Operators for Transformers
Ji Xin, Raphael Tang, Zhiying Jiang, Yaoliang Yu, Jimmy J. Lin
31 Jul 2022

A Survey on Model Compression and Acceleration for Pretrained Language Models
Canwen Xu, Julian McAuley
15 Feb 2022

Prune Once for All: Sparse Pre-Trained Language Models
Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
10 Nov 2021 · VLM

KroneckerBERT: Learning Kronecker Decomposition for Pre-trained Language Models via Knowledge Distillation
Marzieh S. Tahaei, Ella Charlaix, V. Nia, A. Ghodsi, Mehdi Rezagholizadeh
13 Sep 2021

Block Pruning For Faster Transformers
François Lagunas, Ella Charlaix, Victor Sanh, Alexander M. Rush
10 Sep 2021 · VLM

FNet: Mixing Tokens with Fourier Transforms
James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon
09 May 2021

Compression of Deep Learning Models for Text: A Survey
Manish Gupta, Puneet Agrawal
12 Aug 2020 · VLM, MedIm, AI4CE

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Z. Yao, A. Gholami, Michael W. Mahoney, Kurt Keutzer
12 Sep 2019 · MQ