Distilling Linguistic Context for Language Model Compression
Geondo Park, Gyeongman Kim, Eunho Yang
17 September 2021 · arXiv:2109.08359
Papers citing "Distilling Linguistic Context for Language Model Compression" (13 of 13 shown)
Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models
Xiao Cui, Mo Zhu, Yulei Qin, Liang Xie, Wengang Zhou, Haoyang Li
19 Dec 2024

DynaBERT: Dynamic BERT with Adaptive Width and Depth
Lu Hou, Zhiqi Huang, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu
08 Apr 2020

MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
Zhiqing Sun, Hongkun Yu, Xiaodan Song, Renjie Liu, Yiming Yang, Denny Zhou
06 Apr 2020

FastBERT: a Self-distilling BERT with Adaptive Inference Time
Weijie Liu, Peng Zhou, Zhe Zhao, Zhiruo Wang, Haotang Deng, Qi Ju
05 Apr 2020

MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, Ming Zhou
25 Feb 2020

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
02 Oct 2019

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut
26 Sep 2019

TinyBERT: Distilling BERT for Natural Language Understanding
Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, F. Wang, Qun Liu
23 Sep 2019

What do you learn from context? Probing for sentence structure in contextualized word representations
Ian Tenney, Patrick Xia, Berlin Chen, Alex Jinpeng Wang, Adam Poliak, ..., Najoung Kim, Benjamin Van Durme, Samuel R. Bowman, Dipanjan Das, Ellie Pavlick
15 May 2019

BERT Rediscovers the Classical NLP Pipeline
Ian Tenney, Dipanjan Das, Ellie Pavlick
15 May 2019

Deep contextualized word representations
Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer
15 Feb 2018

SQuAD: 100,000+ Questions for Machine Comprehension of Text
Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, Percy Liang
16 Jun 2016

Distributed Representations of Words and Phrases and their Compositionality
Tomas Mikolov, Ilya Sutskever, Kai Chen, G. Corrado, J. Dean
16 Oct 2013