2112.07327
Cited By
Model Uncertainty-Aware Knowledge Amalgamation for Pre-Trained Language Models
14 December 2021
Lei Li
Yankai Lin
Xuancheng Ren
Guangxiang Zhao
Peng Li
Jie Zhou
Xu Sun
MoMe
Papers citing
"Model Uncertainty-Aware Knowledge Amalgamation for Pre-Trained Language Models"
27 / 27 papers shown
Title
Dynamic Knowledge Distillation for Pre-trained Language Models
Lei Li
Yankai Lin
Shuhuai Ren
Peng Li
Jie Zhou
Xu Sun
70
49
0
23 Sep 2021
One Teacher is Enough? Pre-trained Language Model Distillation from Multiple Teachers
Chuhan Wu
Fangzhao Wu
Yongfeng Huang
48
65
0
02 Jun 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
302
384
0
28 Feb 2021
Towards Debiasing Sentence Representations
Paul Pu Liang
Irene Li
Emily Zheng
Y. Lim
Ruslan Salakhutdinov
Louis-Philippe Morency
70
238
0
16 Jul 2020
Contextualizing Hate Speech Classifiers with Post-hoc Explanation
Brendan Kennedy
Xisen Jin
Aida Mostafazadeh Davani
Morteza Dehghani
Xiang Ren
80
141
0
05 May 2020
Revisiting Pre-Trained Models for Chinese Natural Language Processing
Yiming Cui
Wanxiang Che
Ting Liu
Bing Qin
Shijin Wang
Guoping Hu
77
697
0
29 Apr 2020
Calibration of Pre-trained Transformers
Shrey Desai
Greg Durrett
UQLM
280
300
0
17 Mar 2020
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
Wenhui Wang
Furu Wei
Li Dong
Hangbo Bao
Nan Yang
Ming Zhou
VLM
122
1,260
0
25 Feb 2020
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
373
20,053
0
23 Oct 2019
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
200
7,481
0
02 Oct 2019
TinyBERT: Distilling BERT for Natural Language Understanding
Xiaoqi Jiao
Yichun Yin
Lifeng Shang
Xin Jiang
Xiao Chen
Linlin Li
F. Wang
Qun Liu
VLM
92
1,857
0
23 Sep 2019
Patient Knowledge Distillation for BERT Model Compression
S. Sun
Yu Cheng
Zhe Gan
Jingjing Liu
121
836
0
25 Aug 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
522
24,351
0
26 Jul 2019
Green AI
Roy Schwartz
Jesse Dodge
Noah A. Smith
Oren Etzioni
102
1,134
0
22 Jul 2019
Knowledge Amalgamation from Heterogeneous Networks by Common Feature Learning
Sihui Luo
Xinchao Wang
Gongfan Fang
Yao Hu
Dapeng Tao
Xiuming Zhang
MoMe
35
47
0
24 Jun 2019
Measuring Bias in Contextualized Word Representations
Keita Kurita
Nidhi Vyas
Ayush Pareek
A. Black
Yulia Tsvetkov
98
449
0
18 Jun 2019
Counterfactual Data Augmentation for Mitigating Gender Stereotypes in Languages with Rich Morphology
Ran Zmigrod
Sabrina J. Mielke
Hanna M. Wallach
Ryan Cotterell
61
281
0
11 Jun 2019
Energy and Policy Considerations for Deep Learning in NLP
Emma Strubell
Ananya Ganesh
Andrew McCallum
62
2,647
0
05 Jun 2019
Unifying Heterogeneous Classifiers with Distillation
J. Vongkulbhisal
Phongtharin Vinayavekhin
M. V. Scarzanella
42
51
0
12 Apr 2019
Amalgamating Knowledge towards Comprehensive Classification
Chengchao Shen
L. Câlmâc
Mingli Song
Li Sun
Xiuming Zhang
MoMe
55
89
0
07 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.5K
94,511
0
11 Oct 2018
On Calibration of Modern Neural Networks
Chuan Guo
Geoff Pleiss
Yu Sun
Kilian Q. Weinberger
UQCV
262
5,812
0
14 Jun 2017
A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks
Dan Hendrycks
Kevin Gimpel
UQCV
139
3,441
0
07 Oct 2016
Character-level Convolutional Networks for Text Classification
Xiang Zhang
Jiaqi Zhao
Yann LeCun
248
6,101
0
04 Sep 2015
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
Y. Gal
Zoubin Ghahramani
UQCV
BDL
690
9,290
0
06 Jun 2015
Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton
Oriol Vinyals
J. Dean
FedML
314
19,609
0
09 Mar 2015
FitNets: Hints for Thin Deep Nets
Adriana Romero
Nicolas Ballas
Samira Ebrahimi Kahou
Antoine Chassang
C. Gatta
Yoshua Bengio
FedML
280
3,870
0
19 Dec 2014