2010.05609
Load What You Need: Smaller Versions of Multilingual BERT
12 October 2020
Amine Abdaoui
Camille Pradel
Grégoire Sigel
Papers citing "Load What You Need: Smaller Versions of Multilingual BERT"
12 / 12 papers shown
Title
On Multilingual Encoder Language Model Compression for Low-Resource Languages
Daniil Gurgurov
Michal Gregor
Josef van Genabith
Simon Ostermann
77
0
0
22 May 2025
DeFINE: DEep Factorized INput Token Embeddings for Neural Sequence Modeling
Sachin Mehta
Rik Koncel-Kedziorski
Mohammad Rastegari
Hannaneh Hajishirzi
AI4TS
55
23
0
27 Nov 2019
Unsupervised Cross-lingual Representation Learning at Scale
Alexis Conneau
Kartikay Khandelwal
Naman Goyal
Vishrav Chaudhary
Guillaume Wenzek
Francisco Guzmán
Edouard Grave
Myle Ott
Luke Zettlemoyer
Veselin Stoyanov
126
6,454
0
05 Nov 2019
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
92
7,386
0
02 Oct 2019
Extremely Small BERT Models from Mixed-Vocabulary Training
Sanqiang Zhao
Raghav Gupta
Yang Song
Denny Zhou
VLM
33
53
0
25 Sep 2019
Small and Practical BERT Models for Sequence Labeling
Henry Tsai
Jason Riesa
Melvin Johnson
N. Arivazhagan
Xin Li
Amelia Archer
VLM
22
121
0
31 Aug 2019
Patient Knowledge Distillation for BERT Model Compression
S. Sun
Yu Cheng
Zhe Gan
Jingjing Liu
96
833
0
25 Aug 2019
Distilling Task-Specific Knowledge from BERT into Simple Neural Networks
Raphael Tang
Yao Lu
Linqing Liu
Lili Mou
Olga Vechtomova
Jimmy J. Lin
47
419
0
28 Mar 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
815
93,936
0
11 Oct 2018
XNLI: Evaluating Cross-lingual Sentence Representations
Alexis Conneau
Guillaume Lample
Ruty Rinott
Adina Williams
Samuel R. Bowman
Holger Schwenk
Veselin Stoyanov
ELM
44
1,366
0
13 Sep 2018
A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference
Adina Williams
Nikita Nangia
Samuel R. Bowman
344
4,444
0
18 Apr 2017
Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton
Oriol Vinyals
J. Dean
FedML
153
19,448
0
09 Mar 2015