1909.11942
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
Github (3271★)
Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"
35 / 2,935 papers shown
Title
MKD: a Multi-Task Knowledge Distillation Approach for Pretrained Language Models
Linqing Liu
Haiquan Wang
Jimmy J. Lin
R. Socher
Caiming Xiong
65
21
0
09 Nov 2019
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
T. Zhao
135
563
0
08 Nov 2019
Transforming Wikipedia into Augmented Data for Query-Focused Summarization
Haichao Zhu
Li Dong
Furu Wei
Bing Qin
Ting Liu
RALM
61
22
0
08 Nov 2019
Blockwise Self-Attention for Long Document Understanding
J. Qiu
Hao Ma
Omer Levy
Scott Yih
Sinong Wang
Jie Tang
111
254
0
07 Nov 2019
Deepening Hidden Representations from Pre-trained Language Models
Junjie Yang
Hai Zhao
24
10
0
05 Nov 2019
BAS: An Answer Selection Method Using BERT Language Model
Jamshid Mozafari
A. Fatemi
M. Nematbakhsh
45
17
0
04 Nov 2019
CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
Guillaume Wenzek
Marie-Anne Lachaux
Alexis Conneau
Vishrav Chaudhary
Francisco Guzmán
Armand Joulin
Edouard Grave
124
658
0
01 Nov 2019
A neural document language modeling framework for spoken document retrieval
Li-Phen Yen
Zheng-Yu Wu
Kuan-Yu Chen
3DGS
44
0
0
31 Oct 2019
Parameter Sharing Decoder Pair for Auto Composing
Xu Zhao
MoE
19
0
0
31 Oct 2019
Ensembling Strategies for Answering Natural Questions
Anthony Ferritto
Lin Pan
Rishav Chakravarti
Salim Roukos
Radu Florian
J. William Murdock
Avirup Sil
ELM
42
0
0
30 Oct 2019
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
M. Lewis
Yinhan Liu
Naman Goyal
Marjan Ghazvininejad
Abdel-rahman Mohamed
Omer Levy
Veselin Stoyanov
Luke Zettlemoyer
AIMat
VLM
268
10,907
0
29 Oct 2019
What does BERT Learn from Multiple-Choice Reading Comprehension Datasets?
Chenglei Si
Shuohang Wang
Min-Yen Kan
Jing Jiang
88
53
0
28 Oct 2019
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
Andy T. Liu
Shu-Wen Yang
Po-Han Chi
Po-Chun Hsu
Hung-yi Lee
SSL
157
374
0
25 Oct 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
602
20,418
0
23 Oct 2019
Injecting Hierarchy with U-Net Transformers
David Donahue
Vladislav Lialin
Anna Rumshisky
AI4CE
26
1
0
16 Oct 2019
Structured Pruning of a BERT-based Question Answering Model
J. Scott McCarley
Rishav Chakravarti
Avirup Sil
96
53
0
14 Oct 2019
Structured Pruning of Large Language Models
Ziheng Wang
Jeremy Wohlwend
Tao Lei
96
293
0
10 Oct 2019
On the adequacy of untuned warmup for adaptive optimization
Jerry Ma
Denis Yarats
106
70
0
09 Oct 2019
HuggingFace's Transformers: State-of-the-art Natural Language Processing
Thomas Wolf
Lysandre Debut
Victor Sanh
Julien Chaumond
Clement Delangue
...
Teven Le Scao
Sylvain Gugger
Mariama Drame
Quentin Lhoest
Alexander M. Rush
AI4CE
110
1,960
0
09 Oct 2019
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Chen Zhu
Yu Cheng
Zhe Gan
S. Sun
Tom Goldstein
Jingjing Liu
AAML
296
443
0
25 Sep 2019
UNITER: UNiversal Image-TExt Representation Learning
Yen-Chun Chen
Linjie Li
Licheng Yu
Ahmed El Kholy
Faisal Ahmed
Zhe Gan
Yu Cheng
Jingjing Liu
VLM
OT
131
448
0
25 Sep 2019
Portuguese Named Entity Recognition using BERT-CRF
Fábio Souza
Rodrigo Nogueira
R. Lotufo
73
258
0
23 Sep 2019
TinyBERT: Distilling BERT for Natural Language Understanding
Xiaoqi Jiao
Yichun Yin
Lifeng Shang
Xin Jiang
Xiao Chen
Linlin Li
F. Wang
Qun Liu
VLM
126
1,881
0
23 Sep 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
363
1,925
0
17 Sep 2019
On Identifiability in Transformers
Gino Brunner
Yang Liu
Damian Pascual
Oliver Richter
Massimiliano Ciaramita
Roger Wattenhofer
ViT
77
189
0
12 Aug 2019
Semi-supervised Thai Sentence Segmentation Using Local and Distant Word Representations
Chanatip Saetia
Ekapol Chuangsuwanich
Tawunrat Chalothorn
P. Vateekul
74
5
0
04 Aug 2019
DeepCABAC: A Universal Compression Algorithm for Deep Neural Networks
Simon Wiedemann
H. Kirchhoffer
Stefan Matlage
Paul Haase
Arturo Marbán
...
Ahmed Osman
D. Marpe
H. Schwarz
Thomas Wiegand
Wojciech Samek
106
97
0
27 Jul 2019
XLNet: Generalized Autoregressive Pretraining for Language Understanding
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
246
8,462
0
19 Jun 2019
Pre-Training with Whole Word Masking for Chinese BERT
Yiming Cui
Wanxiang Che
Ting Liu
Bing Qin
Ziqing Yang
46
186
0
19 Jun 2019
Survey on Evaluation Methods for Dialogue Systems
Jan Deriu
Álvaro Rodrigo
Arantxa Otegi
Guillermo Echegoyen
S. Rosset
Eneko Agirre
Mark Cieliebak
116
284
0
10 May 2019
An Attentive Survey of Attention Models
S. Chaudhari
Varun Mithal
Gungor Polatkan
R. Ramanath
192
666
0
05 Apr 2019
Recent Advances in Natural Language Inference: A Survey of Benchmarks, Resources, and Approaches
Shane Storks
Qiaozi Gao
J. Chai
98
132
0
02 Apr 2019
Tensorized Embedding Layers for Efficient Model Compression
Oleksii Hrinchuk
Valentin Khrulkov
L. Mirvakhabova
Elena Orlova
Ivan Oseledets
91
73
0
30 Jan 2019
Sentence transition matrix: An efficient approach that preserves sentence semantics
Myeongjun Jang
Pilsung Kang
21
2
0
16 Jan 2019
Impact of Power System Partitioning on the Efficiency of Distributed Multi-Step Optimization
Dongliang Chen
A. Bucchiarone
Zhihan Lv
42
4
0
31 May 2016