1909.11942
Cited By
v6 (latest)
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
Github (3271★)
Papers citing
"ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"
50 / 2,935 papers shown
Title
Investigating Typed Syntactic Dependencies for Targeted Sentiment Classification Using Graph Attention Neural Network
Xuefeng Bai
Pengbo Liu
Yue Zhang
GNN
50
3
0
22 Feb 2020
Training Question Answering Models From Synthetic Data
Raul Puri
Ryan Spring
M. Patwary
Mohammad Shoeybi
Bryan Catanzaro
ELM
81
160
0
22 Feb 2020
CoLES: Contrastive Learning for Event Sequences with Self-Supervision
Dmitrii Babaev
Ivan Kireev
Nikita Ovsov
Maria Ivanova
Gleb Gusev
Ivan Nazarov
Alexander Tuzhilin
SSL
AI4TS
58
27
0
19 Feb 2020
Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning
Zixin Wen
SSL
68
3
0
17 Feb 2020
SBERT-WK: A Sentence Embedding Method by Dissecting BERT-based Word Models
Bin Wang
C.-C. Jay Kuo
50
156
0
16 Feb 2020
Towards Detection of Subjective Bias using Contextualized Word Embeddings
Tanvi Dadu
Kartikey Pant
R. Mamidi
42
22
0
16 Feb 2020
Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping
Jesse Dodge
Gabriel Ilharco
Roy Schwartz
Ali Farhadi
Hannaneh Hajishirzi
Noah A. Smith
103
598
0
15 Feb 2020
TwinBERT: Distilling Knowledge to Twin-Structured BERT Models for Efficient Retrieval
Wenhao Lu
Jian Jiao
Ruofei Zhang
60
50
0
14 Feb 2020
Transformer on a Diet
Chenguang Wang
Zihao Ye
Aston Zhang
Zheng Zhang
Alex Smola
80
8
0
14 Feb 2020
HULK: An Energy Efficiency Benchmark Platform for Responsible Natural Language Processing
Xiyou Zhou
Zhiyu Zoey Chen
Xiaoyong Jin
Wenjie Wang
78
34
0
14 Feb 2020
How Much Knowledge Can You Pack Into the Parameters of a Language Model?
Adam Roberts
Colin Raffel
Noam M. Shazeer
KELM
144
898
0
10 Feb 2020
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Canwen Xu
Wangchunshu Zhou
Tao Ge
Furu Wei
Ming Zhou
348
201
0
07 Feb 2020
Aligning the Pretraining and Finetuning Objectives of Language Models
Nuo Wang Pierse
Jing Lu
AI4CE
35
2
0
05 Feb 2020
Pseudo-Bidirectional Decoding for Local Sequence Transduction
Wangchunshu Zhou
Tao Ge
Ke Xu
58
3
0
31 Jan 2020
Bringing Stories Alive: Generating Interactive Fiction Worlds
Prithviraj Ammanabrolu
W. Cheung
Dan Tu
William Broniec
Mark O. Riedl
85
51
0
28 Jan 2020
Retrospective Reader for Machine Reading Comprehension
Zhuosheng Zhang
Junjie Yang
Hai Zhao
RALM
104
227
0
27 Jan 2020
DUMA: Reading Comprehension with Transposition Thinking
Pengfei Zhu
Hai Zhao
Xiaoguang Li
AI4CE
82
35
0
26 Jan 2020
ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation
Dongling Xiao
Han Zhang
Yukun Li
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
85
127
0
26 Jan 2020
BERT's output layer recognizes all hidden layers? Some Intriguing Phenomena and a simple way to boost BERT
Wei-Tsung Kao
Tsung-Han Wu
Po-Han Chi
Chun-Cheng Hsieh
Hung-yi Lee
SSL
44
5
0
25 Jan 2020
Multi-task self-supervised learning for Robust Speech Recognition
Mirco Ravanelli
Jianyuan Zhong
Santiago Pascual
P. Swietojanski
João Monteiro
J. Trmal
Yoshua Bengio
SSL
288
290
0
25 Jan 2020
PoWER-BERT: Accelerating BERT Inference via Progressive Word-vector Elimination
Saurabh Goyal
Anamitra R. Choudhury
Saurabh ManishRaje
Venkatesan T. Chakaravarthy
Yogish Sabharwal
Ashish Verma
96
18
0
24 Jan 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
668
4,935
0
23 Jan 2020
Normalization of Input-output Shared Embeddings in Text Generation Models
Jinyang Liu
Yujia Zhai
Zizhong Chen
34
0
0
22 Jan 2020
A multimodal deep learning approach for named entity recognition from social media
M. Asgari-Chenaghlu
M. Feizi-Derakhshi
Leili Farzinvash
M. Balafar
C. Motamed
60
29
0
19 Jan 2020
RobBERT: a Dutch RoBERTa-based Language Model
Pieter Delobelle
Thomas Winters
Bettina Berendt
86
240
0
17 Jan 2020
Graph-Bert: Only Attention is Needed for Learning Graph Representations
Jiawei Zhang
Haopeng Zhang
Congying Xia
Li Sun
107
306
0
15 Jan 2020
A BERT based Sentiment Analysis and Key Entity Detection Approach for Online Financial Texts
Lin Zhao
Lin Li
Xinhao Zheng
91
67
0
14 Jan 2020
CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese
Liang Xu
Yu Tong
Qianqian Dong
Yixuan Liao
Cong Yu
Yin Tian
Weitang Liu
Lu Li
Caiquan Liu
Xuanwei Zhang
96
54
0
13 Jan 2020
AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search
Daoyuan Chen
Yaliang Li
Minghui Qiu
Zhen Wang
Bofang Li
Bolin Ding
Hongbo Deng
Jun Huang
Wei Lin
Jingren Zhou
MQ
97
104
0
13 Jan 2020
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training
Weizhen Qi
Yu Yan
Yeyun Gong
Dayiheng Liu
Nan Duan
Jiusheng Chen
Ruofei Zhang
Ming Zhou
AI4TS
140
450
0
13 Jan 2020
Assessment Modeling: Fundamental Pre-training Tasks for Interactive Educational Systems
Youngduck Choi
Youngnam Lee
Junghyun Cho
Jineon Baek
Dongmin Shin
...
Seewoo Lee
Youngmin Cha
Chan Bae
Byungsoo Kim
Jaewe Heo
AI4Ed
31
14
0
01 Jan 2020
Clinical XLNet: Modeling Sequential Clinical Notes and Predicting Prolonged Mechanical Ventilation
Kexin Huang
Abhishek Singh
Sitong Chen
E. Moseley
Chih-ying Deng
Naomi George
C. Lindvall
119
59
0
27 Dec 2019
Is Attention All What You Need? -- An Empirical Investigation on Convolution-Based Active Memory and Self-Attention
Thomas D. Dowdell
Hongyu Zhang
29
4
0
27 Dec 2019
BERTje: A Dutch BERT Model
Wietse de Vries
Andreas van Cranenburgh
Arianna Bisazza
Tommaso Caselli
Gertjan van Noord
Malvina Nissim
VLM
SSeg
95
295
0
19 Dec 2019
WaLDORf: Wasteless Language-model Distillation On Reading-comprehension
J. Tian
A. Kreuzer
Pai-Hung Chen
Hans-Martin Will
VLM
60
3
0
13 Dec 2019
FlauBERT: Unsupervised Language Model Pre-training for French
Hang Le
Loïc Vial
Jibril Frej
Vincent Segonne
Maximin Coavoux
Benjamin Lecouteux
A. Allauzen
Benoît Crabbé
Laurent Besacier
D. Schwab
AI4CE
111
401
0
11 Dec 2019
MITAS: A Compressed Time-Domain Audio Separation Network with Parameter Sharing
Chao-I Tuan
Yuan-Kuei Wu
Hung-yi Lee
Yu Tsao
28
2
0
09 Dec 2019
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
Vishvak Murahari
Dhruv Batra
Devi Parikh
Abhishek Das
VLM
109
117
0
05 Dec 2019
Bimodal Speech Emotion Recognition Using Pre-Trained Language Models
Verena Heusser
Niklas Freymuth
Stefan Constantin
A. Waibel
92
26
0
29 Nov 2019
Low Rank Factorization for Compact Multi-Head Self-Attention
Sneha Mehta
Huzefa Rangwala
Naren Ramakrishnan
42
5
0
26 Nov 2019
Efficient Attention Mechanism for Visual Dialog that can Handle All the Interactions between Multiple Inputs
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
99
7
0
26 Nov 2019
Pre-Training of Deep Bidirectional Protein Sequence Representations with Structural Information
Seonwoo Min
Seunghyun Park
Siwon Kim
Hyun-Soo Choi
Byunghan Lee
Sungroh Yoon
SSL
71
63
0
25 Nov 2019
Global Greedy Dependency Parsing
Z. Li
Zhao Hai
Kevin Parnow
113
31
0
20 Nov 2019
Vision-Language Navigation with Self-Supervised Auxiliary Reasoning Tasks
Fengda Zhu
Yi Zhu
Xiaojun Chang
Xiaodan Liang
LRM
115
243
0
18 Nov 2019
Unsupervised Pre-training for Natural Language Generation: A Literature Review
Yuanxin Liu
Zheng Lin
SSL
AI4CE
38
3
0
13 Nov 2019
ZiMM: a deep learning model for long term and blurry relapses with non-clinical claims data
A. Kabeshova
Yiyang Yu
Bertrand Lukacs
Emmanuel Bacry
Stéphane Gaïffas
VLM
MedIm
24
2
0
13 Nov 2019
KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation
Xiaozhi Wang
Tianyu Gao
Zhaocheng Zhu
Zhengyan Zhang
Zhiyuan Liu
Juan-Zi Li
Jian Tang
170
675
0
13 Nov 2019
CamemBERT: a Tasty French Language Model
Louis Martin
Benjamin Muller
Pedro Ortiz Suarez
Yoann Dupont
Laurent Romary
Eric Villemonte de la Clergerie
Djamé Seddah
Benoît Sagot
143
981
0
10 Nov 2019
ConveRT: Efficient and Accurate Conversational Representations from Transformers
Matthew Henderson
I. Casanueva
Nikola Mrkšić
Pei-hao Su
Tsung-Hsien
Ivan Vulić
109
200
0
09 Nov 2019
Hierarchical Graph Network for Multi-hop Question Answering
Yuwei Fang
S. Sun
Zhe Gan
R. Pillai
Shuohang Wang
Jingjing Liu
136
173
0
09 Nov 2019