1907.11692
Cited By
RoBERTa: A Robustly Optimized BERT Pretraining Approach
26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"
50 / 10,677 papers shown
Title
On Dimensional Linguistic Properties of the Word Embedding Space
Maosen Li
Siheng Chen
Ya Zhang
Florian Metze
38
3
0
05 Oct 2019
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
284
7,574
0
02 Oct 2019
SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum
Jianyu Wang
Vinayak Tantia
Nicolas Ballas
Michael G. Rabbat
91
201
0
01 Oct 2019
MMM: Multi-stage Multi-task Learning for Multi-choice Reading Comprehension
Di Jin
Shuyang Gao
Jiun-Yu Kao
Tagyoung Chung
Dilek Z. Hakkani-Tür
71
69
0
01 Oct 2019
TMLab: Generative Enhanced Model (GEM) for adversarial attacks
P. Niewinski
M. Pszona
M. Janicka
VLM
GAN
63
14
0
01 Oct 2019
A Simple and Effective Model for Answering Multi-span Questions
Elad Segal
Avia Efrat
Mor Shoham
Amir Globerson
Jonathan Berant
KELM
89
30
0
29 Sep 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
387
6,475
0
26 Sep 2019
Aspect and Opinion Term Extraction for Hotel Reviews using Transfer Learning and Auxiliary Labels
Yosef Ardhito Winatmoko
Ali Akbar Septiandri
Arie Pratama Sutiono
11
4
0
26 Sep 2019
Mixed Dimension Embeddings with Application to Memory-Efficient Recommendation Systems
Antonio A. Ginart
Maxim Naumov
Dheevatsa Mudigere
Jiyan Yang
James Zou
100
101
0
25 Sep 2019
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Chen Zhu
Yu Cheng
Zhe Gan
S. Sun
Tom Goldstein
Jingjing Liu
AAML
294
443
0
25 Sep 2019
UNITER: UNiversal Image-TExt Representation Learning
Yen-Chun Chen
Linjie Li
Licheng Yu
Ahmed El Kholy
Faisal Ahmed
Zhe Gan
Yu Cheng
Jingjing Liu
VLM
OT
129
448
0
25 Sep 2019
Reducing Transformer Depth on Demand with Structured Dropout
Angela Fan
Edouard Grave
Armand Joulin
130
597
0
25 Sep 2019
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
Cheolhyoung Lee
Kyunghyun Cho
Wanmo Kang
MoE
288
209
0
25 Sep 2019
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
359
947
0
24 Sep 2019
Technical report on Conversational Question Answering
Yingnan Ju
Fubang Zhao
Shijie Chen
Bowen Zheng
Xuefeng Yang
Yunfeng Liu
67
49
0
24 Sep 2019
Portuguese Named Entity Recognition using BERT-CRF
Fábio Souza
Rodrigo Nogueira
R. Lotufo
73
257
0
23 Sep 2019
Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Embeddings
Gregor Wiedemann
Steffen Remus
Avi Chawla
Chris Biemann
106
176
0
23 Sep 2019
TinyBERT: Distilling BERT for Natural Language Understanding
Xiaoqi Jiao
Yichun Yin
Lifeng Shang
Xin Jiang
Xiao Chen
Linlin Li
F. Wang
Qun Liu
VLM
121
1,881
0
23 Sep 2019
Graph Convolutions over Constituent Trees for Syntax-Aware Semantic Role Labeling
Diego Marcheggiani
Ivan Titov
GNN
49
39
0
21 Sep 2019
AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models
Eric Wallace
Jens Tuyls
Junlin Wang
Sanjay Subramanian
Matt Gardner
Sameer Singh
MILM
82
138
0
19 Sep 2019
How Additional Knowledge can Improve Natural Language Commonsense Question Answering?
Arindam Mitra
Pratyay Banerjee
Kuntal Kumar Pal
Swaroop Mishra
Chitta Baral
KELM
84
31
0
19 Sep 2019
Language models and Automated Essay Scoring
Pedro Uría Rodríguez
Amir Jafari
C. Ormerod
67
86
0
18 Sep 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
358
1,922
0
17 Sep 2019
Span-based Joint Entity and Relation Extraction with Transformer Pre-training
Markus Eberts
A. Ulges
LRM
ViT
238
386
0
17 Sep 2019
K-BERT: Enabling Language Representation with Knowledge Graph
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Qi Ju
Haotang Deng
Ping Wang
313
796
0
17 Sep 2019
Frustratingly Easy Natural Question Answering
Lin Pan
Rishav Chakravarti
Anthony Ferritto
Michael R. Glass
A. Gliozzo
Salim Roukos
Radu Florian
Avirup Sil
59
14
0
11 Sep 2019
An Evalutation of Programming Language Models' performance on Software Defect Detection
Kailun Wang
ELM
31
0
0
10 Sep 2019
Span Selection Pre-training for Question Answering
Michael R. Glass
A. Gliozzo
Rishav Chakravarti
Anthony Ferritto
Lin Pan
G P Shrivatsa Bhargav
Dinesh Garg
Avirup Sil
RALM
94
73
0
09 Sep 2019
Pretrained Language Models for Sequential Sentence Classification
Arman Cohan
Iz Beltagy
Daniel King
Bhavana Dalvi
Daniel S. Weld
158
130
0
09 Sep 2019
Graph-Based Reasoning over Heterogeneous External Knowledge for Commonsense Question Answering
Shangwen Lv
Daya Guo
Jingjing Xu
Duyu Tang
Nan Duan
Ming Gong
Linjun Shou
Daxin Jiang
Guihong Cao
Songlin Hu
RALM
59
206
0
09 Sep 2019
Reasoning Over Semantic-Level Graph for Fact Checking
Wanjun Zhong
Jingjing Xu
Duyu Tang
Zenan Xu
Nan Duan
M. Zhou
Jiahai Wang
Jian Yin
HILM
GNN
267
171
0
09 Sep 2019
Transfer Learning Robustness in Multi-Class Categorization by Fine-Tuning Pre-Trained Contextualized Language Models
Xinyi Liu
A. Wangperawong
25
3
0
08 Sep 2019
Pretrained AI Models: Performativity, Mobility, and Change
Lav Varshney
N. Keskar
R. Socher
68
20
0
07 Sep 2019
"Going on a vacation" takes longer than "Going for a walk": A Study of Temporal Commonsense Understanding
Ben Zhou
Daniel Khashabi
Qiang Ning
Dan Roth
AIMat
105
200
0
06 Sep 2019
Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity
Anne Lauscher
Ivan Vulić
Edoardo Ponti
Anna Korhonen
Goran Glavaš
SSL
85
58
0
05 Sep 2019
Semantics-aware BERT for Language Understanding
Zhuosheng Zhang
Yuwei Wu
Zhao Hai
Z. Li
Shuailiang Zhang
Xi Zhou
Xiang Zhou
55
370
0
05 Sep 2019
TabFact: A Large-scale Dataset for Table-based Fact Verification
Wenhu Chen
Hongmin Wang
Jianshu Chen
Yunkai Zhang
Hong Wang
Shiyang Li
Xiyou Zhou
William Yang Wang
LMTD
125
514
0
05 Sep 2019
KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning
Bill Yuchen Lin
Xinyue Chen
Jamin Chen
Xiang Ren
89
464
0
04 Sep 2019
From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project
Peter Clark
Oren Etzioni
Daniel Khashabi
Tushar Khot
Bhavana Dalvi
...
Niket Tandon
Sumithra Bhakthavatsalam
Dirk Groeneveld
Michal Guerquin
Michael Schmitz
ELM
89
99
0
04 Sep 2019
How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings
Kawin Ethayarajh
98
878
0
02 Sep 2019
QuASE: Question-Answer Driven Sentence Encoding
Hangfeng He
Qiang Ning
Dan Roth
40
1
0
01 Sep 2019
NEZHA: Neural Contextualized Representation for Chinese Language Understanding
Junqiu Wei
Xiaozhe Ren
Xiaoguang Li
Wenyong Huang
Yi-Lun Liao
Yasheng Wang
Jianghao Lin
Xin Jiang
Xiao Chen
Qun Liu
79
116
0
31 Aug 2019
Evaluation Benchmarks and Learning Criteria for Discourse-Aware Sentence Representations
Mingda Chen
Zewei Chu
Kevin Gimpel
78
46
0
31 Aug 2019
EntEval: A Holistic Evaluation Benchmark for Entity Representations
Mingda Chen
Zewei Chu
Yang Chen
K. Stratos
Kevin Gimpel
51
12
0
31 Aug 2019
A Morpho-Syntactically Informed LSTM-CRF Model for Named Entity Recognition
L. Simeonova
K. Simov
P. Osenova
Preslav Nakov
33
8
0
27 Aug 2019
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
1.3K
12,343
0
27 Aug 2019
Patient Knowledge Distillation for BERT Model Compression
S. Sun
Yu Cheng
Zhe Gan
Jingjing Liu
151
843
0
25 Aug 2019
BERT for Coreference Resolution: Baselines and Analysis
Mandar Joshi
Omer Levy
Daniel S. Weld
Luke Zettlemoyer
102
322
0
24 Aug 2019
Well-Read Students Learn Better: On the Importance of Pre-training Compact Models
Iulia Turc
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
71
225
0
23 Aug 2019
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su
Xizhou Zhu
Yue Cao
Bin Li
Lewei Lu
Furu Wei
Jifeng Dai
VLM
MLLM
SSL
195
1,671
0
22 Aug 2019