RoBERTa: A Robustly Optimized BERT Pretraining Approach


26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

34 / 9,183 papers shown
Title
AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models
Eric Wallace
Jens Tuyls
Junlin Wang
Sanjay Subramanian
Matt Gardner
Sameer Singh
MILM
28
137
0
19 Sep 2019
How Additional Knowledge can Improve Natural Language Commonsense Question Answering?
Arindam Mitra
Pratyay Banerjee
Kuntal Kumar Pal
Swaroop Mishra
Chitta Baral
KELM
35
31
0
19 Sep 2019
Language models and Automated Essay Scoring
Pedro Uría Rodríguez
Amir Jafari
C. Ormerod
30
82
0
18 Sep 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
249
1,838
0
17 Sep 2019
Span-based Joint Entity and Relation Extraction with Transformer Pre-training
Markus Eberts
A. Ulges
LRM
ViT
164
382
0
17 Sep 2019
K-BERT: Enabling Language Representation with Knowledge Graph
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Qi Ju
Haotang Deng
Ping Wang
231
779
0
17 Sep 2019
Frustratingly Easy Natural Question Answering
Lin Pan
Rishav Chakravarti
Anthony Ferritto
Michael R. Glass
A. Gliozzo
Salim Roukos
Radu Florian
Avirup Sil
29
14
0
11 Sep 2019
Span Selection Pre-training for Question Answering
Michael R. Glass
A. Gliozzo
Rishav Chakravarti
Anthony Ferritto
Lin Pan
G P Shrivatsa Bhargav
Dinesh Garg
Avirup Sil
RALM
43
70
0
09 Sep 2019
Pretrained Language Models for Sequential Sentence Classification
Arman Cohan
Iz Beltagy
Daniel King
Bhavana Dalvi
Daniel S. Weld
32
128
0
09 Sep 2019
Graph-Based Reasoning over Heterogeneous External Knowledge for Commonsense Question Answering
Shangwen Lv
Daya Guo
Jingjing Xu
Duyu Tang
Nan Duan
Ming Gong
Linjun Shou
Daxin Jiang
Guihong Cao
Songlin Hu
RALM
25
202
0
09 Sep 2019
Reasoning Over Semantic-Level Graph for Fact Checking
Wanjun Zhong
Jingjing Xu
Duyu Tang
Zenan Xu
Nan Duan
M. Zhou
Jiahai Wang
Jian Yin
HILM
GNN
185
166
0
09 Sep 2019
Semantics-aware BERT for Language Understanding
Zhuosheng Zhang
Yuwei Wu
Zhao Hai
Z. Li
Shuailiang Zhang
Xi Zhou
Xiang Zhou
26
365
0
05 Sep 2019
KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning
Bill Yuchen Lin
Xinyue Chen
Jamin Chen
Xiang Ren
31
460
0
04 Sep 2019
From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project
Peter Clark
Oren Etzioni
Daniel Khashabi
Tushar Khot
Bhavana Dalvi
...
Niket Tandon
Sumithra Bhakthavatsalam
Dirk Groeneveld
Michal Guerquin
Michael Schmitz
ELM
31
99
0
04 Sep 2019
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
123
11,840
0
27 Aug 2019
Patient Knowledge Distillation for BERT Model Compression
S. Sun
Yu Cheng
Zhe Gan
Jingjing Liu
83
832
0
25 Aug 2019
BERT for Coreference Resolution: Baselines and Analysis
Mandar Joshi
Omer Levy
Daniel S. Weld
Luke Zettlemoyer
28
320
0
24 Aug 2019
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su
Xizhou Zhu
Yue Cao
Bin Li
Lewei Lu
Furu Wei
Jifeng Dai
VLM
MLLM
SSL
87
1,652
0
22 Aug 2019
Align, Mask and Select: A Simple Method for Incorporating Commonsense Knowledge into Language Representation Models
Zhiquan Ye
Qian Chen
Wen Wang
Zhenhua Ling
27
68
0
19 Aug 2019
Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training
Gen Li
Nan Duan
Yuejian Fang
Ming Gong
Daxin Jiang
Ming Zhou
SSL
VLM
MLLM
91
895
0
16 Aug 2019
Reasoning Over Paragraph Effects in Situations
Kevin Lin
Oyvind Tafjord
Peter Clark
Matt Gardner
36
115
0
16 Aug 2019
SenseBERT: Driving Some Sense into BERT
Yoav Levine
Barak Lenz
Or Dagan
Ori Ram
Dan Padnos
Or Sharir
Shai Shalev-Shwartz
Amnon Shashua
Y. Shoham
SSL
27
186
0
15 Aug 2019
A Multi-Turn Emotionally Engaging Dialog Model
Yubo Xie
Ekaterina Svikhnushina
P. Pu
24
15
0
15 Aug 2019
StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
Wei Wang
Bin Bi
Ming Yan
Chen Henry Wu
Zuyi Bao
Jiangnan Xia
Liwei Peng
Luo Si
31
260
0
13 Aug 2019
On Identifiability in Transformers
Gino Brunner
Yang Liu
Damian Pascual
Oliver Richter
Massimiliano Ciaramita
Roger Wattenhofer
ViT
30
187
0
12 Aug 2019
Semi-supervised Thai Sentence Segmentation Using Local and Distant Word Representations
Chanatip Saetia
Ekapol Chuangsuwanich
Tawunrat Chalothorn
P. Vateekul
32
5
0
04 Aug 2019
Leveraging Pre-trained Checkpoints for Sequence Generation Tasks
S. Rothe
Shashi Narayan
Aliaksei Severyn
SILM
76
433
0
29 Jul 2019
SpanBERT: Improving Pre-training by Representing and Predicting Spans
Mandar Joshi
Danqi Chen
Yinhan Liu
Daniel S. Weld
Luke Zettlemoyer
Omer Levy
85
1,947
0
24 Jul 2019
BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition
Shaoshi Ling
Julian Salazar
Yuzong Liu
Katrin Kirchhoff
SSL
33
28
0
30 Jun 2019
Taming Pretrained Transformers for Extreme Multi-label Text Classification
Wei-Cheng Chang
Hsiang-Fu Yu
Kai Zhong
Yiming Yang
Inderjit Dhillon
27
20
0
07 May 2019
Recent Advances in Natural Language Inference: A Survey of Benchmarks, Resources, and Approaches
Shane Storks
Qiaozi Gao
J. Chai
28
128
0
02 Apr 2019
Sentence transition matrix: An efficient approach that preserves sentence semantics
Myeongjun Jang
Pilsung Kang
19
2
0
16 Jan 2019
On the Benefit of Width for Neural Networks: Disappearance of Bad Basins
Dawei Li
Tian Ding
Ruoyu Sun
42
38
0
28 Dec 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
314
7,020
0
20 Apr 2018