ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1907.11692
  4. Cited By
RoBERTa: A Robustly Optimized BERT Pretraining Approach

RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat
ArXivPDFHTML

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 9,299 papers shown
Title
Evaluating Commonsense in Pre-trained Language Models
Evaluating Commonsense in Pre-trained Language Models
Xuhui Zhou
Yue Zhang
Leyang Cui
Dandan Huang
AI4MH
LRM
25
182
0
27 Nov 2019
Pre-Training of Deep Bidirectional Protein Sequence Representations with
  Structural Information
Pre-Training of Deep Bidirectional Protein Sequence Representations with Structural Information
Seonwoo Min
Seunghyun Park
Siwon Kim
Hyun-Soo Choi
Byunghan Lee
Sungroh Yoon
SSL
22
62
0
25 Nov 2019
A Transformer-based approach to Irony and Sarcasm detection
A Transformer-based approach to Irony and Sarcasm detection
Rolandos Alexandros Potamias
Georgios Siolas
A. Stafylopatis
33
207
0
23 Nov 2019
Global Greedy Dependency Parsing
Global Greedy Dependency Parsing
Z. Li
Zhao Hai
Kevin Parnow
36
31
0
20 Nov 2019
The Eighth Dialog System Technology Challenge
The Eighth Dialog System Technology Challenge
Seokhwan Kim
Michel Galley
Chulaka Gunasekara
Sungjin Lee
Adam Atkinson
...
Tim K. Marks
Abhinav Rastogi
Xiaoxue Zang
Srinivas Sunkara
Raghav Gupta
VLM
27
65
0
14 Nov 2019
Sato: Contextual Semantic Type Detection in Tables
Sato: Contextual Semantic Type Detection in Tables
Dan Zhang
Yoshihiko Suhara
Jinfeng Li
Madelon Hulsebos
cCaugatay Demiralp
W. Tan
LMTD
24
15
0
14 Nov 2019
What do you mean, BERT? Assessing BERT as a Distributional Semantics
  Model
What do you mean, BERT? Assessing BERT as a Distributional Semantics Model
Timothee Mickus
Denis Paperno
Mathieu Constant
Kees van Deemter
34
45
0
13 Nov 2019
Adapting and evaluating a deep learning language model for clinical
  why-question answering
Adapting and evaluating a deep learning language model for clinical why-question answering
Andrew Wen
Mohamed Y. Elwazir
Sungrim Moon
Jungwei Fan
LM&MA
24
31
0
13 Nov 2019
Neural Duplicate Question Detection without Labeled Training Data
Neural Duplicate Question Detection without Labeled Training Data
Andreas Rucklé
N. Moosavi
Iryna Gurevych
OOD
AAML
19
11
0
13 Nov 2019
Attending to Entities for Better Text Understanding
Attending to Entities for Better Text Understanding
Pengxiang Cheng
K. Erk
LRM
24
37
0
11 Nov 2019
TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer
  Sentence Selection
TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection
Siddhant Garg
Thuy Vu
Alessandro Moschitti
41
214
0
11 Nov 2019
Improving BERT Fine-tuning with Embedding Normalization
Wenxuan Zhou
Junyi Du
Xiang Ren
21
6
0
10 Nov 2019
Effectiveness of self-supervised pre-training for speech recognition
Effectiveness of self-supervised pre-training for speech recognition
Alexei Baevski
Michael Auli
Abdel-rahman Mohamed
SSL
32
147
0
10 Nov 2019
CamemBERT: a Tasty French Language Model
CamemBERT: a Tasty French Language Model
Louis Martin
Benjamin Muller
Pedro Ortiz Suarez
Yoann Dupont
Laurent Romary
Eric Villemonte de la Clergerie
Djamé Seddah
Benoît Sagot
56
960
0
10 Nov 2019
ConveRT: Efficient and Accurate Conversational Representations from
  Transformers
ConveRT: Efficient and Accurate Conversational Representations from Transformers
Matthew Henderson
I. Casanueva
Nikola Mrkvsić
Pei-hao Su
Tsung-Hsien
Ivan Vulić
47
196
0
09 Nov 2019
E-BERT: Efficient-Yet-Effective Entity Embeddings for BERT
E-BERT: Efficient-Yet-Effective Entity Embeddings for BERT
Nina Poerner
Ulli Waltinger
Hinrich Schütze
52
157
0
09 Nov 2019
Improving Machine Reading Comprehension via Adversarial Training
Improving Machine Reading Comprehension via Adversarial Training
Ziqing Yang
Yiming Cui
Wanxiang Che
Ting Liu
Shijin Wang
Guoping Hu
32
17
0
09 Nov 2019
How Decoding Strategies Affect the Verifiability of Generated Text
How Decoding Strategies Affect the Verifiability of Generated Text
Luca Massarelli
Fabio Petroni
Aleksandra Piktus
Myle Ott
Tim Rocktaschel
Vassilis Plachouras
Fabrizio Silvestri
Sebastian Riedel
33
50
0
09 Nov 2019
Negated and Misprimed Probes for Pretrained Language Models: Birds Can
  Talk, But Cannot Fly
Negated and Misprimed Probes for Pretrained Language Models: Birds Can Talk, But Cannot Fly
Nora Kassner
Hinrich Schütze
28
318
0
08 Nov 2019
What Would Elsa Do? Freezing Layers During Transformer Fine-Tuning
What Would Elsa Do? Freezing Layers During Transformer Fine-Tuning
Jaejun Lee
Raphael Tang
Jimmy J. Lin
34
121
0
08 Nov 2019
Certified Data Removal from Machine Learning Models
Certified Data Removal from Machine Learning Models
Chuan Guo
Tom Goldstein
Awni Y. Hannun
Laurens van der Maaten
MU
57
424
0
08 Nov 2019
The TechQA Dataset
The TechQA Dataset
Vittorio Castelli
Rishav Chakravarti
Saswati Dana
Anthony Ferritto
Radu Florian
...
Andrzej Sakrajda
Avirup Sil
Rosario A. Uceda-Sosa
T. Ward
Rong Zhang
34
45
0
08 Nov 2019
Blockwise Self-Attention for Long Document Understanding
Blockwise Self-Attention for Long Document Understanding
J. Qiu
Hao Ma
Omer Levy
Scott Yih
Sinong Wang
Jie Tang
30
252
0
07 Nov 2019
S2ORC: The Semantic Scholar Open Research Corpus
S2ORC: The Semantic Scholar Open Research Corpus
Kyle Lo
Lucy Lu Wang
Mark Neumann
Rodney Michael Kinney
Daniel S. Weld
OffRL
AI4CE
56
10
0
07 Nov 2019
Infusing Knowledge into the Textual Entailment Task Using Graph
  Convolutional Networks
Infusing Knowledge into the Textual Entailment Task Using Graph Convolutional Networks
Pavan Kapanipathi
Veronika Thost
S. Patel
Spencer Whitehead
Ibrahim Abdelaziz
...
R. Chulaka Gunasekara
B. Makni
Nicholas Mattei
Kartik Talamadupula
Achille Fokoue
47
45
0
05 Nov 2019
When Choosing Plausible Alternatives, Clever Hans can be Clever
When Choosing Plausible Alternatives, Clever Hans can be Clever
Pride Kavumba
Naoya Inoue
Benjamin Heinzerling
Keshav Singh
Paul Reisert
Kentaro Inui
24
52
0
01 Nov 2019
Generalization through Memorization: Nearest Neighbor Language Models
Generalization through Memorization: Nearest Neighbor Language Models
Urvashi Khandelwal
Omer Levy
Dan Jurafsky
Luke Zettlemoyer
M. Lewis
RALM
91
823
0
01 Nov 2019
Adversarial NLI: A New Benchmark for Natural Language Understanding
Adversarial NLI: A New Benchmark for Natural Language Understanding
Yixin Nie
Adina Williams
Emily Dinan
Joey Tianyi Zhou
Jason Weston
Douwe Kiela
65
985
0
31 Oct 2019
Transfer Learning from Transformers to Fake News Challenge Stance
  Detection (FNC-1) Task
Transfer Learning from Transformers to Fake News Challenge Stance Detection (FNC-1) Task
Valeriya Slovikovskaya
24
41
0
31 Oct 2019
A neural document language modeling framework for spoken document
  retrieval
A neural document language modeling framework for spoken document retrieval
Li-Phen Yen
Zheng-Yu Wu
Kuan-Yu Chen
3DGS
27
0
0
31 Oct 2019
Towards Generalizable Neuro-Symbolic Systems for Commonsense Question
  Answering
Towards Generalizable Neuro-Symbolic Systems for Commonsense Question Answering
Kaixin Ma
Jonathan M Francis
Quanyang Lu
Eric Nyberg
A. Oltramari
NAI
26
89
0
30 Oct 2019
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language
  Generation, Translation, and Comprehension
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
M. Lewis
Yinhan Liu
Naman Goyal
Marjan Ghazvininejad
Abdel-rahman Mohamed
Omer Levy
Veselin Stoyanov
Luke Zettlemoyer
AIMat
VLM
41
10,670
0
29 Oct 2019
SpeechBERT: An Audio-and-text Jointly Learned Language Model for
  End-to-end Spoken Question Answering
SpeechBERT: An Audio-and-text Jointly Learned Language Model for End-to-end Spoken Question Answering
Yung-Sung Chuang
Chi-Liang Liu
Hung-yi Lee
Lin-shan Lee
AuLLM
35
39
0
25 Oct 2019
HUBERT Untangles BERT to Improve Transfer across NLP Tasks
HUBERT Untangles BERT to Improve Transfer across NLP Tasks
M. Moradshahi
Hamid Palangi
M. Lam
P. Smolensky
Jianfeng Gao
37
16
0
25 Oct 2019
Mockingjay: Unsupervised Speech Representation Learning with Deep
  Bidirectional Transformer Encoders
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
Andy T. Liu
Shu-Wen Yang
Po-Han Chi
Po-Chun Hsu
Hung-yi Lee
SSL
65
372
0
25 Oct 2019
An Empirical Study of Efficient ASR Rescoring with Transformers
An Empirical Study of Efficient ASR Rescoring with Transformers
Hongzhao Huang
Fuchun Peng
KELM
21
22
0
24 Oct 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text
  Transformer
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
144
19,701
0
23 Oct 2019
Generative Pre-Training for Speech with Autoregressive Predictive Coding
Generative Pre-Training for Speech with Autoregressive Predictive Coding
Yu-An Chung
James R. Glass
SSL
38
174
0
23 Oct 2019
Improving Transformer-based Speech Recognition Using Unsupervised
  Pre-training
Improving Transformer-based Speech Recognition Using Unsupervised Pre-training
Dongwei Jiang
Xiaoning Lei
Wubo Li
Ne Luo
Yuxuan Hu
Wei Zou
Xiangang Li
29
99
0
22 Oct 2019
Findings of the NLP4IF-2019 Shared Task on Fine-Grained Propaganda
  Detection
Findings of the NLP4IF-2019 Shared Task on Fine-Grained Propaganda Detection
Giovanni Da San Martino
Alberto Barrón-Cedeño
Preslav Nakov
50
80
0
20 Oct 2019
Keyphrase Extraction from Scholarly Articles as Sequence Labeling using
  Contextualized Embeddings
Keyphrase Extraction from Scholarly Articles as Sequence Labeling using Contextualized Embeddings
Dhruva Sahrawat
Debanjan Mahata
Mayank Kulkarni
Haimin Zhang
Rakesh Gosangi
Amanda Stent
Agniv Sharma
Yaman Kumar Singla
R. Shah
Roger Zimmermann
17
30
0
19 Oct 2019
A Mutual Information Maximization Perspective of Language Representation
  Learning
A Mutual Information Maximization Perspective of Language Representation Learning
Lingpeng Kong
Cyprien de Masson dÁutume
Wang Ling
Lei Yu
Zihang Dai
Dani Yogatama
SSL
231
167
0
18 Oct 2019
Analyzing the Forgetting Problem in the Pretrain-Finetuning of Dialogue
  Response Models
Analyzing the Forgetting Problem in the Pretrain-Finetuning of Dialogue Response Models
Tianxing He
Jun Liu
Kyunghyun Cho
Myle Ott
Bing-Quan Liu
James R. Glass
Fuchun Peng
CLL
45
9
0
16 Oct 2019
Facebook AI's WAT19 Myanmar-English Translation Task Submission
Facebook AI's WAT19 Myanmar-English Translation Task Submission
Peng-Jen Chen
Jiajun Shen
Matt Le
Vishrav Chaudhary
Ahmed El-Kishky
Guillaume Wenzek
Myle Ott
MarcÁurelio Ranzato
25
29
0
15 Oct 2019
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
Alexei Baevski
Steffen Schneider
Michael Auli
SSL
45
661
0
12 Oct 2019
On Empirical Comparisons of Optimizers for Deep Learning
On Empirical Comparisons of Optimizers for Deep Learning
Dami Choi
Christopher J. Shallue
Zachary Nado
Jaehoon Lee
Chris J. Maddison
George E. Dahl
46
256
0
11 Oct 2019
On the adequacy of untuned warmup for adaptive optimization
On the adequacy of untuned warmup for adaptive optimization
Jerry Ma
Denis Yarats
59
70
0
09 Oct 2019
PipeMare: Asynchronous Pipeline Parallel DNN Training
PipeMare: Asynchronous Pipeline Parallel DNN Training
Bowen Yang
Jian Zhang
Jonathan Li
Christopher Ré
Christopher R. Aberger
Christopher De Sa
33
110
0
09 Oct 2019
Knowledge Distillation from Internal Representations
Knowledge Distillation from Internal Representations
Gustavo Aguilar
Yuan Ling
Yu Zhang
Benjamin Yao
Xing Fan
Edward Guo
38
179
0
08 Oct 2019
BERT for Evidence Retrieval and Claim Verification
BERT for Evidence Retrieval and Claim Verification
Shrishti Saha Shetu
Christof Monz
E. Mabande
RALM
38
120
0
07 Oct 2019
Previous
123...184185186
Next