RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 10,677 papers shown
Title
Explicit Pairwise Word Interaction Modeling Improves Pretrained Transformers for English Semantic Similarity Tasks
Yinan Zhang
Raphael Tang
Jimmy J. Lin
16
5
0
07 Nov 2019
S2ORC: The Semantic Scholar Open Research Corpus
Kyle Lo
Lucy Lu Wang
Mark Neumann
Rodney Michael Kinney
Daniel S. Weld
OffRL, AI4CE
93
10
0
07 Nov 2019
Towards Domain Adaptation from Limited Data for Question Answering Using Deep Neural Networks
Timothy J. Hazen
Shehzaad Dhuliawala
Daniel Boies
OOD
60
19
0
06 Nov 2019
Dimensional Emotion Detection from Categorical Emotion
Sungjoon Park
Jiseon Kim
Seonghyeon Ye
J. Jeon
Heeyoung Park
Alice Oh
86
37
0
06 Nov 2019
Unsupervised Cross-lingual Representation Learning at Scale
Alexis Conneau
Kartikay Khandelwal
Naman Goyal
Vishrav Chaudhary
Guillaume Wenzek
Francisco Guzmán
Edouard Grave
Myle Ott
Luke Zettlemoyer
Veselin Stoyanov
230
6,614
0
05 Nov 2019
MML: Maximal Multiverse Learning for Robust Fine-Tuning of Language Models
Itzik Malkiel
Lior Wolf
29
2
0
05 Nov 2019
Infusing Knowledge into the Textual Entailment Task Using Graph Convolutional Networks
Pavan Kapanipathi
Veronika Thost
S. Patel
Spencer Whitehead
Ibrahim Abdelaziz
...
R. Chulaka Gunasekara
B. Makni
Nicholas Mattei
Kartik Talamadupula
Achille Fokoue
122
45
0
05 Nov 2019
Deepening Hidden Representations from Pre-trained Language Models
Junjie Yang
Hai Zhao
24
10
0
05 Nov 2019
BAS: An Answer Selection Method Using BERT Language Model
Jamshid Mozafari
A. Fatemi
M. Nematbakhsh
45
17
0
04 Nov 2019
ZEN: Pre-training Chinese Text Encoder Enhanced by N-gram Representations
Shizhe Diao
Jiaxin Bai
Yan Song
Tong Zhang
Yonggang Wang
AI4CE
70
135
0
02 Nov 2019
Select, Answer and Explain: Interpretable Multi-hop Reading Comprehension over Multiple Documents
Ming Tu
Kevin Huang
Guangtao Wang
Jing-ling Huang
Xiaodong He
Bowen Zhou
RALM
113
146
0
01 Nov 2019
CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
Guillaume Wenzek
Marie-Anne Lachaux
Alexis Conneau
Vishrav Chaudhary
Francisco Guzmán
Armand Joulin
Edouard Grave
124
658
0
01 Nov 2019
When Choosing Plausible Alternatives, Clever Hans can be Clever
Pride Kavumba
Naoya Inoue
Benjamin Heinzerling
Keshav Singh
Paul Reisert
Kentaro Inui
42
53
0
01 Nov 2019
Generalization through Memorization: Nearest Neighbor Language Models
Urvashi Khandelwal
Omer Levy
Dan Jurafsky
Luke Zettlemoyer
M. Lewis
RALM
185
846
0
01 Nov 2019
Adversarial NLI: A New Benchmark for Natural Language Understanding
Yixin Nie
Adina Williams
Emily Dinan
Joey Tianyi Zhou
Jason Weston
Douwe Kiela
154
1,013
0
31 Oct 2019
Image-Conditioned Graph Generation for Road Network Extraction
Davide Belli
Thomas Kipf
GNN
55
40
0
31 Oct 2019
Transfer Learning from Transformers to Fake News Challenge Stance Detection (FNC-1) Task
Valeriya Slovikovskaya
57
42
0
31 Oct 2019
A neural document language modeling framework for spoken document retrieval
Li-Phen Yen
Zheng-Yu Wu
Kuan-Yu Chen
3DGS
39
0
0
31 Oct 2019
Ensembling Strategies for Answering Natural Questions
Anthony Ferritto
Lin Pan
Rishav Chakravarti
Salim Roukos
Radu Florian
J. William Murdock
Avirup Sil
ELM
42
0
0
30 Oct 2019
Towards Generalizable Neuro-Symbolic Systems for Commonsense Question Answering
Kaixin Ma
Jonathan M Francis
Quanyang Lu
Eric Nyberg
A. Oltramari
NAI
77
90
0
30 Oct 2019
Contextual Text Denoising with Masked Language Models
Yifu Sun
Haoming Jiang
44
11
0
30 Oct 2019
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
M. Lewis
Yinhan Liu
Naman Goyal
Marjan Ghazvininejad
Abdel-rahman Mohamed
Omer Levy
Veselin Stoyanov
Luke Zettlemoyer
AIMat, VLM
268
10,897
0
29 Oct 2019
Training ASR models by Generation of Contextual Information
Kritika Singh
Dmytro Okhonko
Jun Liu
Yongqiang Wang
Frank Zhang
...
Sergey Edunov
Fuchun Peng
Yatharth Saraf
Geoffrey Zweig
Abdel-rahman Mohamed
61
7
0
27 Oct 2019
HUBERT Untangles BERT to Improve Transfer across NLP Tasks
M. Moradshahi
Hamid Palangi
M. Lam
P. Smolensky
Jianfeng Gao
139
16
0
25 Oct 2019
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
Andy T. Liu
Shu-Wen Yang
Po-Han Chi
Po-Chun Hsu
Hung-yi Lee
SSL
157
374
0
25 Oct 2019
Multi-Document Summarization with Determinantal Point Processes and Contextualized Representations
Sangwoo Cho
Chen Li
Dong Yu
H. Foroosh
Fei Liu
66
17
0
24 Oct 2019
Emergent Properties of Finetuned Language Representation Models
Alexandre Matton
Luke de Oliveira
SSL
40
1
0
23 Oct 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
547
20,397
0
23 Oct 2019
Generative Pre-Training for Speech with Autoregressive Predictive Coding
Yu-An Chung
James R. Glass
SSL
98
174
0
23 Oct 2019
Improving Transformer-based Speech Recognition Using Unsupervised Pre-training
Dongwei Jiang
Xiaoning Lei
Wubo Li
Ne Luo
Yuxuan Hu
Wei Zou
Xiangang Li
91
99
0
22 Oct 2019
Fine-grained Fact Verification with Kernel Graph Attention Network
Zhenghao Liu
Chenyan Xiong
Maosong Sun
Zhiyuan Liu
100
225
0
22 Oct 2019
Trouble with the Curve: Predicting Future MLB Players Using Scouting Reports
Jacob Danovitch
17
2
0
21 Oct 2019
Findings of the NLP4IF-2019 Shared Task on Fine-Grained Propaganda Detection
Giovanni Da San Martino
Alberto Barrón-Cedeño
Preslav Nakov
123
82
0
20 Oct 2019
Keyphrase Extraction from Scholarly Articles as Sequence Labeling using Contextualized Embeddings
Dhruva Sahrawat
Debanjan Mahata
Mayank Kulkarni
Haimin Zhang
Rakesh Gosangi
Amanda Stent
Agniv Sharma
Yaman Kumar Singla
R. Shah
Roger Zimmermann
35
30
0
19 Oct 2019
A Mutual Information Maximization Perspective of Language Representation Learning
Lingpeng Kong
Cyprien de Masson d'Autume
Wang Ling
Lei Yu
Zihang Dai
Dani Yogatama
SSL
279
167
0
18 Oct 2019
BIG MOOD: Relating Transformers to Explicit Commonsense Knowledge
Jeff Da
24
0
0
17 Oct 2019
BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance
Timo Schick
Hinrich Schütze
83
50
0
16 Oct 2019
Facebook AI's WAT19 Myanmar-English Translation Task Submission
Peng-Jen Chen
Jiajun Shen
Matt Le
Vishrav Chaudhary
Ahmed El-Kishky
Guillaume Wenzek
Myle Ott
Marc'Aurelio Ranzato
38
29
0
15 Oct 2019
Structured Pruning of a BERT-based Question Answering Model
J. Scott McCarley
Rishav Chakravarti
Avirup Sil
94
53
0
14 Oct 2019
VAIS Hate Speech Detection System: A Deep Learning based Approach for System Combination
Thai-Binh Nguyen
Quang Minh Nguyen
T. Nguyen
Ngoc Phuong Pham
The-Loc Nguyen
Quoc Truong Do
42
10
0
12 Oct 2019
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
Alexei Baevski
Steffen Schneider
Michael Auli
SSL
181
667
0
12 Oct 2019
On Empirical Comparisons of Optimizers for Deep Learning
Dami Choi
Christopher J. Shallue
Zachary Nado
Jaehoon Lee
Chris J. Maddison
George E. Dahl
118
259
0
11 Oct 2019
exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformers Models
Benjamin Hoover
Hendrik Strobelt
Sebastian Gehrmann
40
86
0
11 Oct 2019
Structured Pruning of Large Language Models
Ziheng Wang
Jeremy Wohlwend
Tao Lei
85
293
0
10 Oct 2019
On the adequacy of untuned warmup for adaptive optimization
Jerry Ma
Denis Yarats
106
70
0
09 Oct 2019
PipeMare: Asynchronous Pipeline Parallel DNN Training
Bowen Yang
Jian Zhang
Jonathan Li
Christopher Ré
Christopher R. Aberger
Christopher De Sa
77
114
0
09 Oct 2019
Knowledge Distillation from Internal Representations
Gustavo Aguilar
Yuan Ling
Yu Zhang
Benjamin Yao
Xing Fan
Edward Guo
96
181
0
08 Oct 2019
BERT for Evidence Retrieval and Claim Verification
Shrishti Saha Shetu
Christof Monz
E. Mabande
RALM
80
126
0
07 Oct 2019
Checkmate: Breaking the Memory Wall with Optimal Tensor Rematerialization
Paras Jain
Ajay Jain
Aniruddha Nrusimha
A. Gholami
Pieter Abbeel
Kurt Keutzer
Ion Stoica
Joseph E. Gonzalez
98
197
0
07 Oct 2019
Multi-hop Question Answering via Reasoning Chains
Jifan Chen
Shih-Ting Lin
Greg Durrett
ReLM, LRM
85
74
0
07 Oct 2019