ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1907.11692
  4. Cited By
RoBERTa: A Robustly Optimized BERT Pretraining Approach

RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat
ArXivPDFHTML

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 4,659 papers shown
Title
Fighting the COVID-19 Infodemic in Social Media: A Holistic Perspective
  and a Call to Arms
Fighting the COVID-19 Infodemic in Social Media: A Holistic Perspective and a Call to Arms
Firoj Alam
Fahim Dalvi
Shaden Shaar
Nadir Durrani
Hamdy Mubarak
...
Giovanni Da San Martino
Ahmed Abdelali
Hassan Sajjad
Kareem Darwish
Preslav Nakov
22
102
0
15 Jul 2020
Fine-Tune Longformer for Jointly Predicting Rumor Stance and Veracity
Fine-Tune Longformer for Jointly Predicting Rumor Stance and Veracity
Anant Khandelwal
21
22
0
15 Jul 2020
Deep learning models for representing out-of-vocabulary words
Deep learning models for representing out-of-vocabulary words
Johannes V. Lochter
Renato M. Silva
Tiago A. Almeida
18
15
0
14 Jul 2020
An Empirical Study on Robustness to Spurious Correlations using
  Pre-trained Language Models
An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models
Lifu Tu
Garima Lalwani
Spandana Gella
He He
LRM
33
184
0
14 Jul 2020
Learning Reasoning Strategies in End-to-End Differentiable Proving
Learning Reasoning Strategies in End-to-End Differentiable Proving
Pasquale Minervini
Sebastian Riedel
Pontus Stenetorp
Edward Grefenstette
Tim Rocktaschel
LRM
45
96
0
13 Jul 2020
TERA: Self-Supervised Learning of Transformer Encoder Representation for
  Speech
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech
Andy T. Liu
Shang-Wen Li
Hung-yi Lee
SSL
62
356
0
12 Jul 2020
Unsupervised Text Generation by Learning from Search
Unsupervised Text Generation by Learning from Search
Jingjing Li
Zichao Li
Lili Mou
Xin Jiang
Michael R. Lyu
Irwin King
27
55
0
09 Jul 2020
Learning Speech Representations from Raw Audio by Joint Audiovisual
  Self-Supervision
Learning Speech Representations from Raw Audio by Joint Audiovisual Self-Supervision
Abhinav Shukla
Stavros Petridis
Maja Pantic
SSL
32
16
0
08 Jul 2020
Robust Prediction of Punctuation and Truecasing for Medical ASR
Robust Prediction of Punctuation and Truecasing for Medical ASR
Monica Sunkara
S. Ronanki
Kalpit Dixit
S. Bodapati
Katrin Kirchhoff
14
33
0
04 Jul 2020
Approximate Nearest Neighbor Negative Contrastive Learning for Dense
  Text Retrieval
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval
Lee Xiong
Chenyan Xiong
Ye Li
Kwok-Fung Tang
Jialin Liu
Paul N. Bennett
Junaid Ahmed
Arnold Overwijk
22
1,181
0
01 Jul 2020
SemEval-2020 Task 4: Commonsense Validation and Explanation
SemEval-2020 Task 4: Commonsense Validation and Explanation
Cunxiang Wang
Shuailong Liang
Yili Jin
Yilong Wang
Xiao-Dan Zhu
Yue Zhang
LRM
25
98
0
01 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov
Nikoli Dryden
Tal Ben-Nun
Shigang Li
Torsten Hoefler
36
131
0
30 Jun 2020
Learning Sparse Prototypes for Text Generation
Learning Sparse Prototypes for Text Generation
Junxian He
Taylor Berg-Kirkpatrick
Graham Neubig
27
23
0
29 Jun 2020
Improving Sequence Tagging for Vietnamese Text Using Transformer-based
  Neural Models
Improving Sequence Tagging for Vietnamese Text Using Transformer-based Neural Models
Viet The Bui
Oanh T. K. Tran
Hong Phuong Le
25
38
0
29 Jun 2020
Video-Grounded Dialogues with Pretrained Generation Language Models
Video-Grounded Dialogues with Pretrained Generation Language Models
Hung Le
Guosheng Lin
34
28
0
27 Jun 2020
Pre-training via Paraphrasing
Pre-training via Paraphrasing
M. Lewis
Marjan Ghazvininejad
Gargi Ghosh
Armen Aghajanyan
Sida I. Wang
Luke Zettlemoyer
AIMat
30
159
0
26 Jun 2020
Evaluation of Text Generation: A Survey
Evaluation of Text Generation: A Survey
Asli Celikyilmaz
Elizabeth Clark
Jianfeng Gao
ELM
LM&MA
19
378
0
26 Jun 2020
Does the Whole Exceed its Parts? The Effect of AI Explanations on
  Complementary Team Performance
Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance
Gagan Bansal
Tongshuang Wu
Joyce Zhou
Raymond Fok
Besmira Nushi
Ece Kamar
Marco Tulio Ribeiro
Daniel S. Weld
42
583
0
26 Jun 2020
The Depth-to-Width Interplay in Self-Attention
The Depth-to-Width Interplay in Self-Attention
Yoav Levine
Noam Wies
Or Sharir
Hofit Bata
Amnon Shashua
30
45
0
22 Jun 2020
Open-Domain Conversational Agents: Current Progress, Open Problems, and
  Future Directions
Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions
Stephen Roller
Y-Lan Boureau
Jason Weston
Antoine Bordes
Emily Dinan
...
Kurt Shuster
Eric Michael Smith
Arthur Szlam
Jack Urbanek
Mary Williamson
LLMAG
AI4CE
28
51
0
22 Jun 2020
MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of
  Gradients
MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of Gradients
Chenfei Zhu
Yu Cheng
Zhe Gan
Furong Huang
Jingjing Liu
Tom Goldstein
ODL
35
2
0
21 Jun 2020
A Survey on Machine Reading Comprehension: Tasks, Evaluation Metrics and
  Benchmark Datasets
A Survey on Machine Reading Comprehension: Tasks, Evaluation Metrics and Benchmark Datasets
Chengchang Zeng
Shaobo Li
Qin Li
Jie Hu
Jianjun Hu
16
101
0
21 Jun 2020
Cross-lingual Retrieval for Iterative Self-Supervised Training
Cross-lingual Retrieval for Iterative Self-Supervised Training
C. Tran
Y. Tang
Xian Li
Jiatao Gu
RALM
28
73
0
16 Jun 2020
On the Computational Power of Transformers and its Implications in
  Sequence Modeling
On the Computational Power of Transformers and its Implications in Sequence Modeling
S. Bhattamishra
Arkil Patel
Navin Goyal
33
65
0
16 Jun 2020
PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized
  Embedding Models
PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models
Eyal Ben-David
Carmel Rabinovitz
Roi Reichart
SSL
58
61
0
16 Jun 2020
Minimum Width for Universal Approximation
Minimum Width for Universal Approximation
Sejun Park
Chulhee Yun
Jaeho Lee
Jinwoo Shin
35
122
0
16 Jun 2020
To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on
  Resource Rich Tasks
To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks
Sinong Wang
Madian Khabsa
Hao Ma
18
26
0
15 Jun 2020
Self-supervised Learning: Generative or Contrastive
Self-supervised Learning: Generative or Contrastive
Xiao Liu
Fanjin Zhang
Zhenyu Hou
Zhaoyu Wang
Li Mian
Jing Zhang
Jie Tang
SSL
52
1,587
0
15 Jun 2020
Comparing Natural Language Processing Techniques for Alzheimer's
  Dementia Prediction in Spontaneous Speech
Comparing Natural Language Processing Techniques for Alzheimer's Dementia Prediction in Spontaneous Speech
Thomas Searle
Zina M. Ibrahim
Richard J. B. Dobson
6
46
0
12 Jun 2020
SemEval-2020 Task 12: Multilingual Offensive Language Identification in
  Social Media (OffensEval 2020)
SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)
Marcos Zampieri
Preslav Nakov
Sara Rosenthal
Pepa Atanasova
Georgi Karadzhov
Hamdy Mubarak
Leon Derczynski
Zeses Pitenis
cCaugri cColtekin
30
483
0
12 Jun 2020
Rethinking Pre-training and Self-training
Rethinking Pre-training and Self-training
Barret Zoph
Golnaz Ghiasi
Nayeon Lee
Huayu Chen
Hanxiao Liu
E. D. Cubuk
Quoc V. Le
SSeg
48
646
0
11 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL
VLM
30
433
0
11 Jun 2020
Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason
  Over Implicit Knowledge
Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge
Alon Talmor
Oyvind Tafjord
Peter Clark
Yoav Goldberg
Jonathan Berant
ReLM
LRM
36
39
0
11 Jun 2020
CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot
  Cross-Lingual NLP
CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP
Libo Qin
Minheng Ni
Yue Zhang
Wanxiang Che
40
149
0
11 Jun 2020
A Monolingual Approach to Contextualized Word Embeddings for
  Mid-Resource Languages
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages
Pedro Ortiz Suarez
Laurent Romary
Benoît Sagot
28
227
0
11 Jun 2020
Report from the NSF Future Directions Workshop, Toward User-Oriented
  Agents: Research Directions and Challenges
Report from the NSF Future Directions Workshop, Toward User-Oriented Agents: Research Directions and Challenges
M. Eskénazi
Tiancheng Zhao
LLMAG
AI4TS
AI4CE
36
9
0
10 Jun 2020
Revisiting Few-sample BERT Fine-tuning
Revisiting Few-sample BERT Fine-tuning
Tianyi Zhang
Felix Wu
Arzoo Katiyar
Kilian Q. Weinberger
Yoav Artzi
41
442
0
10 Jun 2020
Combination of abstractive and extractive approaches for summarization
  of long scientific texts
Combination of abstractive and extractive approaches for summarization of long scientific texts
Vladislav Tretyak
Denis Stepanov
21
10
0
09 Jun 2020
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and
  Strong Baselines
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines
Marius Mosbach
Maksym Andriushchenko
Dietrich Klakow
31
354
0
08 Jun 2020
Linformer: Self-Attention with Linear Complexity
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
84
1,655
0
08 Jun 2020
An Overview of Neural Network Compression
An Overview of Neural Network Compression
James OÑeill
AI4CE
45
98
0
05 Jun 2020
DeCLUTR: Deep Contrastive Learning for Unsupervised Textual
  Representations
DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations
John Giorgi
Osvald Nitski
Bo Wang
Gary D. Bader
SSL
39
490
0
05 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
64
2,626
0
05 Jun 2020
Sponge Examples: Energy-Latency Attacks on Neural Networks
Sponge Examples: Energy-Latency Attacks on Neural Networks
Ilia Shumailov
Yiren Zhao
Daniel Bates
Nicolas Papernot
Robert D. Mullins
Ross J. Anderson
SILM
19
127
0
05 Jun 2020
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient
  Language Processing
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
Zihang Dai
Guokun Lai
Yiming Yang
Quoc V. Le
48
230
0
05 Jun 2020
A Survey on Transfer Learning in Natural Language Processing
A Survey on Transfer Learning in Natural Language Processing
Zaid Alyafeai
Maged S. Alshaibani
Irfan Ahmad
30
72
0
31 May 2020
Language Models are Few-Shot Learners
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
109
40,302
0
28 May 2020
Syntactic Structure Distillation Pretraining For Bidirectional Encoders
Syntactic Structure Distillation Pretraining For Bidirectional Encoders
A. Kuncoro
Lingpeng Kong
Daniel Fried
Dani Yogatama
Laura Rimell
Chris Dyer
Phil Blunsom
31
33
0
27 May 2020
CausaLM: Causal Model Explanation Through Counterfactual Language Models
CausaLM: Causal Model Explanation Through Counterfactual Language Models
Amir Feder
Nadav Oved
Uri Shalit
Roi Reichart
CML
LRM
44
157
0
27 May 2020
Rationalizing Text Matching: Learning Sparse Alignments via Optimal
  Transport
Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport
Kyle Swanson
L. Yu
Tao Lei
OT
29
37
0
27 May 2020
Previous
123...878889...929394
Next