RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 4,801 papers shown
Title
Question Answering Infused Pre-training of General-Purpose Contextualized Representations
Robin Jia
M. Lewis
Luke Zettlemoyer
23
28
0
15 Jun 2021
CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark
Ningyu Zhang
Mosha Chen
Zhen Bi
Xiaozhuan Liang
Lei Li
...
Jun Yan
Hongying Zan
Kunli Zhang
Buzhou Tang
Qingcai Chen
LM&MA
ELM
34
179
0
15 Jun 2021
The Possible, the Plausible, and the Desirable: Event-Based Modality Detection for Language Processing
Valentina Pyatkin
Shoval Sadde
Aynat Rubinstein
P. Portner
Reut Tsarfaty
29
18
0
15 Jun 2021
Improving Paraphrase Detection with the Adversarial Paraphrasing Task
Animesh Nighojkar
John Licato
25
39
0
14 Jun 2021
Evaluating Various Tokenizers for Arabic Text Classification
Zaid Alyafeai
Maged S. Al-Shaibani
Mustafa Ghaleb
Irfan Ahmad
42
41
0
14 Jun 2021
Cascaded Span Extraction and Response Generation for Document-Grounded Dialog
Nico Daheim
David Thulke
Christian Dugast
Hermann Ney
29
11
0
14 Jun 2021
SAS: Self-Augmentation Strategy for Language Model Pre-training
Yifei Xu
Jingqiao Zhang
Ru He
Liangzhu Ge
Chao Yang
Cheng Yang
Ying Wu
42
1
0
14 Jun 2021
Examining and Combating Spurious Features under Distribution Shift
Chunting Zhou
Xuezhe Ma
Paul Michel
Graham Neubig
OOD
37
67
0
14 Jun 2021
Pre-Trained Models: Past, Present and Future
Xu Han
Zhengyan Zhang
Ning Ding
Yuxian Gu
Xiao Liu
...
Jie Tang
Ji-Rong Wen
Jinhui Yuan
Wayne Xin Zhao
Jun Zhu
AIFin
MQ
AI4MH
60
818
0
14 Jun 2021
InfoBehavior: Self-supervised Representation Learning for Ultra-long Behavior Sequence via Hierarchical Grouping
Runshi Liu
Pengda Qin
Yuhong Li
Weigao Wen
Dong Li
Kefeng Deng
Qiang Wu
AI4TS
15
0
0
13 Jun 2021
Can Transformer Language Models Predict Psychometric Properties?
Antonio Laverghetta
Animesh Nighojkar
Jamshidbek Mirzakhalov
John Licato
LM&MA
38
14
0
12 Jun 2021
Leveraging Pre-trained Language Model for Speech Sentiment Analysis
Suwon Shon
Pablo Brusco
Jing Pan
Kyu Jeong Han
Shinji Watanabe
17
16
0
11 Jun 2021
BoB: BERT Over BERT for Training Persona-based Dialogue Models from Limited Personalized Data
Haoyu Song
Yan Wang
Kaiyan Zhang
Weinan Zhang
Ting Liu
30
117
0
11 Jun 2021
Generate, Annotate, and Learn: NLP with Synthetic Text
Xuanli He
Islam Nassar
J. Kiros
Gholamreza Haffari
Mohammad Norouzi
44
51
0
11 Jun 2021
Assessing Political Prudence of Open-domain Chatbots
Yejin Bang
Nayeon Lee
Etsuko Ishii
Andrea Madotto
Pascale Fung
29
24
0
11 Jun 2021
CAT: Cross Attention in Vision Transformer
Hezheng Lin
Xingyi Cheng
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Qing Song
Wei Yuan
ViT
35
149
0
10 Jun 2021
Programming Puzzles
Tal Schuster
Ashwin Kalyan
Oleksandr Polozov
Adam Tauman Kalai
ELM
22
32
0
10 Jun 2021
Linguistically Informed Masking for Representation Learning in the Patent Domain
Sophia Althammer
Mark Buckley
Sebastian Hofstätter
Allan Hanbury
45
11
0
10 Jun 2021
SemEval-2021 Task 11: NLPContributionGraph -- Structuring Scholarly NLP Contributions for a Research Knowledge Graph
Jennifer D'Souza
Sören Auer
Ted Pedersen
46
30
0
10 Jun 2021
MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training
Mingliang Zeng
Xu Tan
Rui Wang
Zeqian Ju
Tao Qin
Tie-Yan Liu
22
129
0
10 Jun 2021
How Robust are Model Rankings: A Leaderboard Customization Approach for Equitable Evaluation
Swaroop Mishra
Anjana Arunkumar
34
24
0
10 Jun 2021
Semantic-aware Binary Code Representation with BERT
Hyungjoon Koo
Soyeon Park
Daejin Choi
Taesoo Kim
27
23
0
10 Jun 2021
Artificial Intelligence in Drug Discovery: Applications and Techniques
Jianyuan Deng
Zhibo Yang
Iwao Ojima
Dimitris Samaras
Fusheng Wang
AI4TS
42
100
0
09 Jun 2021
Do Transformers Really Perform Bad for Graph Representation?
Chengxuan Ying
Tianle Cai
Shengjie Luo
Shuxin Zheng
Guolin Ke
Di He
Yanming Shen
Tie-Yan Liu
GNN
53
436
0
09 Jun 2021
Instantaneous Grammatical Error Correction with Shallow Aggressive Decoding
Xin Sun
Tao Ge
Furu Wei
Houfeng Wang
25
62
0
09 Jun 2021
On Sample Based Explanation Methods for NLP:Efficiency, Faithfulness, and Semantic Evaluation
Wei Zhang
Ziming Huang
Yada Zhu
Guangnan Ye
Xiaodong Cui
Fan Zhang
36
17
0
09 Jun 2021
Compacter: Efficient Low-Rank Hypercomplex Adapter Layers
Rabeeh Karimi Mahabadi
James Henderson
Sebastian Ruder
MoE
67
469
0
08 Jun 2021
On the Lack of Robust Interpretability of Neural Text Classifiers
Muhammad Bilal Zafar
Michele Donini
Dylan Slack
Cédric Archambeau
Sanjiv Ranjan Das
K. Kenthapadi
AAML
16
21
0
08 Jun 2021
TIMEDIAL: Temporal Commonsense Reasoning in Dialog
Lianhui Qin
Aditya Gupta
Shyam Upadhyay
Luheng He
Yejin Choi
Manaal Faruqui
LRM
31
65
0
08 Jun 2021
XtremeDistilTransformers: Task Transfer for Task-agnostic Distillation
Subhabrata Mukherjee
Ahmed Hassan Awadallah
Jianfeng Gao
19
22
0
08 Jun 2021
A Survey of Transformers
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
58
1,089
0
08 Jun 2021
One Semantic Parser to Parse Them All: Sequence to Sequence Multi-Task Learning on Semantic Parsing Datasets
Marco Damonte
Emilio Monti
AIMat
38
6
0
08 Jun 2021
Giving Commands to a Self-Driving Car: How to Deal with Uncertain Situations?
Thierry Deruyttere
Victor Milewski
Marie-Francine Moens
30
15
0
08 Jun 2021
PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning
Tao Yu
Cuiling Lan
Wenjun Zeng
Mingxiao Feng
Zhizheng Zhang
Zhibo Chen
OffRL
25
46
0
08 Jun 2021
Measuring and Improving BERT's Mathematical Abilities by Predicting the Order of Reasoning
Piotr Piękos
Henryk Michalewski
Mateusz Malinowski
35
28
0
07 Jun 2021
Hierarchical Task Learning from Language Instructions with Unified Transformers and Self-Monitoring
Yichi Zhang
J. Chai
25
78
0
07 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
65
330
0
07 Jun 2021
SelfDoc: Self-Supervised Document Representation Learning
Peizhao Li
Jiuxiang Gu
Jason Kuen
Vlad I. Morariu
Handong Zhao
R. Jain
Varun Manjunatha
Hongfu Liu
ViT
SSL
28
160
0
07 Jun 2021
Referring Transformer: A One-step Approach to Multi-task Visual Grounding
Muchen Li
Leonid Sigal
ObjD
13
189
0
06 Jun 2021
Empowering Language Understanding with Counterfactual Reasoning
Fuli Feng
Jizhi Zhang
Xiangnan He
Hanwang Zhang
Tat-Seng Chua
LRM
21
33
0
06 Jun 2021
MERLOT: Multimodal Neural Script Knowledge Models
Rowan Zellers
Ximing Lu
Jack Hessel
Youngjae Yu
J. S. Park
Jize Cao
Ali Farhadi
Yejin Choi
VLM
LRM
38
374
0
04 Jun 2021
Do Syntactic Probes Probe Syntax? Experiments with Jabberwocky Probing
Rowan Hall Maudslay
Ryan Cotterell
31
33
0
04 Jun 2021
cs60075_team2 at SemEval-2021 Task 1 : Lexical Complexity Prediction using Transformer-based Language Models pre-trained on various text corpora
Abhilash Nandy
Sayantan Adak
Tanurima Halder
Sai Mahesh Pokala
18
6
0
04 Jun 2021
ERNIE-Tiny : A Progressive Distillation Framework for Pretrained Transformer Compression
Weiyue Su
Xuyi Chen
Shi Feng
Jiaxiang Liu
Weixin Liu
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
34
13
0
04 Jun 2021
Self-supervised Dialogue Learning for Spoken Conversational Question Answering
Nuo Chen
Chenyu You
Yuexian Zou
SSL
28
33
0
04 Jun 2021
The Case for Translation-Invariant Self-Attention in Transformer-Based Language Models
Ulme Wennberg
G. Henter
MILM
37
21
0
03 Jun 2021
Defending Against Backdoor Attacks in Natural Language Generation
Xiaofei Sun
Xiaoya Li
Yuxian Meng
Xiang Ao
Fei Wu
Jiwei Li
Tianwei Zhang
AAML
SILM
33
47
0
03 Jun 2021
Reordering Examples Helps during Priming-based Few-Shot Learning
Sawan Kumar
Partha P. Talukdar
20
58
0
03 Jun 2021
Self-Guided Contrastive Learning for BERT Sentence Representations
Taeuk Kim
Kang Min Yoo
Sang-goo Lee
SSL
39
202
0
03 Jun 2021
Can Generative Pre-trained Language Models Serve as Knowledge Bases for Closed-book QA?
Cunxiang Wang
Pai Liu
Yue Zhang
RALM
42
80
0
03 Jun 2021