RoBERTa: A Robustly Optimized BERT Pretraining Approach


26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 4,752 papers shown
TUDublin team at Constraint@AAAI2021 -- COVID19 Fake News Detection
Elena Shushkevich
J. Cardiff
8
12
0
14 Jan 2021
Of Non-Linearity and Commutativity in BERT
Sumu Zhao
Damian Pascual
Gino Brunner
Roger Wattenhofer
36
16
0
12 Jan 2021
Transforming Multi-Conditioned Generation from Meaning Representation
Joosung Lee
19
3
0
12 Jan 2021
Model Generalization on COVID-19 Fake News Detection
Yejin Bang
Etsuko Ishii
Samuel Cahyawijaya
Ziwei Ji
Pascale Fung
55
36
0
11 Jan 2021
Cisco at AAAI-CAD21 shared task: Predicting Emphasis in Presentation Slides using Contextualized Embeddings
Sreyan Ghosh
Sonal Kumar
H. Jalan
Hemant Yadav
R. Shah
39
2
0
10 Jan 2021
LightXML: Transformer with Dynamic Negative Sampling for High-Performance Extreme Multi-label Text Classification
Ting Jiang
Deqing Wang
Leilei Sun
Huayi Yang
Zhengyang Zhao
Fuzhen Zhuang
VLM
128
136
0
09 Jan 2021
Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing
Minh Nguyen
Viet Dac Lai
Amir Pouran Ben Veyseh
Thien Huu Nguyen
52
132
0
09 Jan 2021
Exploring Text-transformers in AAAI 2021 Shared Task: COVID-19 Fake News Detection in English
Xiangyang Li
Yu Xia
Xiang Long
Zheng Li
Sujian Li
226
37
0
07 Jan 2021
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
259
682
0
06 Jan 2021
Deep Neural Network Based Relation Extraction: An Overview
Hailin Wang
Ke Qin
R. Zakari
Guoming Lu
Jin Yin
63
64
0
06 Jan 2021
EfficientQA : a RoBERTa Based Phrase-Indexed Question-Answering System
Sofian Chaybouti
Achraf Saghe
A. Shabou
RALM
45
8
0
06 Jan 2021
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
107
345
0
05 Jan 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
233
2,434
0
04 Jan 2021
Few-Shot Question Answering by Pretraining Span Selection
Ori Ram
Yuval Kirstain
Jonathan Berant
Amir Globerson
Omer Levy
36
97
0
02 Jan 2021
Learning to Generate Task-Specific Adapters from Task Description
Qinyuan Ye
Xiang Ren
117
29
0
02 Jan 2021
KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation
Yiran Xing
Z. Shi
Zhao Meng
Gerhard Lakemeyer
Yunpu Ma
Roger Wattenhofer
VLM
72
40
0
02 Jan 2021
Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting
Wangchunshu Zhou
Tao Ge
Canwen Xu
Ke Xu
Furu Wei
LRM
16
15
0
02 Jan 2021
Learning to Emphasize: Dataset and Shared Task Models for Selecting Emphasis in Presentation Slides
Amirreza Shirani
Gia-Lac Tran
Hieu Trinh
Franck Dernoncourt
Nedim Lipka
P. Asente
J. Echevarria
Thamar Solorio
190
1
0
02 Jan 2021
Modeling Fine-Grained Entity Types with Box Embeddings
Yasumasa Onoe
Michael Boratko
Andrew McCallum
Greg Durrett
OCL
43
67
0
02 Jan 2021
Analyzing Commonsense Emergence in Few-shot Knowledge Models
Jeff Da
Ronan Le Bras
Ximing Lu
Yejin Choi
Antoine Bosselut
AI4MH
KELM
77
40
0
01 Jan 2021
Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models
Tongshuang Wu
Marco Tulio Ribeiro
Jeffrey Heer
Daniel S. Weld
48
244
0
01 Jan 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li
Percy Liang
49
4,103
0
01 Jan 2021
How Do Your Biomedical Named Entity Recognition Models Generalize to Novel Entities?
Hyunjae Kim
Jaewoo Kang
AI4CE
94
21
0
01 Jan 2021
Intent Classification and Slot Filling for Privacy Policies
Wasi Uddin Ahmad
Jianfeng Chi
Tu Le
Thomas B. Norton
Yuan Tian
Kai-Wei Chang
21
23
0
01 Jan 2021
WARP: Word-level Adversarial ReProgramming
Karen Hambardzumyan
Hrant Khachatrian
Jonathan May
AAML
254
342
0
01 Jan 2021
Multi-task Retrieval for Knowledge-Intensive Tasks
Jean Maillard
Vladimir Karpukhin
Fabio Petroni
Wen-tau Yih
Barlas Oğuz
Veselin Stoyanov
Gargi Ghosh
215
64
0
01 Jan 2021
Studying Strategically: Learning to Mask for Closed-book QA
Qinyuan Ye
Belinda Z. Li
Sinong Wang
Benjamin Bolte
Hao Ma
Wen-tau Yih
Xiang Ren
Madian Khabsa
OffRL
27
11
0
31 Dec 2020
MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers
Wenhui Wang
Hangbo Bao
Shaohan Huang
Li Dong
Furu Wei
MQ
30
257
0
31 Dec 2020
Learning from the Worst: Dynamically Generated Datasets to Improve Online Hate Detection
Bertie Vidgen
Tristan Thrush
Zeerak Talat
Douwe Kiela
34
245
0
31 Dec 2020
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
243
1,930
0
31 Dec 2020
ERNIE-Doc: A Retrospective Long-Document Modeling Transformer
Siyu Ding
Junyuan Shang
Shuohuan Wang
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
73
53
0
31 Dec 2020
CoCoLM: COmplex COmmonsense Enhanced Language Model with Discourse Relations
Changlong Yu
Hongming Zhang
Yangqiu Song
Wilfred Ng
80
21
0
31 Dec 2020
TexSmart: A Text Understanding System for Fine-Grained NER and Enhanced Semantic Analysis
Haisong Zhang
Lemao Liu
Haiyun Jiang
Yangming Li
Enbo Zhao
...
Tao Yang
Dong Yu
Feng Zhang
Zhanhui Kang
Shuming Shi
40
24
0
31 Dec 2020
XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders
Shuming Ma
Jian Yang
Haoyang Huang
Zewen Chi
Li Dong
...
Akiko Eriguchi
Saksham Singhal
Xia Song
Arul Menezes
Furu Wei
LRM
26
33
0
31 Dec 2020
AraGPT2: Pre-Trained Transformer for Arabic Language Generation
Wissam Antoun
Fady Baly
Hazem M. Hajj
VLM
27
104
0
31 Dec 2020
AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding
Wissam Antoun
Fady Baly
Hazem M. Hajj
27
104
0
31 Dec 2020
Towards Zero-Shot Knowledge Distillation for Natural Language Processing
Ahmad Rashid
Vasileios Lioutas
Abbas Ghaddar
Mehdi Rezagholizadeh
23
27
0
31 Dec 2020
CLEAR: Contrastive Learning for Sentence Representation
Zhuofeng Wu
Sinong Wang
Jiatao Gu
Madian Khabsa
Fei Sun
Hao Ma
SSL
33
320
0
31 Dec 2020
Optimizing Deeper Transformers on Small Datasets
Peng Xu
Dhruv Kumar
Wei Yang
Wenjie Zi
Keyi Tang
Chenyang Huang
Jackie C.K. Cheung
S. Prince
Yanshuai Cao
AI4CE
24
69
0
30 Dec 2020
DynaSent: A Dynamic Benchmark for Sentiment Analysis
Christopher Potts
Zhengxuan Wu
Atticus Geiger
Douwe Kiela
230
77
0
30 Dec 2020
Out of Order: How Important Is The Sequential Order of Words in a Sentence in Natural Language Understanding Tasks?
Thang M. Pham
Trung Bui
Long Mai
Anh Totti Nguyen
220
122
0
30 Dec 2020
Improving BERT with Syntax-aware Local Attention
Zhongli Li
Qingyu Zhou
Chao Li
Ke Xu
Yunbo Cao
63
44
0
30 Dec 2020
Reservoir Transformers
Sheng Shen
Alexei Baevski
Ari S. Morcos
Kurt Keutzer
Michael Auli
Douwe Kiela
35
17
0
30 Dec 2020
ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning
Yujia Qin
Yankai Lin
Ryuichi Takanobu
Zhiyuan Liu
Peng Li
Heng Ji
Minlie Huang
Maosong Sun
Jie Zhou
60
125
0
30 Dec 2020
Code Summarization with Structure-induced Transformer
Hongqiu Wu
Hai Zhao
Min Zhang
41
84
0
29 Dec 2020
CascadeBERT: Accelerating Inference of Pre-trained Language Models via Calibrated Complete Models Cascade
Lei Li
Yankai Lin
Deli Chen
Shuhuai Ren
Peng Li
Jie Zhou
Xu Sun
29
51
0
29 Dec 2020
RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems
Baolin Peng
Chunyuan Li
Zhu Zhang
Chenguang Zhu
Jinchao Li
Jianfeng Gao
21
49
0
29 Dec 2020
Universal Sentence Representation Learning with Conditional Masked Language Model
Ziyi Yang
Yinfei Yang
Daniel Cer
Jax Law
Eric F. Darve
SSL
24
57
0
28 Dec 2020
DeepHateExplainer: Explainable Hate Speech Detection in Under-resourced Bengali Language
Md. Rezaul Karim
Sumon Dey
Tanhim Islam
Sagor Sarker
Mehadi Hasan Menon
Kabir Hossain
Bharathi Raja Chakravarthi
Md. Azam Hossain
Stefan Decker
30
77
0
28 Dec 2020
Explaining NLP Models via Minimal Contrastive Editing (MiCE)
Alexis Ross
Ana Marasović
Matthew E. Peters
38
121
0
27 Dec 2020