RoBERTa: A Robustly Optimized BERT Pretraining Approach


26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 10,878 papers shown
A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models
Yuanxin Liu
Fandong Meng
Zheng Lin
JiangNan Li
Peng Fu
Yanan Cao
Weiping Wang
Jie Zhou
95
6
0
11 Oct 2022
Mixed-modality Representation Learning and Pre-training for Joint Table-and-Text Retrieval in OpenQA
Junjie Huang
Wanjun Zhong
Qianchu Liu
Ming Gong
Daxin Jiang
Nan Duan
VLM RALM LMTD
183
14
0
11 Oct 2022
Task-Aware Specialization for Efficient and Robust Dense Retrieval for Open-Domain Question Answering
Hao Cheng
Hao Fang
Xiaodong Liu
Jianfeng Gao
RALM
81
6
0
11 Oct 2022
Mixture of Attention Heads: Selecting Attention Heads Per Token
Xiaofeng Zhang
Songlin Yang
Zeyu Huang
Jie Zhou
Wenge Rong
Zhang Xiong
MoE
176
48
0
11 Oct 2022
Pre-Training Representations of Binary Code Using Contrastive Learning
Yifan Zhang
Chen Huang
Yueke Zhang
Kevin Cao
Scott Thomas Andersen
Huajie Shao
Kevin Leach
Yu Huang
104
2
0
11 Oct 2022
Multi-CLS BERT: An Efficient Alternative to Traditional Ensembling
Haw-Shiuan Chang
Ruei-Yao Sun
Kathryn Ricci
Andrew McCallum
112
15
0
10 Oct 2022
Not All Errors are Equal: Learning Text Generation Metrics using Stratified Error Synthesis
Wenda Xu
Yi-Lin Tuan
Yujie Lu
Michael Stephen Saxon
Lei Li
William Yang Wang
118
22
0
10 Oct 2022
Multilingual Representation Distillation with Contrastive Learning
Weiting Tan
Kevin Heffernan
Holger Schwenk
Philipp Koehn
90
16
0
10 Oct 2022
Extracting or Guessing? Improving Faithfulness of Event Temporal Relation Extraction
Haoyu Wang
Hongming Zhang
Yuqian Deng
Jacob R. Gardner
Dan Roth
Muhao Chen
68
21
0
10 Oct 2022
Hierarchical3D Adapters for Long Video-to-text Summarization
Pinelopi Papalampidi
Mirella Lapata
VGen
101
13
0
10 Oct 2022
Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis
Yuxin Xiao
Paul Pu Liang
Umang Bhatt
Willie Neiswanger
Ruslan Salakhutdinov
Louis-Philippe Morency
286
99
0
10 Oct 2022
Empowering the Fact-checkers! Automatic Identification of Claim Spans on Twitter
Megha Sundriyal
Atharva Kulkarni
Vaibhav Pulastya
Md. Shad Akhtar
Tanmoy Chakraborty
MedIm
71
19
0
10 Oct 2022
DEPTWEET: A Typology for Social Media Texts to Detect Depression Severities
Mohsinul Kabir
Tasnim Ahmed
Md. Bakhtiar Hasan
Md Tahmid Rahman Laskar
Tarun Kumar Joarder
H. Mahmud
Kamrul Hasan
94
53
0
10 Oct 2022
Parameter-Efficient Tuning with Special Token Adaptation
Xiaocong Yang
James Y. Huang
Wenxuan Zhou
Muhao Chen
99
12
0
10 Oct 2022
Quantifying Social Biases Using Templates is Unreliable
P. Seshadri
Pouya Pezeshkpour
Sameer Singh
95
34
0
09 Oct 2022
QAScore -- An Unsupervised Unreferenced Metric for the Question Generation Evaluation
Tianbo Ji
Chenyang Lyu
Gareth J. F. Jones
Liting Zhou
Yvette Graham
64
21
0
09 Oct 2022
KSAT: Knowledge-infused Self Attention Transformer -- Integrating Multiple Domain-Specific Contexts
Kaushik Roy
Yuxin Zi
Vignesh Narayanan
Manas Gaur
Amit P. Sheth
AI4MH
82
12
0
09 Oct 2022
SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters
Shwai He
Liang Ding
Daize Dong
Miao Zhang
Dacheng Tao
MoE
139
91
0
09 Oct 2022
Spread Love Not Hate: Undermining the Importance of Hateful Pre-training for Hate Speech Detection
Omkar Gokhale
Aditya Kane
Shantanu Patankar
Tanmay Chavan
Raviraj Joshi
VLM
96
7
0
09 Oct 2022
Noise-Robust De-Duplication at Scale
Emily Silcock
Luca D'Amico-Wong
Jinglin Yang
Melissa Dell
SyDa
87
20
0
09 Oct 2022
Understanding and Improving Zero-shot Multi-hop Reasoning in Generative Question Answering
Zhengbao Jiang
Jun Araki
Haibo Ding
Graham Neubig
LRM
81
11
0
09 Oct 2022
Deep Span Representations for Named Entity Recognition
Enwei Zhu
Yiyang Liu
Jinpeng Li
80
11
0
09 Oct 2022
KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding
Shangbin Feng
Zhaoxuan Tan
Wenqian Zhang
Zhenyu Lei
Yulia Tsvetkov
KELM VLM
108
10
0
08 Oct 2022
On Task-Adaptive Pretraining for Dialogue Response Selection
Tzu-Hsiang Lin
Ta-Chung Chi
Anna Rumshisky
60
1
0
08 Oct 2022
Generative Language Models for Paragraph-Level Question Generation
Asahi Ushio
Fernando Alva-Manchego
Jose Camacho-Collados
ELM
63
48
0
08 Oct 2022
ConstGCN: Constrained Transmission-based Graph Convolutional Networks for Document-level Relation Extraction
Ji Qi
Bin Xu
Kaisheng Zeng
Jinxin Liu
Jifan Yu
Qifang Gao
Juanzi Li
Lei Hou
GNN
83
1
0
08 Oct 2022
Short Text Pre-training with Extended Token Classification for E-commerce Query Understanding
Haoming Jiang
Tianyu Cao
Zheng Li
Cheng-hsin Luo
Xianfeng Tang
Qingyu Yin
Danqing Zhang
R. Goutam
Bing Yin
RALM
74
12
0
08 Oct 2022
ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering
Zhiyu Zoey Chen
Shiyang Li
Charese Smiley
Zhiqiang Ma
Sameena Shah
William Yang Wang
AIMat LRM AI4CE
152
116
0
07 Oct 2022
See, Plan, Predict: Language-guided Cognitive Planning with Video Prediction
Maria Attarian
Advaya Gupta
Ziyi Zhou
Wei Yu
Igor Gilitschenski
Animesh Garg
LM&Ro
79
8
0
07 Oct 2022
Named Entity Recognition in Twitter: A Dataset and Analysis on Short-Term Temporal Shifts
Asahi Ushio
Leonardo Neves
Vítor Silva
Francesco Barbieri
Jose Camacho-Collados
100
30
0
07 Oct 2022
Artificial Intelligence and Natural Language Processing and Understanding in Space: A Methodological Framework and Four ESA Case Studies
José Manuél Gómez-Pérez
Andrés García-Silva
R. Leone
M. Albani
Moritz Fontaine
C. Poncet
L. Summerer
A. Donati
Ilaria Roma
Stefano Scaglioni
71
1
0
07 Oct 2022
Understanding Transformer Memorization Recall Through Idioms
Adi Haviv
Ido Cohen
Jacob Gidron
R. Schuster
Yoav Goldberg
Mor Geva
112
53
0
07 Oct 2022
Are Representations Built from the Ground Up? An Empirical Examination of Local Composition in Language Models
Emmy Liu
Graham Neubig
CoGe
62
11
0
07 Oct 2022
How Large Language Models are Transforming Machine-Paraphrased Plagiarism
Jan Philip Wahle
Terry Ruas
Frederic Kirstein
Bela Gipp
77
35
0
07 Oct 2022
DABERT: Dual Attention Enhanced BERT for Semantic Matching
Sirui Wang
Di Liang
Jian Song
Yun Li
Wei Wu
93
18
0
07 Oct 2022
SpaceQA: Answering Questions about the Design of Space Missions and Space Craft Concepts
Andrés García-Silva
Cristian Berrío
José Manuél Gómez-Pérez
J. Martínez-Heras
A. Donati
Ilaria Roma
57
6
0
07 Oct 2022
Event Extraction: A Survey
Viet Dac Lai
103
9
0
07 Oct 2022
Pre-trained Adversarial Perturbations
Y. Ban
Yinpeng Dong
AAML
108
24
0
07 Oct 2022
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
Kenton Lee
Mandar Joshi
Iulia Turc
Hexiang Hu
Fangyu Liu
Julian Martin Eisenschlos
Urvashi Khandelwal
Peter Shaw
Ming-Wei Chang
Kristina Toutanova
CLIP VLM
310
280
0
07 Oct 2022
A Unified Framework for Multi-intent Spoken Language Understanding with prompting
Feifan Song
Lianzhe Huang
Houfeng Wang
56
3
0
07 Oct 2022
CAT-probing: A Metric-based Approach to Interpret How Pre-trained Models for Programming Language Attend Code Structure
Nuo Chen
Qiushi Sun
Renyu Zhu
Xiang Li
Xuesong Lu
Ming Gao
80
10
0
07 Oct 2022
Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal Negation
Thinh Hung Truong
Yulia Otmakhova
Tim Baldwin
Trevor Cohn
Jey Han Lau
Karin Verspoor
120
24
0
06 Oct 2022
Unsupervised Domain Adaptation for COVID-19 Information Service with Contrastive Adversarial Domain Mixup
Huimin Zeng
Zhenrui Yue
Ziyi Kou
Lanyu Shang
Yang Zhang
Dong Wang
SSL
74
6
0
06 Oct 2022
Improving Large-scale Paraphrase Acquisition and Generation
Yao Dou
Chao Jiang
Wei Xu
99
9
0
06 Oct 2022
FAST: Improving Controllability for Text Generation with Feedback Aware Self-Training
Junyi Chai
Reid Pryzant
Victor Ye Dong
Konstantin Golobokov
Chenguang Zhu
Yi Liu
95
5
0
06 Oct 2022
Adaptive Ranking-based Sample Selection for Weakly Supervised Class-imbalanced Text Classification
Linxin Song
Jieyu Zhang
Tianxiang Yang
M. Goto
77
4
0
06 Oct 2022
Explainable Verbal Deception Detection using Transformers
Loukas Ilias
Felix Soldner
Bennett Kleinberg
24
5
0
06 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
302
100
0
06 Oct 2022
Detecting Narrative Elements in Informational Text
Effi Levi
Guy Mor
Tamir Sheafer
Shaul R. Shenhav
439
7
0
06 Oct 2022
BootAug: Boosting Text Augmentation via Hybrid Instance Filtering Framework
Heng Yang
Ke Li
107
6
0
06 Oct 2022