RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 4,880 papers shown
A Bayesian Framework for Information-Theoretic Probing
Tiago Pimentel
Ryan Cotterell
33
24
0
08 Sep 2021
Highly Parallel Autoregressive Entity Linking with Discriminative Correction
Nicola De Cao
Wilker Aziz
Ivan Titov
60
36
0
08 Sep 2021
Biomedical and Clinical Language Models for Spanish: On the Benefits of Domain-Specific Pretraining in a Mid-Resource Scenario
C. Carrino
Jordi Armengol-Estapé
Asier Gutiérrez-Fandiño
Joan Llop-Palao
Marc Pàmies
Aitor Gonzalez-Agirre
Marta Villegas
21
44
0
08 Sep 2021
NSP-BERT: A Prompt-based Few-Shot Learner Through an Original Pre-training Task--Next Sentence Prediction
Yi Sun
Yu Zheng
Chao Hao
Hangping Qiu
VLM
46
37
0
08 Sep 2021
On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets
Cheng-Han Chiang
Hung-yi Lee
SyDa
40
25
0
08 Sep 2021
ArchivalQA: A Large-scale Benchmark Dataset for Open Domain Question Answering over Historical News Collections
Jiexin Wang
Adam Jatowt
Masatoshi Yoshikawa
40
33
0
08 Sep 2021
Self-supervised Contrastive Cross-Modality Representation Learning for Spoken Question Answering
Chenyu You
Nuo Chen
Yuexian Zou
SSL
27
63
0
08 Sep 2021
On the Challenges of Evaluating Compositional Explanations in Multi-Hop Inference: Relevance, Completeness, and Expert Ratings
Peter Alexander Jansen
Kelly Smith
Dan Moreno
Huitzilin Ortiz
CoGe
ReLM
LRM
38
10
0
07 Sep 2021
Beyond Preserved Accuracy: Evaluating Loyalty and Robustness of BERT Compression
Canwen Xu
Wangchunshu Zhou
Tao Ge
Kelvin J. Xu
Julian McAuley
Furu Wei
21
41
0
07 Sep 2021
NumGPT: Improving Numeracy Ability of Generative Pre-trained Models
Zhihua Jin
Xin Jiang
Xingbo Wang
Qun Liu
Yong Wang
Xiaozhe Ren
Huamin Qu
19
19
0
07 Sep 2021
GOLD: Improving Out-of-Scope Detection in Dialogues using Data Augmentation
Derek Chen
Zhou Yu
32
31
0
07 Sep 2021
Sequential Attention Module for Natural Language Processing
Mengyuan Zhou
Jian Ma
Haiqing Yang
Lian-Xin Jiang
Yang Mo
AI4TS
27
2
0
07 Sep 2021
FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting
R. Liu
Hanming Deng
Yangyi Huang
Xiaoyu Shi
Lewei Lu
Wenxiu Sun
Xiaogang Wang
Jifeng Dai
Hongsheng Li
ViT
30
124
0
07 Sep 2021
Paraphrase Generation as Unsupervised Machine Translation
Xiaofei Sun
Yufei Tian
Yuxian Meng
Nanyun Peng
Fei Wu
Jiwei Li
Chun Fan
LRM
27
5
0
07 Sep 2021
Detecting Inspiring Content on Social Media
Oana Ignat
Y-Lan Boureau
Jane A. Yu
A. Halevy
24
6
0
06 Sep 2021
SS-BERT: Mitigating Identity Terms Bias in Toxic Comment Classification by Utilising the Notion of "Subjectivity" and "Identity Terms"
Zhixue Zhao
Ziqi Zhang
F. Hopfgartner
26
5
0
06 Sep 2021
Enhancing Natural Language Representation with Large-Scale Out-of-Domain Commonsense
Wanyun Cui
Xingran Chen
22
6
0
06 Sep 2021
Proto: A Neural Cocktail for Generating Appealing Conversations
Sougata Saha
Souvik Das
Elizabeth Soper
Erin Pacquetet
Rohini Srihari
26
12
0
06 Sep 2021
Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization
Tiezheng Yu
Wenliang Dai
Zihan Liu
Pascale Fung
37
73
0
06 Sep 2021
Sent2Span: Span Detection for PICO Extraction in the Biomedical Text without Span Annotations
Shifeng Liu
Yifang Sun
Bing Li
Wei Wang
Florence T. Bourgeois
A. Dunn
19
14
0
06 Sep 2021
End-to-End Self-Debiasing Framework for Robust NLU Training
Abbas Ghaddar
Philippe Langlais
Mehdi Rezagholizadeh
Ahmad Rashid
UQCV
34
36
0
05 Sep 2021
FewshotQA: A simple framework for few-shot learning of question answering tasks using pre-trained text-to-text models
Rakesh Chada
P. Natarajan
36
45
0
04 Sep 2021
Frustratingly Simple Pretraining Alternatives to Masked Language Modeling
Atsuki Yamaguchi
G. Chrysostomou
Katerina Margatina
Nikolaos Aletras
27
25
0
04 Sep 2021
Error Detection in Large-Scale Natural Language Understanding Systems Using Transformer Models
Rakesh Chada
P. Natarajan
Darshan Fofadiya
Prathap Ramachandra
33
6
0
04 Sep 2021
CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge
Yasumasa Onoe
Michael J.Q. Zhang
Eunsol Choi
Greg Durrett
HILM
42
85
0
03 Sep 2021
Do Prompt-Based Models Really Understand the Meaning of their Prompts?
Albert Webson
Ellie Pavlick
LRM
64
355
0
02 Sep 2021
So Cloze yet so Far: N400 Amplitude is Better Predicted by Distributional Information than Human Predictability Judgements
J. Michaelov
S. Coulson
Benjamin Bergen
24
44
0
02 Sep 2021
MultiEURLEX -- A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer
Ilias Chalkidis
Manos Fergadiotis
Ion Androutsopoulos
AILaw
32
108
0
02 Sep 2021
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
Yue Wang
Weishi Wang
Chenyu You
Guosheng Lin
252
1,515
0
02 Sep 2021
Imposing Relation Structure in Language-Model Embeddings Using Contrastive Learning
Christos Theodoropoulos
James Henderson
Andrei Catalin Coman
Marie-Francine Moens
27
15
0
02 Sep 2021
MWPToolkit: An Open-Source Framework for Deep Learning-Based Math Word Problem Solvers
Yihuai Lan
Lei Wang
Qiyuan Zhang
Yunshi Lan
B. Dai
Yan Wang
Dongxiang Zhang
Ee-Peng Lim
AIMat
30
72
0
02 Sep 2021
Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond
Amir Feder
Katherine A. Keith
Emaad A. Manzoor
Reid Pryzant
Dhanya Sridhar
...
Roi Reichart
Margaret E. Roberts
Brandon M Stewart
Victor Veitch
Diyi Yang
CML
46
235
0
02 Sep 2021
CTAL: Pre-training Cross-modal Transformer for Audio-and-Language Representations
Hang Li
Yunxing Kang
Tianqiao Liu
Wenbiao Ding
Zitao Liu
41
17
0
01 Sep 2021
FinQA: A Dataset of Numerical Reasoning over Financial Data
Zhiyu Chen
Wenhu Chen
Charese Smiley
Sameena Shah
Iana Borova
...
Reema N Moussa
Matthew I. Beane
Ting-Hao 'Kenneth' Huang
Bryan R. Routledge
Wenjie Wang
AIMat
42
302
0
01 Sep 2021
It's not Rocket Science : Interpreting Figurative Language in Narratives
Tuhin Chakrabarty
Yejin Choi
Vered Shwartz
24
55
0
31 Aug 2021
Interactive Machine Comprehension with Dynamic Knowledge Graphs
Xingdi Yuan
36
3
0
31 Aug 2021
Sentence Bottleneck Autoencoders from Transformer Language Models
Ivan Montero
Nikolaos Pappas
Noah A. Smith
AI4CE
25
28
0
31 Aug 2021
Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning
Linyang Li
Demin Song
Xiaonan Li
Jiehang Zeng
Ruotian Ma
Xipeng Qiu
33
136
0
31 Aug 2021
Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience
G. Chrysostomou
Nikolaos Aletras
37
16
0
31 Aug 2021
Plan-then-Generate: Controlled Data-to-Text Generation via Planning
Yixuan Su
David Vandyke
Sihui Wang
Yimai Fang
Nigel Collier
36
80
0
31 Aug 2021
Discretized Integrated Gradients for Explaining Language Models
Soumya Sanyal
Xiang Ren
FAtt
22
53
0
31 Aug 2021
T3-Vis: a visual analytic framework for Training and fine-Tuning Transformers in NLP
Raymond Li
Wen Xiao
Lanjun Wang
Hyeju Jang
Giuseppe Carenini
ViT
31
23
0
31 Aug 2021
Semi-Supervised Exaggeration Detection of Health Science Press Releases
Dustin Wright
Isabelle Augenstein
43
12
0
30 Aug 2021
N24News: A New Dataset for Multimodal News Classification
Zhen Wang
Xu Shan
Xiangxie Zhang
Jie Yang
VLM
31
34
0
30 Aug 2021
Generating Answer Candidates for Quizzes and Answer-Aware Question Generators
Kristiyan Vachev
Momchil Hardalov
Georgi Karadzhov
Georgi Georgiev
Ivan Koychev
Preslav Nakov
AI4Ed
31
5
0
29 Aug 2021
Student Surpasses Teacher: Imitation Attack for Black-Box NLP APIs
Qiongkai Xu
Xuanli He
Lingjuan Lyu
Lizhen Qu
Gholamreza Haffari
MLAU
47
22
0
29 Aug 2021
Sentence Structure and Word Relationship Modeling for Emphasis Selection
Haoran Yang
Wai Lam
18
0
0
29 Aug 2021
WALNUT: A Benchmark on Semi-weakly Supervised Learning for Natural Language Understanding
Guoqing Zheng
Giannis Karamanolakis
Kai Shu
Ahmed Hassan Awadallah
SSL
21
1
0
28 Aug 2021
Layer-wise Model Pruning based on Mutual Information
Chun Fan
Jiwei Li
Xiang Ao
Fei Wu
Yuxian Meng
Xiaofei Sun
53
19
0
28 Aug 2021
Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems
Fei Mi
Wanhao Zhou
Feng Cai
Lingjing Kong
Minlie Huang
Boi Faltings
29
32
0
28 Aug 2021