RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 10,707 papers shown
Title
Logic-Guided Data Augmentation and Regularization for Consistent Question Answering
Akari Asai
Hannaneh Hajishirzi
NAI
104
117
0
21 Apr 2020
The Ivory Tower Lost: How College Students Respond Differently than the General Public to the COVID-19 Pandemic
Viet-An Duong
Phu Pham
Tongyu Yang
Yu Wang
Jiebo Luo
AI4CE
43
94
0
21 Apr 2020
Train No Evil: Selective Masking for Task-Guided Pre-Training
Yuxian Gu
Zhengyan Zhang
Xiaozhi Wang
Zhiyuan Liu
Maosong Sun
141
59
0
21 Apr 2020
StereoSet: Measuring stereotypical bias in pretrained language models
Moin Nadeem
Anna Bethke
Siva Reddy
103
1,026
0
20 Apr 2020
MPNet: Masked and Permuted Pre-training for Language Understanding
Kaitao Song
Xu Tan
Tao Qin
Jianfeng Lu
Tie-Yan Liu
111
1,142
0
20 Apr 2020
Adversarial Training for Large Neural Language Models
Xiaodong Liu
Hao Cheng
Pengcheng He
Weizhu Chen
Yu Wang
Hoifung Poon
Jianfeng Gao
AAML
83
186
0
20 Apr 2020
Fine-tuning Multi-hop Question Answering with Hierarchical Graph Network
Guanming Xiong
123
0
0
20 Apr 2020
Extractive Summarization as Text Matching
Ming Zhong
Pengfei Liu
Yiran Chen
Danqing Wang
Xipeng Qiu
Xuanjing Huang
150
462
0
19 Apr 2020
Are we pretraining it right? Digging deeper into visio-linguistic pretraining
Amanpreet Singh
Vedanuj Goswami
Devi Parikh
VLM
78
48
0
19 Apr 2020
SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings
Masoud Jalili Sabet
Philipp Dufter
François Yvon
Hinrich Schütze
113
238
0
18 Apr 2020
ETC: Encoding Long and Structured Inputs in Transformers
Joshua Ainslie
Santiago Ontanon
Chris Alberti
Vaclav Cvicek
Zachary Kenneth Fisher
Philip Pham
Anirudh Ravula
Sumit Sanghai
Qifan Wang
Li Yang
75
55
0
17 Apr 2020
Augmented Curation of Unstructured Clinical Notes from a Massive EHR System Reveals Specific Phenotypic Signature of Impending COVID-19 Diagnosis
F. Shweta
K. Murugadoss
S. Awasthi
A. Venkatakrishnan
Arjun Puranik
...
G. Gores
A. Williams
J. Halamka
V. Soundararajan
A. Badley
53
26
0
17 Apr 2020
Transform and Tell: Entity-Aware News Image Captioning
Alasdair Tran
A. Mathews
Lexing Xie
VLM
60
97
0
17 Apr 2020
A Survey of Document Grounded Dialogue Systems (DGDS)
Longxuan Ma
Weinan Zhang
Mingda Li
Ting Liu
75
19
0
17 Apr 2020
Training with Quantization Noise for Extreme Model Compression
Angela Fan
Pierre Stock
Benjamin Graham
Edouard Grave
Remi Gribonval
Hervé Jégou
Armand Joulin
MQ
111
246
0
15 Apr 2020
SPECTER: Document-level Representation Learning using Citation-informed Transformers
Arman Cohan
Sergey Feldman
Iz Beltagy
Doug Downey
Daniel S. Weld
AI4TS
126
561
0
15 Apr 2020
TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue
Chien-Sheng Wu
Guosheng Lin
R. Socher
Caiming Xiong
104
324
0
15 Apr 2020
Coreferential Reasoning Learning for Language Representation
Deming Ye
Yankai Lin
Jiaju Du
Zhenghao Liu
Peng Li
Maosong Sun
Zhiyuan Liu
87
179
0
15 Apr 2020
A Simple Yet Strong Pipeline for HotpotQA
Dirk Groeneveld
Tushar Khot
Mausam
Ashish Sabharwal
73
40
0
14 Apr 2020
Robustly Pre-trained Neural Model for Direct Temporal Relation Extraction
Hong Guan
Jianfu Li
Hua Xu
M. Devarakonda
15
11
0
13 Apr 2020
Pretrained Transformers Improve Out-of-Distribution Robustness
Dan Hendrycks
Xiaoyuan Liu
Eric Wallace
Adam Dziedzic
R. Krishnan
Basel Alomair
OOD
216
436
0
13 Apr 2020
CLUE: A Chinese Language Understanding Evaluation Benchmark
Liang Xu
Hai Hu
Xuanwei Zhang
Lu Li
Chenjie Cao
...
Cong Yue
Xinrui Zhang
Zhen-Yi Yang
Kyle Richardson
Zhenzhong Lan
ELM
110
388
0
13 Apr 2020
Frequency-Guided Word Substitutions for Detecting Textual Adversarial Examples
Maximilian Mozes
Pontus Stenetorp
Bennett Kleinberg
Lewis D. Griffin
AAML
185
103
0
13 Apr 2020
From Machine Reading Comprehension to Dialogue State Tracking: Bridging the Gap
Shuyang Gao
Sanchit Agarwal
Tagyoung Chung
Di Jin
Dilek Z. Hakkani-Tür
106
71
0
13 Apr 2020
ProFormer: Towards On-Device LSH Projection Based Transformers
Chinnadhurai Sankar
Sujith Ravi
Zornitsa Kozareva
67
9
0
13 Apr 2020
Explaining Question Answering Models through Text Generation
Veronica Latcinnik
Jonathan Berant
LRM
96
51
0
12 Apr 2020
Unsupervised Commonsense Question Answering with Self-Talk
Vered Shwartz
Peter West
Ronan Le Bras
Chandra Bhagavatula
Yejin Choi
ReLM, SSL, AI4MH, LRM
72
263
0
11 Apr 2020
Longformer: The Long-Document Transformer
Iz Beltagy
Matthew E. Peters
Arman Cohan
RALM, VLM
210
4,109
0
10 Apr 2020
Designing Precise and Robust Dialogue Response Evaluators
Tianyu Zhao
Divesh Lala
Tatsuya Kawahara
57
53
0
10 Apr 2020
Translation Artifacts in Cross-lingual Transfer Learning
Mikel Artetxe
Gorka Labaka
Eneko Agirre
65
121
0
09 Apr 2020
BLEURT: Learning Robust Metrics for Text Generation
Thibault Sellam
Dipanjan Das
Ankur P. Parikh
134
1,508
0
09 Apr 2020
MuTual: A Dataset for Multi-Turn Dialogue Reasoning
Leyang Cui
Yu-Huan Wu
Shujie Liu
Yue Zhang
Ming Zhou
LRM
67
152
0
09 Apr 2020
Injecting Numerical Reasoning Skills into Language Models
Mor Geva
Ankit Gupta
Jonathan Berant
AIMat, LRM
93
227
0
09 Apr 2020
Improving Readability for Automatic Speech Recognition Transcription
Junwei Liao
Sefik Emre Eskimez
Liyang Lu
Yu Shi
Ming Gong
Linjun Shou
Hong Qu
Michael Zeng
67
56
0
09 Apr 2020
DynaBERT: Dynamic BERT with Adaptive Width and Depth
Lu Hou
Zhiqi Huang
Lifeng Shang
Xin Jiang
Xiao Chen
Qun Liu
MQ
91
323
0
08 Apr 2020
Have Your Text and Use It Too! End-to-End Neural Data-to-Text Generation with Semantic Fidelity
Hamza Harkous
Isabel Groves
Amir Saffari
86
89
0
08 Apr 2020
Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning
Zhaojiang Lin
Andrea Madotto
Pascale Fung
105
163
0
08 Apr 2020
Improving BERT with Self-Supervised Attention
Yiren Chen
Xiaoyu Kou
Jiangang Bai
Yunhai Tong
30
10
0
08 Apr 2020
CALM: Continuous Adaptive Learning for Language Modeling
Kristjan Arumae
Parminder Bhatia
CLL, KELM
29
6
0
08 Apr 2020
Downstream Model Design of Pre-trained Language Model for Relation Extraction Task
Cheng-rong Li
Ye Tian
72
36
0
08 Apr 2020
DialBERT: A Hierarchical Pre-Trained Model for Conversation Disentanglement
Tianda Li
Jia-Chen Gu
Xiao-Dan Zhu
Quan Liu
Zhenhua Ling
Zhiming Su
Si Wei
70
28
0
08 Apr 2020
Byte Pair Encoding is Suboptimal for Language Model Pretraining
Kaj Bostrom
Greg Durrett
69
214
0
07 Apr 2020
What do Models Learn from Question Answering Datasets?
Priyanka Sen
Amir Saffari
RALM, ELM
80
75
0
07 Apr 2020
A Few Topical Tweets are Enough for Effective User-Level Stance Detection
Younes Samih
Kareem Darwish
29
7
0
07 Apr 2020
QuantNet: Transferring Learning Across Systematic Trading Strategies
Adriano Soares Koshiyama
Sebastian Flennerhag
Stefano B. Blumberg
Nikan B. Firoozye
Philip C. Treleaven
AIFin, MQ
86
9
0
07 Apr 2020
KorNLI and KorSTS: New Benchmark Datasets for Korean Natural Language Understanding
Jiyeon Ham
Yo Joong Choe
Kyubyong Park
Ilji Choi
Hyungjoon Soh
67
78
0
07 Apr 2020
RYANSQL: Recursively Applying Sketch-based Slot Fillings for Complex Text-to-SQL in Cross-Domain Databases
Donghyun Choi
M. Shin
EungGyun Kim
Dong Ryeol Shin
89
130
0
07 Apr 2020
A Sentence Cloze Dataset for Chinese Machine Reading Comprehension
Yiming Cui
Ting Liu
Ziqing Yang
Zhipeng Chen
Wentao Ma
Wanxiang Che
Shijin Wang
Guoping Hu
73
19
0
07 Apr 2020
Knowledge Fusion and Semantic Knowledge Ranking for Open Domain Question Answering
Pratyay Banerjee
Chitta Baral
RALM
99
24
0
07 Apr 2020
Is Graph Structure Necessary for Multi-hop Question Answering?
Nan Shao
Yiming Cui
Ting Liu
Shijin Wang
Guoping Hu
GNN
87
16
0
07 Apr 2020