ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1907.11692
  4. Cited By
RoBERTa: A Robustly Optimized BERT Pretraining Approach

RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat
ArXivPDFHTML

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 4,654 papers shown
Title
Leveraging Multi-Source Weak Social Supervision for Early Detection of
  Fake News
Leveraging Multi-Source Weak Social Supervision for Early Detection of Fake News
Kai Shu
Guoqing Zheng
Yichuan Li
Subhabrata Mukherjee
Ahmed Hassan Awadallah
Scott W. Ruston
Huan Liu
32
54
0
03 Apr 2020
XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training,
  Understanding and Generation
XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation
Yaobo Liang
Nan Duan
Yeyun Gong
Ning Wu
Fenfei Guo
...
Shuguang Liu
Fan Yang
Daniel Fernando Campos
Rangan Majumder
Ming Zhou
ELM
VLM
63
342
0
03 Apr 2020
Deep Entity Matching with Pre-Trained Language Models
Deep Entity Matching with Pre-Trained Language Models
Yuliang Li
Jinfeng Li
Yoshihiko Suhara
A. Doan
W. Tan
VLM
28
373
0
01 Apr 2020
TNT-KID: Transformer-based Neural Tagger for Keyword Identification
TNT-KID: Transformer-based Neural Tagger for Keyword Identification
Matej Martinc
Blaž Škrlj
Senja Pollak
24
37
0
20 Mar 2020
Enhancing Factual Consistency of Abstractive Summarization
Enhancing Factual Consistency of Abstractive Summarization
Chenguang Zhu
William Fu-Hinthorn
Ruochen Xu
Qingkai Zeng
Michael Zeng
Xuedong Huang
Meng Jiang
HILM
KELM
193
40
0
19 Mar 2020
Pre-trained Models for Natural Language Processing: A Survey
Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu
Tianxiang Sun
Yige Xu
Yunfan Shao
Ning Dai
Xuanjing Huang
LM&MA
VLM
246
1,454
0
18 Mar 2020
Transformer Networks for Trajectory Forecasting
Transformer Networks for Trajectory Forecasting
Francesco Giuliari
Irtiza Hasan
Marco Cristani
Fabio Galasso
113
372
0
18 Mar 2020
A Survey on Contextual Embeddings
A Survey on Contextual Embeddings
Qi Liu
Matt J. Kusner
Phil Blunsom
225
146
0
16 Mar 2020
TRANS-BLSTM: Transformer with Bidirectional LSTM for Language
  Understanding
TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding
Zhiheng Huang
Peng Xu
Davis Liang
Ajay K. Mishra
Bing Xiang
15
31
0
16 Mar 2020
Finnish Language Modeling with Deep Transformer Models
Finnish Language Modeling with Deep Transformer Models
Abhilash Jain
Aku Rouhe
Stig-Arne Gronroos
M. Kurimo
14
0
0
14 Mar 2020
Know thy corpus! Robust methods for digital curation of Web corpora
Know thy corpus! Robust methods for digital curation of Web corpora
S. Sharoff
19
8
0
13 Mar 2020
Learning to Encode Position for Transformer with Continuous Dynamical
  Model
Learning to Encode Position for Transformer with Continuous Dynamical Model
Xuanqing Liu
Hsiang-Fu Yu
Inderjit Dhillon
Cho-Jui Hsieh
16
107
0
13 Mar 2020
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video
  Captioning
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning
Zhiyuan Fang
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
23
60
0
11 Mar 2020
Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual
  Lexical Semantic Similarity
Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity
Ivan Vulić
Simon Baker
Edoardo Ponti
Ulla Petti
Ira Leviant
...
Eden Bar
Matt Malone
Thierry Poibeau
Roi Reichart
Anna Korhonen
21
82
0
10 Mar 2020
Efficient Intent Detection with Dual Sentence Encoders
Efficient Intent Detection with Dual Sentence Encoders
I. Casanueva
Tadas Temvcinas
D. Gerz
Matthew Henderson
Ivan Vulić
VLM
180
452
0
10 Mar 2020
Neuro-symbolic Architectures for Context Understanding
Neuro-symbolic Architectures for Context Understanding
A. Oltramari
Jonathan M Francis
C. Henson
Kaixin Ma
Ruwan Wickramarachchi
NAI
AI4CE
25
28
0
09 Mar 2020
Sensitive Data Detection and Classification in Spanish Clinical Text:
  Experiments with BERT
Sensitive Data Detection and Classification in Spanish Clinical Text: Experiments with BERT
Aitor García-Pablos
Naiara Pérez
Montse Cuadros
37
34
0
06 Mar 2020
HypoNLI: Exploring the Artificial Patterns of Hypothesis-only Bias in
  Natural Language Inference
HypoNLI: Exploring the Artificial Patterns of Hypothesis-only Bias in Natural Language Inference
Tianyu Liu
Xin Zheng
Baobao Chang
Zhifang Sui
53
23
0
05 Mar 2020
Kleister: A novel task for Information Extraction involving Long
  Documents with Complex Layout
Kleister: A novel task for Information Extraction involving Long Documents with Complex Layout
Filip Graliñski
Tomasz Stanislawek
Anna Wróblewska
Dawid Lipiñski
Agnieszka Kaliska
Paulina Rosalska
Bartosz Topolski
P. Biecek
30
40
0
04 Mar 2020
A Study on Efficiency, Accuracy and Document Structure for Answer
  Sentence Selection
A Study on Efficiency, Accuracy and Document Structure for Answer Sentence Selection
Daniele Bonadiman
Alessandro Moschitti
RALM
23
10
0
04 Mar 2020
jiant: A Software Toolkit for Research on General-Purpose Text
  Understanding Models
jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models
Yada Pruksachatkun
Philip Yeres
Haokun Liu
Jason Phang
Phu Mon Htut
Alex Jinpeng Wang
Ian Tenney
Samuel R. Bowman
SSeg
14
94
0
04 Mar 2020
Deep Multi-Modal Sets
Deep Multi-Modal Sets
A. Reiter
Menglin Jia
Pu Yang
Ser-Nam Lim
BDL
25
4
0
03 Mar 2020
CLUECorpus2020: A Large-scale Chinese Corpus for Pre-training Language
  Model
CLUECorpus2020: A Large-scale Chinese Corpus for Pre-training Language Model
Liang Xu
Xuanwei Zhang
Qianqian Dong
SSL
19
70
0
03 Mar 2020
Med7: a transferable clinical natural language processing model for
  electronic health records
Med7: a transferable clinical natural language processing model for electronic health records
Andrey Kormilitzin
N. Vaci
Qiang Liu
A. Nevado-Holgado
22
115
0
03 Mar 2020
PhoBERT: Pre-trained language models for Vietnamese
PhoBERT: Pre-trained language models for Vietnamese
Dat Quoc Nguyen
A. Nguyen
174
343
0
02 Mar 2020
AraBERT: Transformer-based Model for Arabic Language Understanding
AraBERT: Transformer-based Model for Arabic Language Understanding
Wissam Antoun
Fady Baly
Hazem M. Hajj
46
950
0
28 Feb 2020
UniLMv2: Pseudo-Masked Language Models for Unified Language Model
  Pre-Training
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
Hangbo Bao
Li Dong
Furu Wei
Wenhui Wang
Nan Yang
...
Yu-Chiang Frank Wang
Songhao Piao
Jianfeng Gao
Ming Zhou
H. Hon
AI4CE
44
392
0
28 Feb 2020
Learning Representations by Predicting Bags of Visual Words
Learning Representations by Predicting Bags of Visual Words
Spyros Gidaris
Andrei Bursuc
N. Komodakis
P. Pérez
Matthieu Cord
SSL
28
117
0
27 Feb 2020
Multi-task Learning with Multi-head Attention for Multi-choice Reading
  Comprehension
Multi-task Learning with Multi-head Attention for Multi-choice Reading Comprehension
H. Wan
20
13
0
26 Feb 2020
On Feature Normalization and Data Augmentation
On Feature Normalization and Data Augmentation
Boyi Li
Felix Wu
Ser-Nam Lim
Serge J. Belongie
Kilian Q. Weinberger
21
134
0
25 Feb 2020
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression
  of Pre-Trained Transformers
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
Wenhui Wang
Furu Wei
Li Dong
Hangbo Bao
Nan Yang
Ming Zhou
VLM
47
1,214
0
25 Feb 2020
Training Question Answering Models From Synthetic Data
Training Question Answering Models From Synthetic Data
Raul Puri
Ryan Spring
M. Patwary
M. Shoeybi
Bryan Catanzaro
ELM
24
159
0
22 Feb 2020
LAMBERT: Layout-Aware (Language) Modeling for information extraction
LAMBERT: Layout-Aware (Language) Modeling for information extraction
Lukasz Garncarek
Rafal Powalski
Tomasz Stanislawek
Bartosz Topolski
Piotr Halama
M. Turski
Filip Graliñski
8
87
0
19 Feb 2020
From English To Foreign Languages: Transferring Pre-trained Language
  Models
From English To Foreign Languages: Transferring Pre-trained Language Models
Ke M. Tran
30
49
0
18 Feb 2020
A Financial Service Chatbot based on Deep Bidirectional Transformers
A Financial Service Chatbot based on Deep Bidirectional Transformers
S. Yu
Yuxin Chen
Hussain Zaidi
25
33
0
17 Feb 2020
Robustness Verification for Transformers
Robustness Verification for Transformers
Zhouxing Shi
Huan Zhang
Kai-Wei Chang
Minlie Huang
Cho-Jui Hsieh
AAML
24
105
0
16 Feb 2020
Stress Test Evaluation of Transformer-based Models in Natural Language
  Understanding Tasks
Stress Test Evaluation of Transformer-based Models in Natural Language Understanding Tasks
Carlos Aspillaga
Andrés Carvallo
Vladimir Araujo
ELM
47
31
0
14 Feb 2020
FQuAD: French Question Answering Dataset
FQuAD: French Question Answering Dataset
Martin d'Hoffschmidt
Wacim Belblidia
Tom Brendlé
Quentin Heinrich
Maxime Vidal
26
98
0
14 Feb 2020
LaProp: Separating Momentum and Adaptivity in Adam
LaProp: Separating Momentum and Adaptivity in Adam
Liu Ziyin
Zhikang T.Wang
Masahito Ueda
ODL
13
18
0
12 Feb 2020
On Layer Normalization in the Transformer Architecture
On Layer Normalization in the Transformer Architecture
Ruibin Xiong
Yunchang Yang
Di He
Kai Zheng
Shuxin Zheng
Chen Xing
Huishuai Zhang
Yanyan Lan
Liwei Wang
Tie-Yan Liu
AI4CE
35
949
0
12 Feb 2020
Feature Importance Estimation with Self-Attention Networks
Feature Importance Estimation with Self-Attention Networks
Blaž Škrlj
S. Džeroski
Nada Lavrac
Matej Petković
FAtt
MILM
34
51
0
11 Feb 2020
ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning
ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning
Weihao Yu
Zihang Jiang
Yanfei Dong
Jiashi Feng
LRM
25
245
0
11 Feb 2020
Adversarial Filters of Dataset Biases
Adversarial Filters of Dataset Biases
Ronan Le Bras
Swabha Swayamdipta
Chandra Bhagavatula
Rowan Zellers
Matthew E. Peters
Ashish Sabharwal
Yejin Choi
36
220
0
10 Feb 2020
REALM: Retrieval-Augmented Language Model Pre-Training
REALM: Retrieval-Augmented Language Model Pre-Training
Kelvin Guu
Kenton Lee
Zora Tung
Panupong Pasupat
Ming-Wei Chang
RALM
48
2,006
0
10 Feb 2020
Towards Crowdsourced Training of Large Neural Networks using
  Decentralized Mixture-of-Experts
Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts
Max Ryabinin
Anton I. Gusev
FedML
27
48
0
10 Feb 2020
Pre-training Tasks for Embedding-based Large-scale Retrieval
Pre-training Tasks for Embedding-based Large-scale Retrieval
Wei-Cheng Chang
Felix X. Yu
Yin-Wen Chang
Yiming Yang
Sanjiv Kumar
RALM
13
301
0
10 Feb 2020
Segmented Graph-Bert for Graph Instance Modeling
Segmented Graph-Bert for Graph Instance Modeling
Jiawei Zhang
SSeg
25
6
0
09 Feb 2020
MA-DST: Multi-Attention Based Scalable Dialog State Tracking
MA-DST: Multi-Attention Based Scalable Dialog State Tracking
Adarsh Kumar
Peter Ku
Anuj Kumar Goyal
A. Metallinou
Dilek Z. Hakkani-Tür
27
58
0
07 Feb 2020
perm2vec: Graph Permutation Selection for Decoding of Error Correction
  Codes using Self-Attention
perm2vec: Graph Permutation Selection for Decoding of Error Correction Codes using Self-Attention
Nir Raviv
Avi Caciularu
Tomer Raviv
Jacob Goldberger
Yair Be’ery
23
8
0
06 Feb 2020
K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters
K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters
Ruize Wang
Duyu Tang
Nan Duan
Zhongyu Wei
Xuanjing Huang
Jianshu Ji
Guihong Cao
Daxin Jiang
Ming Zhou
KELM
48
545
0
05 Feb 2020
Previous
123...9091929394
Next