ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1907.11692
  4. Cited By
RoBERTa: A Robustly Optimized BERT Pretraining Approach

RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat
ArXivPDFHTML

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 4,740 papers shown
Title
Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason
  Over Implicit Knowledge
Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge
Alon Talmor
Oyvind Tafjord
Peter Clark
Yoav Goldberg
Jonathan Berant
ReLM
LRM
36
39
0
11 Jun 2020
CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot
  Cross-Lingual NLP
CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP
Libo Qin
Minheng Ni
Yue Zhang
Wanxiang Che
40
149
0
11 Jun 2020
A Monolingual Approach to Contextualized Word Embeddings for
  Mid-Resource Languages
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages
Pedro Ortiz Suarez
Laurent Romary
Benoît Sagot
28
227
0
11 Jun 2020
Report from the NSF Future Directions Workshop, Toward User-Oriented
  Agents: Research Directions and Challenges
Report from the NSF Future Directions Workshop, Toward User-Oriented Agents: Research Directions and Challenges
M. Eskénazi
Tiancheng Zhao
LLMAG
AI4TS
AI4CE
36
9
0
10 Jun 2020
Revisiting Few-sample BERT Fine-tuning
Revisiting Few-sample BERT Fine-tuning
Tianyi Zhang
Felix Wu
Arzoo Katiyar
Kilian Q. Weinberger
Yoav Artzi
41
442
0
10 Jun 2020
Combination of abstractive and extractive approaches for summarization
  of long scientific texts
Combination of abstractive and extractive approaches for summarization of long scientific texts
Vladislav Tretyak
Denis Stepanov
21
10
0
09 Jun 2020
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and
  Strong Baselines
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines
Marius Mosbach
Maksym Andriushchenko
Dietrich Klakow
31
354
0
08 Jun 2020
Linformer: Self-Attention with Linear Complexity
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
87
1,655
0
08 Jun 2020
An Overview of Neural Network Compression
An Overview of Neural Network Compression
James OÑeill
AI4CE
45
98
0
05 Jun 2020
DeCLUTR: Deep Contrastive Learning for Unsupervised Textual
  Representations
DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations
John Giorgi
Osvald Nitski
Bo Wang
Gary D. Bader
SSL
39
490
0
05 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
64
2,631
0
05 Jun 2020
Sponge Examples: Energy-Latency Attacks on Neural Networks
Sponge Examples: Energy-Latency Attacks on Neural Networks
Ilia Shumailov
Yiren Zhao
Daniel Bates
Nicolas Papernot
Robert D. Mullins
Ross J. Anderson
SILM
19
127
0
05 Jun 2020
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient
  Language Processing
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
Zihang Dai
Guokun Lai
Yiming Yang
Quoc V. Le
48
230
0
05 Jun 2020
A Survey on Transfer Learning in Natural Language Processing
A Survey on Transfer Learning in Natural Language Processing
Zaid Alyafeai
Maged S. Alshaibani
Irfan Ahmad
30
72
0
31 May 2020
Language Models are Few-Shot Learners
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
124
40,394
0
28 May 2020
Syntactic Structure Distillation Pretraining For Bidirectional Encoders
Syntactic Structure Distillation Pretraining For Bidirectional Encoders
A. Kuncoro
Lingpeng Kong
Daniel Fried
Dani Yogatama
Laura Rimell
Chris Dyer
Phil Blunsom
31
33
0
27 May 2020
CausaLM: Causal Model Explanation Through Counterfactual Language Models
CausaLM: Causal Model Explanation Through Counterfactual Language Models
Amir Feder
Nadav Oved
Uri Shalit
Roi Reichart
CML
LRM
49
157
0
27 May 2020
Rationalizing Text Matching: Learning Sparse Alignments via Optimal
  Transport
Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport
Kyle Swanson
L. Yu
Tao Lei
OT
29
37
0
27 May 2020
GECToR -- Grammatical Error Correction: Tag, Not Rewrite
GECToR -- Grammatical Error Correction: Tag, Not Rewrite
Kostiantyn Omelianchuk
Vitaliy Atrasevych
Artem Chernodub
Oleksandr Skurzhanskyi
36
305
0
26 May 2020
NILE : Natural Language Inference with Faithful Natural Language
  Explanations
NILE : Natural Language Inference with Faithful Natural Language Explanations
Sawan Kumar
Partha P. Talukdar
XAI
LRM
22
160
0
25 May 2020
Sentiment Analysis: Automatically Detecting Valence, Emotions, and Other
  Affectual States from Text
Sentiment Analysis: Automatically Detecting Valence, Emotions, and Other Affectual States from Text
Saif M. Mohammad
27
312
0
25 May 2020
Common Sense or World Knowledge? Investigating Adapter-Based Knowledge
  Injection into Pretrained Transformers
Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
Anne Lauscher
Olga Majewska
Leonardo F. R. Ribeiro
Iryna Gurevych
Nikolai Rozanov
Goran Glavaš
KELM
39
79
0
24 May 2020
L2R2: Leveraging Ranking for Abductive Reasoning
L2R2: Leveraging Ranking for Abductive Reasoning
Yunchang Zhu
Liang Pang
Yanyan Lan
Xueqi Cheng
24
14
0
22 May 2020
Pretraining with Contrastive Sentence Objectives Improves Discourse
  Performance of Language Models
Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models
Dan Iter
Kelvin Guu
L. Lansing
Dan Jurafsky
6
77
0
20 May 2020
BERTweet: A pre-trained language model for English Tweets
BERTweet: A pre-trained language model for English Tweets
Dat Quoc Nguyen
Thanh Tien Vu
A. Nguyen
VLM
34
902
0
20 May 2020
SciSight: Combining faceted navigation and research group detection for
  COVID-19 exploratory scientific search
SciSight: Combining faceted navigation and research group detection for COVID-19 exploratory scientific search
Tom Hope
Jason Portenoy
Kishore Vasan
Jon Borchardt
Eric Horvitz
Daniel S. Weld
Marti A. Hearst
Jevin D. West
FedML
26
58
0
20 May 2020
BiQGEMM: Matrix Multiplication with Lookup Table For Binary-Coding-based
  Quantized DNNs
BiQGEMM: Matrix Multiplication with Lookup Table For Binary-Coding-based Quantized DNNs
Yongkweon Jeon
Baeseong Park
S. Kwon
Byeongwook Kim
Jeongin Yun
Dongsoo Lee
MQ
33
30
0
20 May 2020
A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks
A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks
Angela S. Lin
Sudha Rao
Asli Celikyilmaz
E. Nouri
Chris Brockett
Debadeepta Dey
Bill Dolan
26
24
0
19 May 2020
Normalized Attention Without Probability Cage
Normalized Attention Without Probability Cage
Oliver Richter
Roger Wattenhofer
14
21
0
19 May 2020
Human Instruction-Following with Deep Reinforcement Learning via
  Transfer-Learning from Text
Human Instruction-Following with Deep Reinforcement Learning via Transfer-Learning from Text
Felix Hill
Soňa Mokrá
Nathaniel Wong
Tim Harley
LM&Ro
24
81
0
19 May 2020
Table Search Using a Deep Contextualized Language Model
Table Search Using a Deep Contextualized Language Model
Zhiyu Zoey Chen
M. Trabelsi
J. Heflin
Yinan Xu
Brian D. Davison
LMTD
23
56
0
19 May 2020
Quantifying the Uncertainty of Precision Estimates for Rule based Text
  Classifiers
Quantifying the Uncertainty of Precision Estimates for Rule based Text Classifiers
J. Nutaro
Özgür Özmen
21
0
0
19 May 2020
Are All Languages Created Equal in Multilingual BERT?
Are All Languages Created Equal in Multilingual BERT?
Shijie Wu
Mark Dredze
25
316
0
18 May 2020
Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained
  Conversational Representations
Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations
Sam Coope
Tyler Farghly
D. Gerz
Ivan Vulić
Matthew Henderson
27
62
0
18 May 2020
Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio
  Representation
Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation
Po-Han Chi
Pei-Hung Chung
Tsung-Han Wu
Chun-Cheng Hsieh
Yen-Hao Chen
Shang-Wen Li
Hung-yi Lee
SSL
9
147
0
18 May 2020
Syntax-guided Controlled Generation of Paraphrases
Syntax-guided Controlled Generation of Paraphrases
Ashutosh Kumar
Kabir Ahuja
Raghuram Vadapalli
Partha P. Talukdar
29
93
0
18 May 2020
T-VSE: Transformer-Based Visual Semantic Embedding
T-VSE: Transformer-Based Visual Semantic Embedding
M. Bastan
Arnau Ramisa
Mehmet Tek
ViT
24
7
0
17 May 2020
ApplicaAI at SemEval-2020 Task 11: On RoBERTa-CRF, Span CLS and Whether
  Self-Training Helps Them
ApplicaAI at SemEval-2020 Task 11: On RoBERTa-CRF, Span CLS and Whether Self-Training Helps Them
Dawid Jurkiewicz
Łukasz Borchmann
Izabela Kosmala
Filip Graliñski
22
40
0
16 May 2020
Movement Pruning: Adaptive Sparsity by Fine-Tuning
Movement Pruning: Adaptive Sparsity by Fine-Tuning
Victor Sanh
Thomas Wolf
Alexander M. Rush
32
472
0
15 May 2020
COVID-Twitter-BERT: A Natural Language Processing Model to Analyse
  COVID-19 Content on Twitter
COVID-Twitter-BERT: A Natural Language Processing Model to Analyse COVID-19 Content on Twitter
Martin Müller
M. Salathé
P. Kummervold
VLM
MedIm
AI4MH
35
355
0
15 May 2020
Spelling Error Correction with Soft-Masked BERT
Spelling Error Correction with Soft-Masked BERT
Shaohua Zhang
Haoran Huang
Jicong Liu
Hang Li
22
206
0
15 May 2020
Dense-Caption Matching and Frame-Selection Gating for Temporal
  Localization in VideoQA
Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA
Hyounghun Kim
Zineng Tang
Joey Tianyi Zhou
33
31
0
13 May 2020
INFOTABS: Inference on Tables as Semi-structured Data
INFOTABS: Inference on Tables as Semi-structured Data
Vivek Gupta
Maitrey Mehta
Pegah Nokhiz
Vivek Srikumar
LMTD
25
100
0
13 May 2020
That is a Known Lie: Detecting Previously Fact-Checked Claims
That is a Known Lie: Detecting Previously Fact-Checked Claims
Shaden Shaar
Giovanni Da San Martino
Nikolay Babulkov
Preslav Nakov
HILM
59
154
0
12 May 2020
A Report on the 2020 Sarcasm Detection Shared Task
A Report on the 2020 Sarcasm Detection Shared Task
Debanjan Ghosh
Avijit Vajpayee
Smaranda Muresan
24
60
0
12 May 2020
WinoWhy: A Deep Diagnosis of Essential Commonsense Knowledge for
  Answering Winograd Schema Challenge
WinoWhy: A Deep Diagnosis of Essential Commonsense Knowledge for Answering Winograd Schema Challenge
Hongming Zhang
Xinran Zhao
Yangqiu Song
24
54
0
12 May 2020
On the Robustness of Language Encoders against Grammatical Errors
On the Robustness of Language Encoders against Grammatical Errors
Fan Yin
Quanyu Long
Tao Meng
Kai-Wei Chang
39
34
0
12 May 2020
SOLOIST: Building Task Bots at Scale with Transfer Learning and Machine
  Teaching
SOLOIST: Building Task Bots at Scale with Transfer Learning and Machine Teaching
Baolin Peng
Chunyuan Li
Jinchao Li
Shahin Shayandeh
Lars Liden
Jianfeng Gao
33
125
0
11 May 2020
Commonsense Evidence Generation and Injection in Reading Comprehension
Commonsense Evidence Generation and Injection in Reading Comprehension
Ye Liu
Tao Yang
Zeyu You
Wei Fan
Philip S. Yu
33
14
0
11 May 2020
schuBERT: Optimizing Elements of BERT
schuBERT: Optimizing Elements of BERT
A. Khetan
Zohar Karnin
31
30
0
09 May 2020
Previous
123...899091...939495
Next