RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 9,299 papers shown
Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance
Gagan Bansal
Tongshuang Wu
Joyce Zhou
Raymond Fok
Besmira Nushi
Ece Kamar
Marco Tulio Ribeiro
Daniel S. Weld
52
588
0
26 Jun 2020
The Depth-to-Width Interplay in Self-Attention
Yoav Levine
Noam Wies
Or Sharir
Hofit Bata
Amnon Shashua
40
46
0
22 Jun 2020
Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions
Stephen Roller
Y-Lan Boureau
Jason Weston
Antoine Bordes
Emily Dinan
...
Kurt Shuster
Eric Michael Smith
Arthur Szlam
Jack Urbanek
Mary Williamson
LLMAG
AI4CE
36
51
0
22 Jun 2020
MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of Gradients
Chenfei Zhu
Yu Cheng
Zhe Gan
Furong Huang
Jingjing Liu
Tom Goldstein
ODL
40
2
0
21 Jun 2020
A Survey on Machine Reading Comprehension: Tasks, Evaluation Metrics and Benchmark Datasets
Chengchang Zeng
Shaobo Li
Qin Li
Jie Hu
Jianjun Hu
49
101
0
21 Jun 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
10
5,647
0
20 Jun 2020
Cross-lingual Retrieval for Iterative Self-Supervised Training
C. Tran
Y. Tang
Xian Li
Jiatao Gu
RALM
33
74
0
16 Jun 2020
On the Computational Power of Transformers and its Implications in Sequence Modeling
S. Bhattamishra
Arkil Patel
Navin Goyal
43
66
0
16 Jun 2020
PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models
Eyal Ben-David
Carmel Rabinovitz
Roi Reichart
SSL
71
62
0
16 Jun 2020
Minimum Width for Universal Approximation
Sejun Park
Chulhee Yun
Jaeho Lee
Jinwoo Shin
42
122
0
16 Jun 2020
To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks
Sinong Wang
Madian Khabsa
Hao Ma
18
26
0
15 Jun 2020
Self-supervised Learning: Generative or Contrastive
Xiao Liu
Fanjin Zhang
Zhenyu Hou
Zhaoyu Wang
Li Mian
Jing Zhang
Jie Tang
SSL
64
1,597
0
15 Jun 2020
Comparing Natural Language Processing Techniques for Alzheimer's Dementia Prediction in Spontaneous Speech
Thomas Searle
Zina M. Ibrahim
Richard J. B. Dobson
14
46
0
12 Jun 2020
SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)
Marcos Zampieri
Preslav Nakov
Sara Rosenthal
Pepa Atanasova
Georgi Karadzhov
Hamdy Mubarak
Leon Derczynski
Zeses Pitenis
Çağrı Çöltekin
30
484
0
12 Jun 2020
Video Understanding as Machine Translation
Bruno Korbar
Fabio Petroni
Rohit Girdhar
Lorenzo Torresani
SSL
20
29
0
12 Jun 2020
Rethinking Pre-training and Self-training
Barret Zoph
Golnaz Ghiasi
Nayeon Lee
Huayu Chen
Hanxiao Liu
E. D. Cubuk
Quoc V. Le
SSeg
53
646
0
11 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL
VLM
55
433
0
11 Jun 2020
Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge
Alon Talmor
Oyvind Tafjord
Peter Clark
Yoav Goldberg
Jonathan Berant
ReLM
LRM
41
38
0
11 Jun 2020
CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP
Libo Qin
Minheng Ni
Yue Zhang
Wanxiang Che
45
149
0
11 Jun 2020
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages
Pedro Ortiz Suarez
Laurent Romary
Benoît Sagot
33
228
0
11 Jun 2020
Report from the NSF Future Directions Workshop, Toward User-Oriented Agents: Research Directions and Challenges
M. Eskénazi
Tiancheng Zhao
LLMAG
AI4TS
AI4CE
43
9
0
10 Jun 2020
Revisiting Few-sample BERT Fine-tuning
Tianyi Zhang
Felix Wu
Arzoo Katiyar
Kilian Q. Weinberger
Yoav Artzi
46
443
0
10 Jun 2020
Combination of abstractive and extractive approaches for summarization of long scientific texts
Vladislav Tretyak
Denis Stepanov
21
10
0
09 Jun 2020
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines
Marius Mosbach
Maksym Andriushchenko
Dietrich Klakow
31
354
0
08 Jun 2020
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
123
1,668
0
08 Jun 2020
A Cross-Task Analysis of Text Span Representations
Shubham Toshniwal
Freda Shi
Bowen Shi
Lingyu Gao
Karen Livescu
Kevin Gimpel
53
36
0
06 Jun 2020
An Overview of Neural Network Compression
James O'Neill
AI4CE
59
98
0
05 Jun 2020
DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations
John Giorgi
Osvald Nitski
Bo Wang
Gary D. Bader
SSL
61
494
0
05 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
69
2,660
0
05 Jun 2020
Sponge Examples: Energy-Latency Attacks on Neural Networks
Ilia Shumailov
Yiren Zhao
Daniel Bates
Nicolas Papernot
Robert D. Mullins
Ross J. Anderson
SILM
24
129
0
05 Jun 2020
GMAT: Global Memory Augmentation for Transformers
Ankit Gupta
Jonathan Berant
RALM
21
50
0
05 Jun 2020
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
Zihang Dai
Guokun Lai
Yiming Yang
Quoc V. Le
48
230
0
05 Jun 2020
Conversational Machine Comprehension: a Literature Review
Somil Gupta
Bhanu Pratap Singh Rawat
Hong Yu
29
22
0
01 Jun 2020
A Survey on Transfer Learning in Natural Language Processing
Zaid Alyafeai
Maged S. Alshaibani
Irfan Ahmad
35
72
0
31 May 2020
Neural Entity Linking: A Survey of Models Based on Deep Learning
Ozge Sevgili
Artem Shelmanov
Mikhail V. Arkhipov
Alexander Panchenko
Christian Biemann
VLM
3DV
AI4TS
33
119
0
31 May 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
190
40,776
0
28 May 2020
Syntactic Structure Distillation Pretraining For Bidirectional Encoders
A. Kuncoro
Lingpeng Kong
Daniel Fried
Dani Yogatama
Laura Rimell
Chris Dyer
Phil Blunsom
36
33
0
27 May 2020
CausaLM: Causal Model Explanation Through Counterfactual Language Models
Amir Feder
Nadav Oved
Uri Shalit
Roi Reichart
CML
LRM
56
158
0
27 May 2020
Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport
Kyle Swanson
L. Yu
Tao Lei
OT
29
37
0
27 May 2020
GECToR -- Grammatical Error Correction: Tag, Not Rewrite
Kostiantyn Omelianchuk
Vitaliy Atrasevych
Artem Chernodub
Oleksandr Skurzhanskyi
41
307
0
26 May 2020
NILE : Natural Language Inference with Faithful Natural Language Explanations
Sawan Kumar
Partha P. Talukdar
XAI
LRM
45
162
0
25 May 2020
Sentiment Analysis: Automatically Detecting Valence, Emotions, and Other Affectual States from Text
Saif M. Mohammad
27
313
0
25 May 2020
Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
Anne Lauscher
Olga Majewska
Leonardo F. R. Ribeiro
Iryna Gurevych
Nikolai Rozanov
Goran Glavaš
KELM
39
80
0
24 May 2020
L2R2: Leveraging Ranking for Abductive Reasoning
Yunchang Zhu
Liang Pang
Yanyan Lan
Xueqi Cheng
24
14
0
22 May 2020
Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models
Dan Iter
Kelvin Guu
L. Lansing
Dan Jurafsky
11
78
0
20 May 2020
BERTweet: A pre-trained language model for English Tweets
Dat Quoc Nguyen
Thanh Tien Vu
A. Nguyen
VLM
40
905
0
20 May 2020
SciSight: Combining faceted navigation and research group detection for COVID-19 exploratory scientific search
Tom Hope
Jason Portenoy
Kishore Vasan
Jon Borchardt
Eric Horvitz
Daniel S. Weld
Marti A. Hearst
Jevin D. West
FedML
26
58
0
20 May 2020
BiQGEMM: Matrix Multiplication with Lookup Table For Binary-Coding-based Quantized DNNs
Yongkweon Jeon
Baeseong Park
S. Kwon
Byeongwook Kim
Jeongin Yun
Dongsoo Lee
MQ
38
30
0
20 May 2020
A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks
Angela S. Lin
Sudha Rao
Asli Celikyilmaz
E. Nouri
Chris Brockett
Debadeepta Dey
Bill Dolan
44
25
0
19 May 2020
Normalized Attention Without Probability Cage
Oliver Richter
Roger Wattenhofer
35
21
0
19 May 2020