Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1907.11692
Cited By
RoBERTa: A Robustly Optimized BERT Pretraining Approach
26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
Re-assign community
ArXiv
PDF
HTML
Papers citing
"RoBERTa: A Robustly Optimized BERT Pretraining Approach"
50 / 4,740 papers shown
Title
Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge
Alon Talmor
Oyvind Tafjord
Peter Clark
Yoav Goldberg
Jonathan Berant
ReLM
LRM
36
39
0
11 Jun 2020
CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP
Libo Qin
Minheng Ni
Yue Zhang
Wanxiang Che
40
149
0
11 Jun 2020
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages
Pedro Ortiz Suarez
Laurent Romary
Benoît Sagot
28
227
0
11 Jun 2020
Report from the NSF Future Directions Workshop, Toward User-Oriented Agents: Research Directions and Challenges
M. Eskénazi
Tiancheng Zhao
LLMAG
AI4TS
AI4CE
36
9
0
10 Jun 2020
Revisiting Few-sample BERT Fine-tuning
Tianyi Zhang
Felix Wu
Arzoo Katiyar
Kilian Q. Weinberger
Yoav Artzi
41
442
0
10 Jun 2020
Combination of abstractive and extractive approaches for summarization of long scientific texts
Vladislav Tretyak
Denis Stepanov
21
10
0
09 Jun 2020
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines
Marius Mosbach
Maksym Andriushchenko
Dietrich Klakow
31
354
0
08 Jun 2020
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
87
1,655
0
08 Jun 2020
An Overview of Neural Network Compression
James OÑeill
AI4CE
45
98
0
05 Jun 2020
DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations
John Giorgi
Osvald Nitski
Bo Wang
Gary D. Bader
SSL
39
490
0
05 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
64
2,631
0
05 Jun 2020
Sponge Examples: Energy-Latency Attacks on Neural Networks
Ilia Shumailov
Yiren Zhao
Daniel Bates
Nicolas Papernot
Robert D. Mullins
Ross J. Anderson
SILM
19
127
0
05 Jun 2020
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
Zihang Dai
Guokun Lai
Yiming Yang
Quoc V. Le
48
230
0
05 Jun 2020
A Survey on Transfer Learning in Natural Language Processing
Zaid Alyafeai
Maged S. Alshaibani
Irfan Ahmad
30
72
0
31 May 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
124
40,394
0
28 May 2020
Syntactic Structure Distillation Pretraining For Bidirectional Encoders
A. Kuncoro
Lingpeng Kong
Daniel Fried
Dani Yogatama
Laura Rimell
Chris Dyer
Phil Blunsom
31
33
0
27 May 2020
CausaLM: Causal Model Explanation Through Counterfactual Language Models
Amir Feder
Nadav Oved
Uri Shalit
Roi Reichart
CML
LRM
49
157
0
27 May 2020
Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport
Kyle Swanson
L. Yu
Tao Lei
OT
29
37
0
27 May 2020
GECToR -- Grammatical Error Correction: Tag, Not Rewrite
Kostiantyn Omelianchuk
Vitaliy Atrasevych
Artem Chernodub
Oleksandr Skurzhanskyi
36
305
0
26 May 2020
NILE : Natural Language Inference with Faithful Natural Language Explanations
Sawan Kumar
Partha P. Talukdar
XAI
LRM
22
160
0
25 May 2020
Sentiment Analysis: Automatically Detecting Valence, Emotions, and Other Affectual States from Text
Saif M. Mohammad
27
312
0
25 May 2020
Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
Anne Lauscher
Olga Majewska
Leonardo F. R. Ribeiro
Iryna Gurevych
Nikolai Rozanov
Goran Glavaš
KELM
39
79
0
24 May 2020
L2R2: Leveraging Ranking for Abductive Reasoning
Yunchang Zhu
Liang Pang
Yanyan Lan
Xueqi Cheng
24
14
0
22 May 2020
Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models
Dan Iter
Kelvin Guu
L. Lansing
Dan Jurafsky
6
77
0
20 May 2020
BERTweet: A pre-trained language model for English Tweets
Dat Quoc Nguyen
Thanh Tien Vu
A. Nguyen
VLM
34
902
0
20 May 2020
SciSight: Combining faceted navigation and research group detection for COVID-19 exploratory scientific search
Tom Hope
Jason Portenoy
Kishore Vasan
Jon Borchardt
Eric Horvitz
Daniel S. Weld
Marti A. Hearst
Jevin D. West
FedML
26
58
0
20 May 2020
BiQGEMM: Matrix Multiplication with Lookup Table For Binary-Coding-based Quantized DNNs
Yongkweon Jeon
Baeseong Park
S. Kwon
Byeongwook Kim
Jeongin Yun
Dongsoo Lee
MQ
33
30
0
20 May 2020
A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks
Angela S. Lin
Sudha Rao
Asli Celikyilmaz
E. Nouri
Chris Brockett
Debadeepta Dey
Bill Dolan
26
24
0
19 May 2020
Normalized Attention Without Probability Cage
Oliver Richter
Roger Wattenhofer
14
21
0
19 May 2020
Human Instruction-Following with Deep Reinforcement Learning via Transfer-Learning from Text
Felix Hill
Soňa Mokrá
Nathaniel Wong
Tim Harley
LM&Ro
24
81
0
19 May 2020
Table Search Using a Deep Contextualized Language Model
Zhiyu Zoey Chen
M. Trabelsi
J. Heflin
Yinan Xu
Brian D. Davison
LMTD
23
56
0
19 May 2020
Quantifying the Uncertainty of Precision Estimates for Rule based Text Classifiers
J. Nutaro
Özgür Özmen
21
0
0
19 May 2020
Are All Languages Created Equal in Multilingual BERT?
Shijie Wu
Mark Dredze
25
316
0
18 May 2020
Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations
Sam Coope
Tyler Farghly
D. Gerz
Ivan Vulić
Matthew Henderson
27
62
0
18 May 2020
Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation
Po-Han Chi
Pei-Hung Chung
Tsung-Han Wu
Chun-Cheng Hsieh
Yen-Hao Chen
Shang-Wen Li
Hung-yi Lee
SSL
9
147
0
18 May 2020
Syntax-guided Controlled Generation of Paraphrases
Ashutosh Kumar
Kabir Ahuja
Raghuram Vadapalli
Partha P. Talukdar
29
93
0
18 May 2020
T-VSE: Transformer-Based Visual Semantic Embedding
M. Bastan
Arnau Ramisa
Mehmet Tek
ViT
24
7
0
17 May 2020
ApplicaAI at SemEval-2020 Task 11: On RoBERTa-CRF, Span CLS and Whether Self-Training Helps Them
Dawid Jurkiewicz
Łukasz Borchmann
Izabela Kosmala
Filip Graliñski
22
40
0
16 May 2020
Movement Pruning: Adaptive Sparsity by Fine-Tuning
Victor Sanh
Thomas Wolf
Alexander M. Rush
32
472
0
15 May 2020
COVID-Twitter-BERT: A Natural Language Processing Model to Analyse COVID-19 Content on Twitter
Martin Müller
M. Salathé
P. Kummervold
VLM
MedIm
AI4MH
35
355
0
15 May 2020
Spelling Error Correction with Soft-Masked BERT
Shaohua Zhang
Haoran Huang
Jicong Liu
Hang Li
22
206
0
15 May 2020
Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA
Hyounghun Kim
Zineng Tang
Joey Tianyi Zhou
33
31
0
13 May 2020
INFOTABS: Inference on Tables as Semi-structured Data
Vivek Gupta
Maitrey Mehta
Pegah Nokhiz
Vivek Srikumar
LMTD
25
100
0
13 May 2020
That is a Known Lie: Detecting Previously Fact-Checked Claims
Shaden Shaar
Giovanni Da San Martino
Nikolay Babulkov
Preslav Nakov
HILM
59
154
0
12 May 2020
A Report on the 2020 Sarcasm Detection Shared Task
Debanjan Ghosh
Avijit Vajpayee
Smaranda Muresan
24
60
0
12 May 2020
WinoWhy: A Deep Diagnosis of Essential Commonsense Knowledge for Answering Winograd Schema Challenge
Hongming Zhang
Xinran Zhao
Yangqiu Song
24
54
0
12 May 2020
On the Robustness of Language Encoders against Grammatical Errors
Fan Yin
Quanyu Long
Tao Meng
Kai-Wei Chang
39
34
0
12 May 2020
SOLOIST: Building Task Bots at Scale with Transfer Learning and Machine Teaching
Baolin Peng
Chunyuan Li
Jinchao Li
Shahin Shayandeh
Lars Liden
Jianfeng Gao
33
125
0
11 May 2020
Commonsense Evidence Generation and Injection in Reading Comprehension
Ye Liu
Tao Yang
Zeyu You
Wei Fan
Philip S. Yu
33
14
0
11 May 2020
schuBERT: Optimizing Elements of BERT
A. Khetan
Zohar Karnin
31
30
0
09 May 2020
Previous
1
2
3
...
89
90
91
...
93
94
95
Next