1810.04805
Cited By
v1
v2 (latest)
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 23,641 papers shown
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning
Jun Chen
Han Guo
Kai Yi
Boyang Albert Li
Mohamed Elhoseiny
VLM
166
227
0
20 Feb 2021
On Calibration and Out-of-domain Generalization
Yoav Wald
Amir Feder
D. Greenfeld
Uri Shalit
OODD
151
158
0
20 Feb 2021
Evolving Attention with Residual Convolutions
Yujing Wang
Yaming Yang
Jiangang Bai
Mingliang Zhang
Jing Bai
Jiahao Yu
Ce Zhang
Gao Huang
Yunhai Tong
ViT
112
34
0
20 Feb 2021
Predicting times of waiting on red signals using BERT
Witold Szejgis
Anna Warno
P. Góra
32
1
0
20 Feb 2021
Multilingual Answer Sentence Reranking via Automatically Translated Data
Thuy Vu
Alessandro Moschitti
66
5
0
20 Feb 2021
Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction
Benfeng Xu
Quan Wang
Yajuan Lyu
Yong Zhu
Zhendong Mao
109
170
0
20 Feb 2021
Hard-Attention for Scalable Image Classification
Athanasios Papadopoulos
Pawel Korus
N. Memon
125
25
0
20 Feb 2021
Latent Variable Sequential Set Transformers For Joint Multi-Agent Motion Prediction
Roger Girgis
Florian Golemo
Felipe Codevilla
Martin Weiss
Jim Aldon D’Souza
Samira Ebrahimi Kahou
Felix Heide
C. Pal
94
133
0
19 Feb 2021
Hate-Alert@DravidianLangTech-EACL2021: Ensembling strategies for Transformer-based Offensive language Detection
Debjoy Saha
Naman Paharia
Debajit Chakraborty
Punyajoy Saha
Animesh Mukherjee
39
38
0
19 Feb 2021
Analyzing Curriculum Learning for Sentiment Analysis along Task Difficulty, Pacing and Visualization Axes
Anvesh Rao Vijjini
Kaveri Anuranjana
R. Mamidi
70
3
0
19 Feb 2021
Towards Emotion Recognition in Hindi-English Code-Mixed Data: A Transformer Based Approach
Anshul Wadhawan
Akshita Aggarwal
66
32
0
19 Feb 2021
Back to Prior Knowledge: Joint Event Causality Extraction via Convolutional Semantic Infusion
Zijian Wang
Hao Wang
Xiangfeng Luo
Jianqi Gao
25
4
0
19 Feb 2021
End-to-End Neural Systems for Automatic Children Speech Recognition: An Empirical Study
Prashanth Gurunath Shivakumar
Shrikanth Narayanan
58
54
0
19 Feb 2021
Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input
Brooke Stephenson
Thomas Hueber
Laurent Girin
Laurent Besacier
91
10
0
19 Feb 2021
An Empirical Study on Measuring the Similarity of Sentential Arguments with Language Model Domain Adaptation
Yujin Baek
Sang-gyu Seo
36
0
0
19 Feb 2021
Scaling Creative Inspiration with Fine-Grained Functional Aspects of Ideas
Tom Hope
Ronen Tamari
Hyeonsu B Kang
Daniel Hershcovich
Joel Chan
A. Kittur
Dafna Shahaf
42
19
0
19 Feb 2021
Symplectic Adjoint Method for Exact Gradient of Neural ODE with Minimal Memory
Takashi Matsubara
Yuto Miyatake
Takaharu Yaguchi
71
23
0
19 Feb 2021
WebRED: Effective Pretraining And Finetuning For Relation Extraction On The Web
Róbert Ormándi
Mohammad Saleh
Erin Winter
Vinay Rao
51
11
0
18 Feb 2021
MUDES: Multilingual Detection of Offensive Spans
Tharindu Ranasinghe
Marcos Zampieri
83
41
0
18 Feb 2021
A Systematic Review of Natural Language Processing Applied to Radiology Reports
Arlene Casey
Emma Davidson
Michael Poon
Hang Dong
Daniel Duma
...
Víctor Suárez-Paniagua
Richard Tobin
William Whiteley
Honghan Wu
Beatrice Alex
AI4CE
41
150
0
18 Feb 2021
Deep Learning for Suicide and Depression Identification with Unsupervised Label Correction
Ayaan Haque
V. Reddi
Tyler Giallanza
NoLa
62
60
0
18 Feb 2021
Meta-Transfer Learning for Low-Resource Abstractive Summarization
Yi-Syuan Chen
Hong-Han Shuai
CLL
OffRL
103
39
0
18 Feb 2021
Training Large-Scale News Recommenders with Pretrained Language Models in the Loop
Shitao Xiao
Zheng Liu
Yingxia Shao
Tao Di
Xing Xie
VLM
AIFin
186
42
0
18 Feb 2021
Composable Generative Models
Johan Leduc
Nicolas Grislain
SyDa
84
4
0
18 Feb 2021
Less is More: Pre-train a Strong Text Encoder for Dense Retrieval Using a Weak Decoder
Shuqi Lu
Di He
Chenyan Xiong
Guolin Ke
Waleed Malik
Zhicheng Dou
Paul N. Bennett
Tie-Yan Liu
Arnold Overwijk
RALM
124
11
0
18 Feb 2021
Entity-level Factual Consistency of Abstractive Text Summarization
Feng Nan
Ramesh Nallapati
Zhiguo Wang
Cicero Nogueira dos Santos
Henghui Zhu
Dejiao Zhang
Kathleen McKeown
Bing Xiang
HILM
205
161
0
18 Feb 2021
Quiz-Style Question Generation for News Stories
Á. Lelkes
Vinh Q. Tran
Cong Yu
84
42
0
18 Feb 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
573
1,143
0
17 Feb 2021
Towards generalisable hate speech detection: a review on obstacles and solutions
Wenjie Yin
A. Zubiaga
189
169
0
17 Feb 2021
SciDr at SDU-2020: IDEAS -- Identifying and Disambiguating Everyday Acronyms for Scientific Domain
Aadarsh Singh
Priyanshu Kumar
54
9
0
17 Feb 2021
Decoding EEG Brain Activity for Multi-Modal Natural Language Processing
Nora Hollenstein
Cédric Renggli
B. Glaus
Maria Barrett
M. Troendle
N. Langer
Ce Zhang
137
35
0
17 Feb 2021
Open-Retrieval Conversational Machine Reading
Yifan Gao
Jingjing Li
Chien-Sheng Wu
Michael R. Lyu
Irwin King
124
17
0
17 Feb 2021
First Target and Opinion then Polarity: Enhancing Target-opinion Correlation for Aspect Sentiment Triplet Extraction
Lianzhe Huang
Peiyi Wang
Sujian Li
Tianyu Liu
Xiaodong Zhang
Zhicong Cheng
D. Yin
Houfeng Wang
238
28
0
17 Feb 2021
TCN: Table Convolutional Network for Web Table Interpretation
Daheng Wang
Prashant Shiralkar
Colin Lockard
Binxuan Huang
Xin Luna Dong
Meng Jiang
LMTD
69
55
0
17 Feb 2021
Transferability of Neural Network Clinical De-identification Systems
Kahyun Lee
Nicholas J. Dobbins
Bridget T. McInnes
Meliha Yetisgen
Özlem Uzuner
OOD
61
5
0
17 Feb 2021
A Context-Enhanced De-identification System
Kahyun Lee
M. Kayaalp
Sam Henry
Özlem Uzuner
68
3
0
17 Feb 2021
Highly Fast Text Segmentation With Pairwise Markov Chains
E. Azeraf
E. Monfrini
Emmanuel Vignon
W. Pieczynski
56
5
0
17 Feb 2021
COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Yu Meng
Chenyan Xiong
Payal Bajaj
Saurabh Tiwary
Paul N. Bennett
Jiawei Han
Xia Song
184
206
0
16 Feb 2021
IntSGD: Adaptive Floatless Compression of Stochastic Gradients
Konstantin Mishchenko
Bokun Wang
D. Kovalev
Peter Richtárik
107
15
0
16 Feb 2021
Conversations Gone Alright: Quantifying and Predicting Prosocial Outcomes in Online Conversations
Jiajun Bao
J. Wu
Yiming Zhang
Eshwar Chandrasekharan
David Jurgens
117
49
0
16 Feb 2021
Boosting Low-Resource Biomedical QA via Entity-Aware Masking Strategies
Gabriele Pergola
E. Kochkina
Lin Gui
Maria Liakata
Yulan He
145
32
0
16 Feb 2021
NoiseQA: Challenge Set Evaluation for User-Centric Question Answering
Abhilasha Ravichander
Siddharth Dalmia
Maria Ryskina
Florian Metze
Eduard H. Hovy
A. Black
ELM
59
32
0
16 Feb 2021
Dataset Condensation with Differentiable Siamese Augmentation
Bo Zhao
Hakan Bilen
DD
295
305
0
16 Feb 2021
Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet
M. O. Topal
Anil Bas
Imke van Heerden
LLMAG
AI4CE
73
91
0
16 Feb 2021
Improving speech recognition models with small samples for air traffic control systems
Yi Lin
Qin Li
Bo Yang
Zhen Yan
Huachun Tan
Zhengmao Chen
104
32
0
16 Feb 2021
TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models
Zhuohan Li
Siyuan Zhuang
Shiyuan Guo
Danyang Zhuo
Hao Zhang
Basel Alomair
Ion Stoica
MoE
104
125
0
16 Feb 2021
FEWS: Large-Scale, Low-Shot Word Sense Disambiguation with the Dictionary
Terra Blevins
Mandar Joshi
Luke Zettlemoyer
90
21
0
16 Feb 2021
Training Larger Networks for Deep Reinforcement Learning
Keita Ota
Devesh K. Jha
Asako Kanezaki
OffRL
97
40
0
16 Feb 2021
Few-Shot Graph Learning for Molecular Property Prediction
Zhichun Guo
Chuxu Zhang
Wenhao Yu
John E. Herr
Olaf Wiest
Meng Jiang
Nitesh Chawla
AI4CE
177
176
0
16 Feb 2021
Within-Document Event Coreference with BERT-Based Contextualized Representations
Shafiuddin Rehan Ahmed
James H. Martin
20
0
0
15 Feb 2021