1810.04805
Cited By
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 19,767 papers shown
Title
Learning Multi-Sense Word Distributions using Approximate Kullback-Leibler Divergence
P. Jayashree
Ballijepalli Shreya
P. K. Srijith
26
2
0
12 Nov 2019
A Syntax-aware Multi-task Learning Framework for Chinese Semantic Role Labeling
Qingrong Xia
Zhenghua Li
Min Zhang
33
17
0
12 Nov 2019
Understanding BERT performance in propaganda analysis
Yiqing Hua
30
16
0
11 Nov 2019
Attending to Entities for Better Text Understanding
Pengxiang Cheng
K. Erk
LRM
24
37
0
11 Nov 2019
Deep Contextualized Self-training for Low Resource Dependency Parsing
Guy Rotman
Roi Reichart
43
50
0
11 Nov 2019
A hybrid text normalization system using multi-head self-attention for mandarin
Junhui Zhang
Junjie Pan
Xiang Yin
Chen Li
Shichao Liu
Yang Zhang
Yuxuan Wang
Zejun Ma
AI4CE
21
15
0
11 Nov 2019
TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection
Siddhant Garg
Thuy Vu
Alessandro Moschitti
41
214
0
11 Nov 2019
Improving BERT Fine-tuning with Embedding Normalization
Wenxuan Zhou
Junyi Du
Xiang Ren
21
6
0
10 Nov 2019
Can Monolingual Pretrained Models Help Cross-Lingual Classification?
Zewen Chi
Li Dong
Furu Wei
Xian-Ling Mao
Heyan Huang
LRM
VLM
30
13
0
10 Nov 2019
Effectiveness of self-supervised pre-training for speech recognition
Alexei Baevski
Michael Auli
Abdel-rahman Mohamed
SSL
32
147
0
10 Nov 2019
Efficient Dialogue State Tracking by Selectively Overwriting Memory
Sungdong Kim
Sohee Yang
Gyuwan Kim
Sang-Woo Lee
39
195
0
10 Nov 2019
Two-Headed Monster And Crossed Co-Attention Networks
Yaoyiran Li
Jing Jiang
32
0
0
10 Nov 2019
A Bilingual Generative Transformer for Semantic Sentence Embedding
John Wieting
Graham Neubig
Taylor Berg-Kirkpatrick
43
28
0
10 Nov 2019
CamemBERT: a Tasty French Language Model
Louis Martin
Benjamin Muller
Pedro Ortiz Suarez
Yoann Dupont
Laurent Romary
Eric Villemonte de la Clergerie
Djamé Seddah
Benoît Sagot
56
960
0
10 Nov 2019
Pre-train and Plug-in: Flexible Conditional Text Generation with Variational Auto-Encoders
Yu Duan
Canwen Xu
Jiaxin Pei
Jialong Han
Chenliang Li
35
42
0
10 Nov 2019
Dynamic Neuro-Symbolic Knowledge Graph Construction for Zero-shot Commonsense Question Answering
Antoine Bosselut
Ronan Le Bras
Yejin Choi
NAI
35
41
0
10 Nov 2019
Rethinking Self-Attention: Towards Interpretability in Neural Parsing
Khalil Mrini
Franck Dernoncourt
Quan Tran
Trung Bui
W. Chang
Ndapandula Nakashole
MILM
LRM
23
29
0
10 Nov 2019
Knowledge Guided Named Entity Recognition for BioMedical Text
Pratyay Banerjee
Kuntal Kumar Pal
M. Devarakonda
Chitta Baral
29
0
0
10 Nov 2019
Improving Transformer Models by Reordering their Sublayers
Ofir Press
Noah A. Smith
Omer Levy
27
87
0
10 Nov 2019
Learning to Few-Shot Learn Across Diverse Natural Language Classification Tasks
Trapit Bansal
Rishikesh Jha
Andrew McCallum
SSL
21
118
0
10 Nov 2019
r/Fakeddit: A New Multimodal Benchmark Dataset for Fine-grained Fake News Detection
Kai Nakamura
Sharon Levy
Wenjie Wang
24
122
0
10 Nov 2019
Syntax-Infused Transformer and BERT models for Machine Translation and Natural Language Understanding
Dhanasekar Sundararaman
Vivek Subramanian
Guoyin Wang
Shijing Si
Dinghan Shen
Dong Wang
Lawrence Carin
19
40
0
10 Nov 2019
Not All Claims are Created Equal: Choosing the Right Statistical Approach to Assess Hypotheses
Erfan Sadeqi Azer
Daniel Khashabi
Ashish Sabharwal
Dan Roth
35
17
0
10 Nov 2019
CCAligned: A Massive Collection of Cross-Lingual Web-Document Pairs
Ahmed El-Kishky
Vishrav Chaudhary
Francisco Guzman
Philipp Koehn
34
199
0
10 Nov 2019
Generalizing Natural Language Analysis through Span-relation Representations
Zhengbao Jiang
Wenyuan Xu
Jun Araki
Graham Neubig
38
60
0
10 Nov 2019
Scalable Zero-shot Entity Linking with Dense Entity Retrieval
Ledell Yu Wu
Fabio Petroni
Martin Josifoski
Sebastian Riedel
Luke Zettlemoyer
57
182
0
10 Nov 2019
Meta Label Correction for Noisy Label Learning
Guoqing Zheng
Ahmed Hassan Awadallah
S. Dumais
NoLa
OffRL
27
179
0
10 Nov 2019
The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents
Kurt Shuster
Da Ju
Stephen Roller
Emily Dinan
Y-Lan Boureau
Jason Weston
37
81
0
09 Nov 2019
Multi-Sentence Argument Linking
Seth Ebner
Patrick Xia
Ryan Culkin
Kyle Rawlins
Benjamin Van Durme
HAI
34
159
0
09 Nov 2019
Sentence Meta-Embeddings for Unsupervised Semantic Textual Similarity
Nina Poerner
Ulli Waltinger
Hinrich Schütze
AI4TS
32
20
0
09 Nov 2019
ConveRT: Efficient and Accurate Conversational Representations from Transformers
Matthew Henderson
I. Casanueva
Nikola Mrkšić
Pei-hao Su
Tsung-Hsien
Ivan Vulić
47
196
0
09 Nov 2019
E-BERT: Efficient-Yet-Effective Entity Embeddings for BERT
Nina Poerner
Ulli Waltinger
Hinrich Schütze
47
157
0
09 Nov 2019
Improving Machine Reading Comprehension via Adversarial Training
Ziqing Yang
Yiming Cui
Wanxiang Che
Ting Liu
Shijin Wang
Guoping Hu
32
17
0
09 Nov 2019
How Decoding Strategies Affect the Verifiability of Generated Text
Luca Massarelli
Fabio Petroni
Aleksandra Piktus
Myle Ott
Tim Rocktäschel
Vassilis Plachouras
Fabrizio Silvestri
Sebastian Riedel
33
50
0
09 Nov 2019
On the Relationship between Self-Attention and Convolutional Layers
Jean-Baptiste Cordonnier
Andreas Loukas
Martin Jaggi
45
529
0
08 Nov 2019
DZip: improved general-purpose lossless compression based on novel neural network modeling
Mohit Goyal
Kedar Tatwawadi
Shubham Chandak
Idoia Ochoa
AI4CE
21
23
0
08 Nov 2019
Negated and Misprimed Probes for Pretrained Language Models: Birds Can Talk, But Cannot Fly
Nora Kassner
Hinrich Schütze
28
318
0
08 Nov 2019
How Language-Neutral is Multilingual BERT?
Jindřich Libovický
Rudolf Rosa
Alexander Fraser
24
115
0
08 Nov 2019
Not Enough Data? Deep Learning to the Rescue!
Ateret Anaby-Tavor
Boaz Carmeli
Esther Goldbraich
Amir Kantor
George Kour
Segev Shlomov
N. Tepper
Naama Zwerdling
32
368
0
08 Nov 2019
Pretrained Language Models for Document-Level Neural Machine Translation
Liangyou Li
Xin Jiang
Qun Liu
36
19
0
08 Nov 2019
Relation Adversarial Network for Low Resource Knowledge Graph Completion
Ningyu Zhang
Shumin Deng
Zhanlin Sun
Jiaoyan Chen
Wei Zhang
Huajun Chen
39
71
0
08 Nov 2019
What Would Elsa Do? Freezing Layers During Transformer Fine-Tuning
Jaejun Lee
Raphael Tang
Jimmy J. Lin
34
121
0
08 Nov 2019
Reducing Sentiment Bias in Language Models via Counterfactual Evaluation
Po-Sen Huang
Huan Zhang
Ray Jiang
Robert Stanforth
Johannes Welbl
Jack W. Rae
Vishal Maini
Dani Yogatama
Pushmeet Kohli
36
209
0
08 Nov 2019
Contrastive Multi-document Question Generation
W. Cho
Yizhe Zhang
Sudha Rao
Asli Celikyilmaz
Chenyan Xiong
Jianfeng Gao
Mengdi Wang
Bill Dolan
SyDa
32
28
0
08 Nov 2019
Certified Data Removal from Machine Learning Models
Chuan Guo
Tom Goldstein
Awni Y. Hannun
Laurens van der Maaten
MU
57
424
0
08 Nov 2019
Why Do Masked Neural Language Models Still Need Common Sense Knowledge?
Sunjae Kwon
Cheongwoong Kang
Jiyeon Han
Jaesik Choi
24
16
0
08 Nov 2019
The TechQA Dataset
Vittorio Castelli
Rishav Chakravarti
Saswati Dana
Anthony Ferritto
Radu Florian
...
Andrzej Sakrajda
Avirup Sil
Rosario A. Uceda-Sosa
T. Ward
Rong Zhang
34
45
0
08 Nov 2019
Blockwise Self-Attention for Long Document Understanding
J. Qiu
Hao Ma
Omer Levy
Scott Yih
Sinong Wang
Jie Tang
24
252
0
07 Nov 2019
Probing Contextualized Sentence Representations with Visual Awareness
Zhuosheng Zhang
Rui Wang
Kehai Chen
Masao Utiyama
Eiichiro Sumita
Hai Zhao
29
2
0
07 Nov 2019
Contextualized Sparse Representations for Real-Time Open-Domain Question Answering
Jinhyuk Lee
Minjoon Seo
Hannaneh Hajishirzi
Jaewoo Kang
RALM
LRM
29
31
0
07 Nov 2019