Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1810.04805
Cited By
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Re-assign community
ArXiv
PDF
HTML
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 18,001 papers shown
Title
Analyzing the Structure of Attention in a Transformer Language Model
Jesse Vig
Yonatan Belinkov
30
357
0
07 Jun 2019
From Caesar Cipher to Unsupervised Learning: A New Method for Classifier Parameter Estimation
Yu Liu
Li Deng
Jianshu Chen
C. Chen
SSL
26
0
0
06 Jun 2019
Conversing by Reading: Contentful Neural Conversation with On-demand Machine Reading
Lianhui Qin
Michel Galley
Chris Brockett
Xiaodong Liu
Xiang Gao
W. Dolan
Yejin Choi
Jianfeng Gao
26
109
0
06 Jun 2019
Visualizing and Measuring the Geometry of BERT
Andy Coenen
Emily Reif
Ann Yuan
Been Kim
Adam Pearce
F. Viégas
Martin Wattenberg
MILM
43
415
0
06 Jun 2019
Cross-Lingual Syntactic Transfer through Unsupervised Adaptation of Invertible Projections
Junxian He
Zhisong Zhang
Taylor Berg-Kirkpatrick
Graham Neubig
33
21
0
06 Jun 2019
Unsupervised Pivot Translation for Distant Languages
Yichong Leng
Xu Tan
Tao Qin
Xiang-Yang Li
Tie-Yan Liu
33
30
0
06 Jun 2019
Extracting Symptoms and their Status from Clinical Conversations
Nan Du
Kai Chen
Anjuli Kannan
Linh Tran
Yuhui Chen
Izhak Shafran
20
68
0
05 Jun 2019
Large-Scale Multi-Label Text Classification on EU Legislation
Ilias Chalkidis
Manos Fergadiotis
Prodromos Malakasiotis
Ion Androutsopoulos
AILaw
19
213
0
05 Jun 2019
From Balustrades to Pierre Vinken: Looking for Syntax in Transformer Self-Attentions
David Marecek
Rudolf Rosa
28
52
0
05 Jun 2019
The Secrets of Machine Learning: Ten Things You Wish You Had Known Earlier to be More Effective at Data Analysis
Cynthia Rudin
David Carlson
HAI
30
34
0
04 Jun 2019
KERMIT: Generative Insertion-Based Modeling for Sequences
William Chan
Nikita Kitaev
Kelvin Guu
Mitchell Stern
Jakob Uszkoreit
VLM
23
65
0
04 Jun 2019
Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation
Benjamin Heinzerling
Michael Strube
13
35
0
04 Jun 2019
How multilingual is Multilingual BERT?
Telmo Pires
Eva Schlinger
Dan Garrette
LRM
VLM
95
1,373
0
04 Jun 2019
Converse Attention Knowledge Transfer for Low-Resource Named Entity Recognition
Shengfei Lyu
Linghao Sun
Huixiong Yi
Yong-jin Liu
Huanhuan Chen
Steven C. H. Hoi
21
0
0
04 Jun 2019
Detecting Local Insights from Global Labels: Supervised & Zero-Shot Sequence Labeling via a Convolutional Decomposition
A. Schmaltz
27
8
0
04 Jun 2019
Episodic Memory in Lifelong Language Learning
Cyprien de Masson dÁutume
Sebastian Ruder
Lingpeng Kong
Dani Yogatama
CLL
KELM
34
281
0
03 Jun 2019
Learning Representations by Maximizing Mutual Information Across Views
Philip Bachman
R. Devon Hjelm
William Buchwalter
SSL
96
1,457
0
03 Jun 2019
Masked Non-Autoregressive Image Captioning
Junlong Gao
Xi Meng
Shiqi Wang
Xia Li
Shanshe Wang
Siwei Ma
Wen Gao
19
36
0
03 Jun 2019
BAYHENN: Combining Bayesian Deep Learning and Homomorphic Encryption for Secure DNN Inference
Peichen Xie
Bingzhe Wu
Guangyu Sun
BDL
FedML
13
33
0
03 Jun 2019
Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model
Aishwarya Bhandare
Vamsi Sripathi
Deepthi Karkada
Vivek V. Menon
Sun Choi
Kushal Datta
V. Saletore
MQ
27
130
0
03 Jun 2019
A Survey of Natural Language Generation Techniques with a Focus on Dialogue Systems - Past, Present and Future Directions
Sashank Santhanam
Samira Shaikh
3DV
31
52
0
02 Jun 2019
Pretraining Methods for Dialog Context Representation Learning
Shikib Mehri
E. Razumovskaia
Tiancheng Zhao
M. Eskénazi
22
84
0
02 Jun 2019
Adversarial Generation and Encoding of Nested Texts
A. Rozental
GAN
19
0
0
01 Jun 2019
Scoring Sentence Singletons and Pairs for Abstractive Summarization
Logan Lebanoff
Kaiqiang Song
Franck Dernoncourt
Doo Soon Kim
Seokhwan Kim
W. Chang
Fei Liu
CVBM
30
103
0
31 May 2019
Do Human Rationales Improve Machine Explanations?
Julia Strout
Ye Zhang
Raymond J. Mooney
19
57
0
31 May 2019
Investigating an Effective Character-level Embedding in Korean Sentence Classification
Won Ik Cho
Seokhwan Kim
N. Kim
28
8
0
31 May 2019
MultiQA: An Empirical Investigation of Generalization and Transfer in Reading Comprehension
Alon Talmor
Jonathan Berant
20
172
0
31 May 2019
Fine-Grained Spoiler Detection from Large-Scale Review Corpora
Mengting Wan
Rishabh Misra
Ndapandula Nakashole
Julian McAuley
9
130
0
31 May 2019
Rewarding Smatch: Transition-Based AMR Parsing with Reinforcement Learning
Tahira Naseem
Abhishek Shah
Hui Wan
Radu Florian
Salim Roukos
Miguel Ballesteros
25
59
0
31 May 2019
A Lightweight Recurrent Network for Sequence Modeling
Biao Zhang
Rico Sennrich
27
7
0
30 May 2019
Unbabel's Submission to the WMT2019 APE Shared Task: BERT-based Encoder-Decoder for Automatic Post-Editing
António Vilarinho Lopes
M. Amin Farajian
Gonçalo M. Correia
Jonay Trénous
André F. T. Martins
33
35
0
30 May 2019
Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention
Wenhu Chen
Jianshu Chen
Pengda Qin
Xifeng Yan
William Yang Wang
28
129
0
30 May 2019
A Simple but Effective Method to Incorporate Multi-turn Context with BERT for Conversational Machine Comprehension
Yasuhito Ohsugi
Itsumi Saito
Kyosuke Nishida
Hisako Asano
J. Tomita
33
43
0
30 May 2019
A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models
Elman Mansimov
Alex Jinpeng Wang
Sean Welleck
Kyunghyun Cho
AIMat
28
46
0
29 May 2019
Unsupervised Paraphrasing without Translation
Aurko Roy
David Grangier
BDL
LRM
11
61
0
29 May 2019
Adapting Text Embeddings for Causal Inference
Victor Veitch
Dhanya Sridhar
David M. Blei
CML
17
21
0
29 May 2019
Defending Against Neural Fake News
Rowan Zellers
Ari Holtzman
Hannah Rashkin
Yonatan Bisk
Ali Farhadi
Franziska Roesner
Yejin Choi
AAML
55
1,000
0
29 May 2019
Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain)
Mariya Toneva
Leila Wehbe
MILM
AI4CE
42
220
0
28 May 2019
Combating Adversarial Misspellings with Robust Word Recognition
Danish Pruthi
Bhuwan Dhingra
Zachary Chase Lipton
25
300
0
27 May 2019
STAR-GCN: Stacked and Reconstructed Graph Convolutional Networks for Recommender Systems
Jiani Zhang
Xingjian Shi
Shenglin Zhao
Irwin King
29
225
0
27 May 2019
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
Guillaume Jaume
H. K. Ekenel
Jean-Philippe Thiran
143
357
0
27 May 2019
Levenshtein Transformer
Jiatao Gu
Changhan Wang
Jake Zhao
49
359
0
27 May 2019
AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence
Jeff Clune
17
116
0
27 May 2019
Where's My Head? Definition, Dataset and Models for Numeric Fused-Heads Identification and Resolution
Yanai Elazar
Yoav Goldberg
19
23
0
26 May 2019
TIGS: An Inference Algorithm for Text Infilling with Gradient Search
Dayiheng Liu
Jie Fu
Pengfei Liu
Jiancheng Lv
DiffM
21
27
0
26 May 2019
Hashing based Answer Selection
Dong Xu
Wu-Jun Li
19
6
0
26 May 2019
Stochastic Shared Embeddings: Data-driven Regularization of Embedding Layers
Liwei Wu
Shuqing Li
Cho-Jui Hsieh
James Sharpnack
21
31
0
25 May 2019
Human vs. Muppet: A Conservative Estimate of Human Performance on the GLUE Benchmark
Nikita Nangia
Samuel R. Bowman
ELM
ALM
34
75
0
24 May 2019
Discrete Flows: Invertible Generative Models of Discrete Data
Dustin Tran
Keyon Vafa
Kumar Krishna Agrawal
Laurent Dinh
Ben Poole
DRL
24
114
0
24 May 2019
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
96
1,413
0
24 May 2019
Previous
1
2
3
...
355
356
357
...
359
360
361
Next