1810.04805
Cited By
v1
v2 (latest)
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 23,545 papers shown
Title
Cross-Lingual Syntactic Transfer through Unsupervised Adaptation of Invertible Projections
Junxian He
Zhisong Zhang
Taylor Berg-Kirkpatrick
Graham Neubig
75
21
0
06 Jun 2019
Generating Question-Answer Hierarchies
Kalpesh Krishna
Mohit Iyyer
83
39
0
06 Jun 2019
Unsupervised Pivot Translation for Distant Languages
Yichong Leng
Xu Tan
Tao Qin
Xiang-Yang Li
Tie-Yan Liu
51
30
0
06 Jun 2019
GCDT: A Global Context Enhanced Deep Transition Architecture for Sequence Labeling
Yanjun Liu
Fandong Meng
Jinchao Zhang
Jinan Xu
Jie Zhou
69
90
0
06 Jun 2019
Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations
Debraj Basu
Deepesh Data
C. Karakuş
Suhas Diggavi
MQ
76
408
0
06 Jun 2019
Explain Yourself! Leveraging Language Models for Commonsense Reasoning
Nazneen Rajani
Bryan McCann
Caiming Xiong
R. Socher
ReLM
LRM
133
567
0
06 Jun 2019
Energy and Policy Considerations for Deep Learning in NLP
Emma Strubell
Ananya Ganesh
Andrew McCallum
88
2,668
0
05 Jun 2019
Variational Pretraining for Semi-supervised Text Classification
Suchin Gururangan
T. Dang
Dallas Card
Noah A. Smith
VLM
66
112
0
05 Jun 2019
Extracting Symptoms and their Status from Clinical Conversations
Nan Du
Kai Chen
Anjuli Kannan
Linh Tran
Yuhui Chen
Izhak Shafran
67
68
0
05 Jun 2019
Learning to Rank for Plausible Plausibility
Zhongyang Li
Tongfei Chen
Benjamin Van Durme
59
22
0
05 Jun 2019
Neural Legal Judgment Prediction in English
Ilias Chalkidis
Ion Androutsopoulos
Nikolaos Aletras
AILaw
ELM
190
342
0
05 Jun 2019
Large-Scale Multi-Label Text Classification on EU Legislation
Ilias Chalkidis
Manos Fergadiotis
Prodromos Malakasiotis
Ion Androutsopoulos
AILaw
64
217
0
05 Jun 2019
From Balustrades to Pierre Vinken: Looking for Syntax in Transformer Self-Attentions
David Marecek
Rudolf Rosa
59
52
0
05 Jun 2019
Baby steps towards few-shot learning with multiple semantics
Eli Schwartz
Leonid Karlinsky
Rogerio Feris
Raja Giryes
A. Bronstein
VLM
117
107
0
05 Jun 2019
Towards Multimodal Sarcasm Detection (An _Obviously_ Perfect Paper)
Santiago Castro
Devamanyu Hazarika
Verónica Pérez-Rosas
Roger Zimmermann
Rada Mihalcea
Soujanya Poria
76
259
0
05 Jun 2019
Learning Deep Transformer Models for Machine Translation
Qiang Wang
Bei Li
Tong Xiao
Jingbo Zhu
Changliang Li
Derek F. Wong
Lidia S. Chao
108
673
0
05 Jun 2019
Entity-Centric Contextual Affective Analysis
Anjalie Field
Yulia Tsvetkov
91
30
0
05 Jun 2019
The Unreasonable Effectiveness of Transformer Language Models in Grammatical Error Correction
Dimitrios Alikaniotis
Vipul Raheja
72
23
0
04 Jun 2019
Open Sesame: Getting Inside BERT's Linguistic Knowledge
Yongjie Lin
Y. Tan
Robert Frank
89
287
0
04 Jun 2019
Towards Lossless Encoding of Sentences
Gabriele Prato
Mathieu Duchesneau
A. Chandar
Alain Tapp
59
2
0
04 Jun 2019
The Secrets of Machine Learning: Ten Things You Wish You Had Known Earlier to be More Effective at Data Analysis
Cynthia Rudin
David Carlson
HAI
128
34
0
04 Jun 2019
KERMIT: Generative Insertion-Based Modeling for Sequences
William Chan
Nikita Kitaev
Kelvin Guu
Mitchell Stern
Jakob Uszkoreit
VLM
96
65
0
04 Jun 2019
Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation
Benjamin Heinzerling
Michael Strube
53
36
0
04 Jun 2019
Training Neural Response Selection for Task-Oriented Dialogue Systems
Matthew Henderson
Ivan Vulić
D. Gerz
I. Casanueva
Paweł Budzianowski
Sam Coope
Georgios P. Spithourakis
Tsung-Hsien Wen
N. Mrksic
Pei-hao Su
54
111
0
04 Jun 2019
Blackbox meets blackbox: Representational Similarity and Stability Analysis of Neural Language Models and Brains
Samira Abnar
Lisa Beinborn
Rochelle Choenni
Willem H. Zuidema
89
76
0
04 Jun 2019
How multilingual is Multilingual BERT?
Telmo Pires
Eva Schlinger
Dan Garrette
LRM
VLM
266
1,418
0
04 Jun 2019
A Cross-Sentence Latent Variable Model for Semi-Supervised Text Sequence Matching
Jihun Choi
Taeuk Kim
Sang-goo Lee
BDL
77
6
0
04 Jun 2019
Converse Attention Knowledge Transfer for Low-Resource Named Entity Recognition
Shengfei Lyu
Linghao Sun
Huixiong Yi
Yong Liu
Huanhuan Chen
Steven C. H. Hoi
80
0
0
04 Jun 2019
A Review of Automated Speech and Language Features for Assessment of Cognitive and Thought Disorders
Rohit Voleti
J. Liss
Visar Berisha
77
72
0
04 Jun 2019
Detecting Local Insights from Global Labels: Supervised & Zero-Shot Sequence Labeling via a Convolutional Decomposition
A. Schmaltz
60
8
0
04 Jun 2019
Episodic Memory in Lifelong Language Learning
Cyprien de Masson d'Autume
Sebastian Ruder
Lingpeng Kong
Dani Yogatama
CLL
KELM
150
293
0
03 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL
OnRL
156
1,070
0
03 Jun 2019
Learning Representations by Maximizing Mutual Information Across Views
Philip Bachman
R. Devon Hjelm
William Buchwalter
SSL
230
1,482
0
03 Jun 2019
Gendered Ambiguous Pronouns Shared Task: Boosting Model Confidence by Evidence Pooling
Sandeep Attree
36
14
0
03 Jun 2019
Gender-preserving Debiasing for Pre-trained Word Embeddings
Masahiro Kaneko
Danushka Bollegala
FaML
72
131
0
03 Jun 2019
Masked Non-Autoregressive Image Captioning
Junlong Gao
Xi Meng
Shiqi Wang
Xia Li
Shanshe Wang
Siwei Ma
Wen Gao
80
39
0
03 Jun 2019
Resolving Gendered Ambiguous Pronouns with BERT
Kellie Webster
Marta Recasens
Ken Krige
Vera Axelrod
Denis Logvinenko
Jason Baldridge
107
54
0
03 Jun 2019
BAYHENN: Combining Bayesian Deep Learning and Homomorphic Encryption for Secure DNN Inference
Peichen Xie
Bingzhe Wu
Guangyu Sun
BDL
FedML
56
33
0
03 Jun 2019
Assessing the Ability of Self-Attention Networks to Learn Word Order
Baosong Yang
Longyue Wang
Derek F. Wong
Lidia S. Chao
Zhaopeng Tu
75
32
0
03 Jun 2019
Know More about Each Other: Evolving Dialogue Strategy via Compound Assessment
Siqi Bao
H. He
Fan Wang
Rongzhong Lian
Hua Wu
LLMAG
OffRL
74
18
0
03 Jun 2019
Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model
Aishwarya Bhandare
Vamsi Sripathi
Deepthi Karkada
Vivek V. Menon
Sun Choi
Kushal Datta
V. Saletore
MQ
95
132
0
03 Jun 2019
A Survey of Natural Language Generation Techniques with a Focus on Dialogue Systems - Past, Present and Future Directions
Sashank Santhanam
Samira Shaikh
3DV
79
52
0
02 Jun 2019
Pretraining Methods for Dialog Context Representation Learning
Shikib Mehri
E. Razumovskaia
Tiancheng Zhao
M. Eskénazi
117
84
0
02 Jun 2019
Does It Make Sense? And Why? A Pilot Study for Sense Making and Explanation
Cunxiang Wang
Shuailong Liang
Yue Zhang
Xiaonan Li
Tian Gao
LRM
90
111
0
02 Jun 2019
Pre-training of Graph Augmented Transformers for Medication Recommendation
Junyuan Shang
Tengfei Ma
Cao Xiao
Jimeng Sun
94
291
0
02 Jun 2019
Latent Retrieval for Weakly Supervised Open Domain Question Answering
Kenton Lee
Ming-Wei Chang
Kristina Toutanova
RALM
123
1,020
0
01 Jun 2019
Multimodal Transformer for Unaligned Multimodal Language Sequences
Yao-Hung Hubert Tsai
Shaojie Bai
Paul Pu Liang
J. Zico Kolter
Louis-Philippe Morency
Ruslan Salakhutdinov
105
1,319
0
01 Jun 2019
Efficient Adaptation of Pretrained Transformers for Abstractive Summarization
Andrew Hoang
Antoine Bosselut
Asli Celikyilmaz
Yejin Choi
76
41
0
01 Jun 2019
Scoring Sentence Singletons and Pairs for Abstractive Summarization
Logan Lebanoff
Kaiqiang Song
Franck Dernoncourt
Doo Soon Kim
Seokhwan Kim
W. Chang
Fei Liu
CVBM
126
106
0
31 May 2019
Pre-Training Graph Neural Networks for Generic Structural Feature Extraction
Ziniu Hu
Changjun Fan
Ting-Li Chen
Kai-Wei Chang
Yizhou Sun
68
44
0
31 May 2019