1810.04805
Cited By
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 18,005 papers shown
Stochastic Shared Embeddings: Data-driven Regularization of Embedding Layers
Liwei Wu
Shuqing Li
Cho-Jui Hsieh
James Sharpnack
21
31
0
25 May 2019
Human vs. Muppet: A Conservative Estimate of Human Performance on the GLUE Benchmark
Nikita Nangia
Samuel R. Bowman
ELM
ALM
34
75
0
24 May 2019
Discrete Flows: Invertible Generative Models of Discrete Data
Dustin Tran
Keyon Vafa
Kumar Krishna Agrawal
Laurent Dinh
Ben Poole
DRL
24
114
0
24 May 2019
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
96
1,413
0
24 May 2019
Personalizing Dialogue Agents via Meta-Learning
Zhaojiang Lin
Andrea Madotto
Chien-Sheng Wu
Pascale Fung
58
180
0
24 May 2019
Zero-shot Knowledge Transfer via Adversarial Belief Matching
P. Micaelli
Amos Storkey
19
228
0
23 May 2019
Misspelling Oblivious Word Embeddings
Bora Edizel
Aleksandra Piktus
Piotr Bojanowski
Rui A. Ferreira
Edouard Grave
Fabrizio Silvestri
12
63
0
23 May 2019
Deep Q-Learning with Q-Matrix Transfer Learning for Novel Fire Evacuation Environment
Jivitesh Sharma
Per-Arne Andersen
Ole-Christoffer Granmo
M. G. Olsen
AI4CE
35
68
0
23 May 2019
An Investigation of Transfer Learning-Based Sentiment Analysis in Japanese
Enkhbold Bataa
Joshua Wu
21
33
0
23 May 2019
Data-Efficient Image Recognition with Contrastive Predictive Coding
Olivier J. Hénaff
A. Srinivas
J. Fauw
Ali Razavi
Carl Doersch
S. M. Ali Eslami
Aaron van den Oord
SSL
58
1,417
0
22 May 2019
AMR Parsing as Sequence-to-Graph Transduction
Sheng Zhang
Xutai Ma
Kevin Duh
Benjamin Van Durme
33
148
0
21 May 2019
Answering while Summarizing: Multi-task Learning for Multi-hop QA with Evidence Extraction
Kosuke Nishida
Kyosuke Nishida
Masaaki Nagata
Atsushi Otsuka
Itsumi Saito
Hisako Asano
J. Tomita
RALM
19
102
0
21 May 2019
Interpretable Neural Predictions with Differentiable Binary Variables
Jasmijn Bastings
Wilker Aziz
Ivan Titov
32
211
0
20 May 2019
Human-like machine thinking: Language guided imagination
Feng Qi
Wenchuan Wu
AI4CE
MLLM
16
5
0
18 May 2019
Story Ending Prediction by Transferable BERT
Zhongyang Li
Xiao Ding
Ting Liu
34
52
0
17 May 2019
Adaptation of Deep Bidirectional Multilingual Transformers for Russian Language
Yuri Kuratov
M. Arkhipov
11
274
0
17 May 2019
Gmail Smart Compose: Real-Time Assisted Writing
Mengzhao Chen
Benjamin Lee
G. Bansal
Yuan Cao
Shuyuan Zhang
...
Yinan Wang
Andrew M. Dai
Zhehuai Chen
Timothy Sohn
Yonghui Wu
21
203
0
17 May 2019
An Information Theoretic Interpretation to Deep Neural Networks
Shao-Lun Huang
Xiangxiang Xu
Lizhong Zheng
G. Wornell
FAtt
22
41
0
16 May 2019
HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization
Xingxing Zhang
Furu Wei
M. Zhou
37
377
0
16 May 2019
What do you learn from context? Probing for sentence structure in contextualized word representations
Ian Tenney
Patrick Xia
Berlin Chen
Alex Jinpeng Wang
Adam Poliak
...
Najoung Kim
Benjamin Van Durme
Samuel R. Bowman
Dipanjan Das
Ellie Pavlick
91
848
0
15 May 2019
A Surprisingly Robust Trick for Winograd Schema Challenge
Vid Kocijan
Ana-Maria Cretu
Oana-Maria Camburu
Yordan Yordanov
Thomas Lukasiewicz
26
101
0
15 May 2019
BERT Rediscovers the Classical NLP Pipeline
Ian Tenney
Dipanjan Das
Ellie Pavlick
MILM
SSeg
50
1,441
0
15 May 2019
Sense Vocabulary Compression through the Semantic Knowledge of WordNet for Neural Word Sense Disambiguation
Loïc Vial
Benjamin Lecouteux
D. Schwab
16
90
0
14 May 2019
Style Transformer: Unpaired Text Style Transfer without Disentangled Latent Representation
Ning Dai
Jianze Liang
Xipeng Qiu
Xuanjing Huang
DRL
16
202
0
14 May 2019
Entity-Relation Extraction as Multi-Turn Question Answering
Xiaoya Li
Fan Yin
Zijun Sun
Xiayu Li
Arianna Yuan
Duo Chai
Mingxin Zhou
Jiwei Li
33
346
0
14 May 2019
PatentBERT: Patent Classification with Fine-Tuning a pre-trained BERT Model
Jieh-Sheng Lee
J. Hsiang
11
93
0
14 May 2019
A Review of Keyphrase Extraction
Eirini Papagiannopoulou
Grigorios Tsoumakas
21
166
0
13 May 2019
Almost Unsupervised Text to Speech and Automatic Speech Recognition
Yi Ren
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
44
101
0
13 May 2019
Synchronous Bidirectional Neural Machine Translation
Long Zhou
Jiajun Zhang
Chengqing Zong
22
106
0
13 May 2019
A logical-based corpus for cross-lingual evaluation
Felipe Salvatore
Marcelo Finger
R. Hirata
21
1
0
10 May 2019
Deep Unsupervised Cardinality Estimation
Zongheng Yang
Eric Liang
Amog Kamsetty
Chenggang Wu
Yan Duan
Peter Chen
Pieter Abbeel
J. M. Hellerstein
S. Krishnan
Ion Stoica
27
203
0
10 May 2019
Language Modeling with Deep Transformers
Kazuki Irie
Albert Zeyer
Ralf Schluter
Hermann Ney
KELM
43
171
0
10 May 2019
Improving Discrete Latent Representations With Differentiable Approximation Bridges
Jason Ramapuram
Russ Webb
DRL
19
9
0
09 May 2019
Unified Language Model Pre-training for Natural Language Understanding and Generation
Li Dong
Nan Yang
Wenhui Wang
Furu Wei
Xiaodong Liu
Yu-Chiang Frank Wang
Jianfeng Gao
M. Zhou
H. Hon
ELM
AI4CE
80
1,551
0
08 May 2019
Show, Price and Negotiate: A Negotiator with Online Value Look-Ahead
Amin Parvaneh
Ehsan Abbasnejad
Qi Wu
Javen Qinfeng Shi
Anton van den Hengel
OffRL
29
5
0
07 May 2019
Neural Architecture Refinement: A Practical Way for Avoiding Overfitting in NAS
Yangzhou Jiang
Cong Zhao
Zeyang Dou
Lei Pang
14
5
0
07 May 2019
Taming Pretrained Transformers for Extreme Multi-label Text Classification
Wei-Cheng Chang
Hsiang-Fu Yu
Kai Zhong
Yiming Yang
Inderjit Dhillon
25
20
0
07 May 2019
Investigating the Successes and Failures of BERT for Passage Re-Ranking
Harshith Padigela
Hamed Zamani
W. Bruce Croft
19
47
0
05 May 2019
Learning to Denoise Distantly-Labeled Data for Entity Typing
Yasumasa Onoe
Greg Durrett
27
57
0
04 May 2019
ASER: A Large-scale Eventuality Knowledge Graph
Hongming Zhang
Xin Liu
Haojie Pan
Yangqiu Song
C. Leung
SLR
27
159
0
01 May 2019
Deep Learning for Audio Signal Processing
Hendrik Purwins
Bo-wen Li
Tuomas Virtanen
Jan Schlüter
Shuo-yiin Chang
Tara N. Sainath
VLM
26
586
0
30 Apr 2019
Very Deep Self-Attention Networks for End-to-End Speech Recognition
Ngoc-Quan Pham
T. Nguyen
Jan Niehues
Markus Müller
Sebastian Stüker
A. Waibel
25
161
0
30 Apr 2019
Segmentation is All You Need
Zehua Cheng
Yuxiang Wu
Zhenghua Xu
Thomas Lukasiewicz
Weiyan Wang
33
20
0
30 Apr 2019
Towards Efficient Model Compression via Learned Global Ranking
Ting-Wu Chin
Ruizhou Ding
Cha Zhang
Diana Marculescu
16
170
0
28 Apr 2019
Improved Conditional VRNNs for Video Prediction
Lluis Castrejon
Nicolas Ballas
Aaron Courville
VGen
DRL
23
161
0
27 Apr 2019
TVQA+: Spatio-Temporal Grounding for Video Question Answering
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
31
227
0
25 Apr 2019
Probing What Different NLP Tasks Teach Machines about Function Word Comprehension
Najoung Kim
Roma Patel
Adam Poliak
Alex Jinpeng Wang
Patrick Xia
...
Alexis Ross
Tal Linzen
Benjamin Van Durme
Samuel R. Bowman
Ellie Pavlick
28
106
0
25 Apr 2019
Tetra-Tagging: Word-Synchronous Parsing with Linear-Time Inference
Nikita Kitaev
Dan Klein
38
20
0
22 Apr 2019
Understanding Roles and Entities: Datasets and Models for Natural Language Inference
Arindam Mitra
Ishan Shrivastava
Chitta Baral
28
2
0
22 Apr 2019
Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring
Samuel Humeau
Kurt Shuster
Marie-Anne Lachaux
Jason Weston
30
280
0
22 Apr 2019