1810.04805
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 23,555 papers shown
Title
LakhNES: Improving multi-instrumental music generation with cross-domain pre-training
Chris Donahue
H. H. Mao
Yiting Li
G. Cottrell
Julian McAuley
94
119
0
10 Jul 2019
Sparse Networks from Scratch: Faster Training without Losing Performance
Tim Dettmers
Luke Zettlemoyer
159
342
0
10 Jul 2019
BAM! Born-Again Multi-Task Networks for Natural Language Understanding
Kevin Clark
Minh-Thang Luong
Urvashi Khandelwal
Christopher D. Manning
Quoc V. Le
101
230
0
10 Jul 2019
Large Memory Layers with Product Keys
Guillaume Lample
Alexandre Sablayrolles
Marc'Aurelio Ranzato
Ludovic Denoyer
Hervé Jégou
MoE
93
135
0
10 Jul 2019
GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing
Jian Guo
He He
Tong He
Leonard Lausen
Mu Li
...
Hang Zhang
Zhi-Li Zhang
Zhongyue Zhang
Shuai Zheng
Yi Zhu
VLM
BDL
119
198
0
09 Jul 2019
Transfer Learning from Audio-Visual Grounding to Speech Recognition
Wei-Ning Hsu
David Harwath
James R. Glass
SSL
65
32
0
09 Jul 2019
Cross-Domain Generalization of Neural Constituency Parsers
Daniel Fried
Nikita Kitaev
Dan Klein
NAI
AI4CE
83
37
0
09 Jul 2019
Positional Normalization
Boyi Li
Felix Wu
Kilian Q. Weinberger
Serge J. Belongie
78
92
0
09 Jul 2019
To Tune or Not To Tune? How About the Best of Both Worlds?
Ran A. Wang
Haibo Su
Chunye Wang
Kailin Ji
J. Ding
VLM
85
17
0
09 Jul 2019
Incorporating Query Term Independence Assumption for Efficient Retrieval and Ranking using Deep Neural Networks
Bhaskar Mitra
Corby Rosset
D. Hawking
Nick Craswell
Fernando Diaz
Emine Yilmaz
70
30
0
08 Jul 2019
Searching for Effective Neural Extractive Summarization: What Works and What's Next
Ming Zhong
Pengfei Liu
Danqing Wang
Xipeng Qiu
Xuanjing Huang
AI4TS
179
151
0
08 Jul 2019
Infer Implicit Contexts in Real-time Online-to-Offline Recommendation
Xichen Ding
Jie Tang
T. Liu
Cheng Xu
Yaping Zhang
Feng Shi
Qixia Jiang
Dan Shen
OffRL
CML
31
15
0
08 Jul 2019
Attending to Emotional Narratives
Zhengxuan Wu
Xiyu Zhang
Zhi-Xuan Tan
Jamil Zaki
Desmond C. Ong
AI4TS
57
18
0
08 Jul 2019
Improving Neural Relation Extraction with Implicit Mutual Relations
Jun Kuang
Yixin Cao
Jianbing Zheng
Xiangnan He
Ming Gao
Aoying Zhou
AI4TS
65
22
0
08 Jul 2019
Improving short text classification through global augmentation methods
Vukosi Marivate
T. Sefara
VLM
67
96
0
07 Jul 2019
Neural Aspect and Opinion Term Extraction with Mined Rules as Weak Supervision
Hongliang Dai
Yangqiu Song
80
108
0
07 Jul 2019
Graph based Neural Networks for Event Factuality Prediction using Syntactic and Semantic Structures
Amir Pouran Ben Veyseh
Thien Huu Nguyen
Dejing Dou
94
45
0
07 Jul 2019
BERT-DST: Scalable End-to-End Dialogue State Tracking with Bidirectional Encoder Representations from Transformer
Guan-Lin Chao
Ian Lane
90
103
0
05 Jul 2019
Graph Representation Learning via Hard and Channel-Wise Attention Networks
Hongyang Gao
Shuiwang Ji
GNN
70
57
0
05 Jul 2019
Invariant Risk Minimization
Martín Arjovsky
Léon Bottou
Ishaan Gulrajani
David Lopez-Paz
OOD
292
2,252
0
05 Jul 2019
Multi-lingual Intent Detection and Slot Filling in a Joint BERT-based Model
Giuseppe Castellucci
Valentina Bellomaria
Andrea Favalli
Raniero Romagnoli
VLM
61
75
0
05 Jul 2019
Head-Driven Phrase Structure Grammar Parsing on Penn Treebank
Junru Zhou
Zhao Hai
127
144
0
05 Jul 2019
Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings
Zenan Zhai
Dat Quoc Nguyen
S. Akhondi
Camilo Thorne
Christian Druckenbrodt
Trevor Cohn
M. Gregory
Karin Verspoor
63
42
0
05 Jul 2019
Transfer Learning for Risk Classification of Social Media Posts: Model Evaluation Study
Derek Howard
M. Maslej
Justin Lee
Jacob Ritchie
G. Woollard
L. French
AI4MH
50
30
0
04 Jul 2019
Multi-Task Learning for Coherence Modeling
Youmna Farag
H. Yannakoudakis
59
26
0
04 Jul 2019
Graph-based Knowledge Distillation by Multi-head Attention Network
Seunghyun Lee
B. Song
92
77
0
04 Jul 2019
An External Knowledge Enhanced Multi-label Charge Prediction Approach with Label Number Learning
Duan Wei
Li Lin
AILaw
110
5
0
04 Jul 2019
Depth Growing for Neural Machine Translation
Lijun Wu
Yiren Wang
Yingce Xia
Fei Tian
Fei Gao
Tao Qin
Jianhuang Lai
Tie-Yan Liu
68
41
0
03 Jul 2019
Encoding high-cardinality string categorical variables
Patricio Cerda
Gaël Varoquaux
87
91
0
03 Jul 2019
Augmenting Self-attention with Persistent Memory
Sainbayar Sukhbaatar
Edouard Grave
Guillaume Lample
Hervé Jégou
Armand Joulin
RALM
KELM
77
139
0
02 Jul 2019
How we do things with words: Analyzing text as social and cultural data
D. Nguyen
Maria Liakata
Simon DeDeo
Jacob Eisenstein
David M. Mimno
Rebekah Tromble
J. Winters
71
88
0
02 Jul 2019
Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems
Hung Le
Doyen Sahoo
Nancy F. Chen
Guosheng Lin
70
112
0
02 Jul 2019
Neural Machine Reading Comprehension: Methods and Trends
Kaixuan Li
Xiujuan Xian
Sheng Zhang
Jiafu Wang
N. Yu
FaML
79
13
0
02 Jul 2019
Neural Semantic Parsing with Anonymization for Command Understanding in General-Purpose Service Robots
Nick Walker
Yu-Tang Peng
Maya Cakmak
70
15
0
02 Jul 2019
Claim Extraction in Biomedical Publications using Deep Discourse Model and Transfer Learning
Titipat Achakulvisut
Chandra Bhagavatula
Daniel Ernesto Acuna
Konrad Paul Kording
45
38
0
01 Jul 2019
Katecheo: A Portable and Modular System for Multi-Topic Question Answering
S. Hirekodi
Seban Sunny
Leonard Topno
Alwin Daniel
Daniel Whitenack
Reuben Skewes
Stuart Cranney
KELM
31
1
0
01 Jul 2019
Learning World Graphs to Accelerate Hierarchical Reinforcement Learning
Wenling Shang
Alexander R. Trott
Stephan Zheng
Caiming Xiong
R. Socher
92
18
0
01 Jul 2019
Do Transformer Attention Heads Provide Transparency in Abstractive Summarization?
Joris Baan
Maartje ter Hoeve
M. V. D. Wees
Anne Schuth
Maarten de Rijke
85
21
0
01 Jul 2019
Few-Shot Representation Learning for Out-Of-Vocabulary Words
Ziniu Hu
Ting-Li Chen
Kai-Wei Chang
Yizhou Sun
91
77
0
01 Jul 2019
Patent Claim Generation by Fine-Tuning OpenAI GPT-2
Jieh-Sheng Lee
J. Hsiang
142
151
0
01 Jul 2019
The University of Sydney's Machine Translation System for WMT19
Liang Ding
Dacheng Tao
57
13
0
30 Jun 2019
ICDAR 2019 Competition on Scene Text Visual Question Answering
Ali Furkan Biten
Rubèn Pérez Tito
Andrés Mafla
Lluís Gómez
Marçal Rusiñol
Minesh Mathew
C. V. Jawahar
Ernest Valveny
Dimosthenis Karatzas
76
76
0
30 Jun 2019
BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition
Shaoshi Ling
Julian Salazar
Yuzong Liu
Katrin Kirchhoff
SSL
93
28
0
30 Jun 2019
Self-Supervised Dialogue Learning
Jiawei Wu
Xin Eric Wang
William Yang Wang
SSL
73
58
0
30 Jun 2019
Machine Reading Comprehension: a Literature Review
Xin Zhang
An Yang
Sujian Li
Yizhong Wang
91
33
0
30 Jun 2019
Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting
Shiyang Li
Xiaoyong Jin
Yao Xuan
Xiyou Zhou
Wenhu Chen
Yu Wang
Xifeng Yan
AI4TS
207
1,452
0
29 Jun 2019
Deep Gamblers: Learning to Abstain with Portfolio Theory
Liu Ziyin
Zhikang T. Wang
Paul Pu Liang
Ruslan Salakhutdinov
Louis-Philippe Morency
Masahito Ueda
116
114
0
29 Jun 2019
GPT-based Generation for Classical Chinese Poetry
Yi-Lun Liao
Yasheng Wang
Qun Liu
Xin Jiang
79
40
0
29 Jun 2019
Relating Simple Sentence Representations in Deep Neural Networks and the Brain
Sharmistha Jat
Hao Tang
Partha P. Talukdar
Tom Michael Mitchell
70
22
0
27 Jun 2019
Compositional Semantic Parsing Across Graphbanks
Matthias Lindemann
Jonas Groschwitz
Alexander Koller
GNN
70
53
0
27 Jun 2019