1810.04805
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 23,688 papers shown
Title
Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment
Zewen Chi
Li Dong
Bo Zheng
Shaohan Huang
Xian-Ling Mao
Heyan Huang
Furu Wei
121
70
0
11 Jun 2021
Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution
Fanchao Qi
Yuan Yao
Sophia Xu
Zhiyuan Liu
Maosong Sun
SILM
77
132
0
11 Jun 2021
Dynamic Language Models for Continuously Evolving Content
Spurthi Amba Hombaiah
Tao Chen
Mingyang Zhang
Michael Bendersky
Marc Najork
CLL
KELM
103
38
0
11 Jun 2021
Going Beyond Linear Transformers with Recurrent Fast Weight Programmers
Kazuki Irie
Imanol Schlag
Róbert Csordás
Jürgen Schmidhuber
118
64
0
11 Jun 2021
EPICURE Ensemble Pretrained Models for Extracting Cancer Mutations from Literature
Jiarun Cao
E. M. Veen
Niels Peek
A. Renehan
Sophia Ananiadou
15
11
0
11 Jun 2021
AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation
Mingxiang Chen
Z. Chang
Hao Lu
Bitao Yang
Zhuang Li
Liufang Guo
Zhecheng Wang
SSL
27
10
0
11 Jun 2021
FedNLP: An interpretable NLP System to Decode Federal Reserve Communications
Jean Lee
Hoyoul Luis Youn
Nicholas Stevens
Josiah Poon
S. Han
55
10
0
11 Jun 2021
Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-based Multi-modal Context Modeling
Jingbei Li
Yi Meng
Chenyi Li
Zhiyong Wu
Helen Meng
Chao Weng
Jane Polak Scowcroft
93
24
0
11 Jun 2021
Sprachsynthese -- State-of-the-Art in englischer und deutscher Sprache
René Peinl
50
0
0
11 Jun 2021
CONDA: a CONtextual Dual-Annotated dataset for in-game toxicity understanding and detection
H. Weld
Guanghao Huang
Jean Lee
Tongshu Zhang
Kunze Wang
Xinghong Guo
Siqu Long
Josiah Poon
S. Han
97
18
0
11 Jun 2021
MlTr: Multi-label Classification with Transformer
Xingyi Cheng
Hezheng Lin
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Nian Shi
Honglin Liu
ViT
53
50
0
11 Jun 2021
BoB: BERT Over BERT for Training Persona-based Dialogue Models from Limited Personalized Data
Haoyu Song
Yan Wang
Kaiyan Zhang
Weinan Zhang
Ting Liu
65
123
0
11 Jun 2021
Assessing Political Prudence of Open-domain Chatbots
Yejin Bang
Nayeon Lee
Etsuko Ishii
Andrea Madotto
Pascale Fung
70
25
0
11 Jun 2021
Global Neighbor Sampling for Mixed CPU-GPU Training on Giant Graphs
Jialin Dong
Da Zheng
Lin F. Yang
George Karypis
GNN
54
37
0
11 Jun 2021
Monotonic Neural Network: combining Deep Learning with Domain Knowledge for Chiller Plants Energy Optimization
Fanhe Ma
Faen Zhang
Shenglan Ben
Shuxin Qin
Pengcheng Zhou
Changsheng Zhou
Fengyi Xu
62
0
0
11 Jun 2021
DORO: Distributional and Outlier Robust Optimization
Runtian Zhai
Chen Dan
J. Zico Kolter
Pradeep Ravikumar
55
62
0
11 Jun 2021
A comprehensive solution to retrieval-based chatbot construction
Kristen Moore
Shenjun Zhong
Zhen He
Torsten Rudolf
Nils Fisher
Brandon Victor
Neha Jindal
26
13
0
11 Jun 2021
ChemRL-GEM: Geometry Enhanced Molecular Representation Learning for Property Prediction
Xiaomin Fang
Lihang Liu
Jieqiong Lei
Donglong He
Shanzhuo Zhang
Jingbo Zhou
Fan Wang
Hua Wu
Haifeng Wang
AI4CE
91
462
0
11 Jun 2021
Bridging Subword Gaps in Pretrain-Finetune Paradigm for Natural Language Generation
Xin Liu
Baosong Yang
Dayiheng Liu
Haibo Zhang
Weihua Luo
Min Zhang
Haiying Zhang
Jinsong Su
63
18
0
11 Jun 2021
Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models
Matthew Finlayson
Aaron Mueller
Sebastian Gehrmann
Stuart M. Shieber
Tal Linzen
Yonatan Belinkov
141
110
0
10 Jun 2021
One Sense per Translation
B. Hauer
Grzegorz Kondrak
73
1
0
10 Jun 2021
Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning
Liangqiong Qu
Yuyin Zhou
Paul Pu Liang
Yingda Xia
Feifei Wang
Ehsan Adeli
L. Fei-Fei
D. Rubin
FedML
AI4CE
114
186
0
10 Jun 2021
Cross-lingual Emotion Detection
Sabit Hassan
Shaden Shaar
Kareem Darwish
53
12
0
10 Jun 2021
CodemixedNLP: An Extensible and Open NLP Toolkit for Code-Mixing
Sai Muralidhar Jayanthi
Kavya Nerella
Khyathi Chandu
A. Black
MoE
73
8
0
10 Jun 2021
Space-time Mixing Attention for Video Transformer
Adrian Bulat
Juan-Manuel Perez-Rua
Swathikiran Sudhakaran
Brais Martínez
Georgios Tzimiropoulos
ViT
95
127
0
10 Jun 2021
Scaling Vision with Sparse Mixture of Experts
C. Riquelme
J. Puigcerver
Basil Mustafa
Maxim Neumann
Rodolphe Jenatton
André Susano Pinto
Daniel Keysers
N. Houlsby
MoE
154
613
0
10 Jun 2021
Synthesizing Adversarial Negative Responses for Robust Response Ranking and Evaluation
Prakhar Gupta
Yulia Tsvetkov
Jeffrey P. Bigham
86
23
0
10 Jun 2021
A Semi-supervised Multi-task Learning Approach to Classify Customer Contact Intents
Li Dong
Matthew C. Spencer
Amir Biagi
54
3
0
10 Jun 2021
A Template-guided Hybrid Pointer Network for Knowledge-based Task-oriented Dialogue Systems
Dingmin Wang
Ziyao Chen
Wanwei He
Li Zhong
Yunzhe Tao
Min Yang
92
11
0
10 Jun 2021
Neural Text Classification and Stacked Heterogeneous Embeddings for Named Entity Recognition in SMM4H 2021
Usama Yaseen
Stefan Langer
41
13
0
10 Jun 2021
GroupBERT: Enhanced Transformer Architecture with Efficient Grouped Structures
Ivan Chelombiev
Daniel Justus
Douglas Orr
A. Dietrich
Frithjof Gressmann
A. Koliousis
Carlo Luschi
60
5
0
10 Jun 2021
CAT: Cross Attention in Vision Transformer
Hezheng Lin
Xingyi Cheng
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Qing Song
Wei Yuan
ViT
67
158
0
10 Jun 2021
Programming Puzzles
Tal Schuster
Ashwin Kalyan
Oleksandr Polozov
Adam Tauman Kalai
ELM
110
34
0
10 Jun 2021
Linguistically Informed Masking for Representation Learning in the Patent Domain
Sophia Althammer
Mark Buckley
Sebastian Hofstätter
Allan Hanbury
59
11
0
10 Jun 2021
SemEval-2021 Task 11: NLPContributionGraph -- Structuring Scholarly NLP Contributions for a Research Knowledge Graph
Jennifer D'Souza
Sören Auer
Ted Pedersen
94
32
0
10 Jun 2021
FEVEROUS: Fact Extraction and VERification Over Unstructured and Structured information
Rami Aly
Zhijiang Guo
Michael Schlichtkrull
James Thorne
Andreas Vlachos
Christos Christodoulopoulos
O. Cocarascu
Arpit Mittal
HILM
135
188
0
10 Jun 2021
AI-enabled Automation for Completeness Checking of Privacy Policies
Orlando Amaral
Sallam Abualhaija
Damiano Torre
M. Sabetzadeh
Lionel C. Briand
57
40
0
10 Jun 2021
Ruddit: Norms of Offensiveness for English Reddit Comments
Rishav Hada
S. Sudhir
Pushkar Mishra
H. Yannakoudakis
Saif M. Mohammad
Ekaterina Shutova
137
37
0
10 Jun 2021
MST: Masked Self-Supervised Transformer for Visual Representation
Zhaowen Li
Zhiyang Chen
Fan Yang
Wei Li
Yousong Zhu
...
Rui Deng
Liwei Wu
Rui Zhao
Ming Tang
Jinqiao Wang
ViT
102
168
0
10 Jun 2021
MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training
Mingliang Zeng
Xu Tan
Rui Wang
Zeqian Ju
Tao Qin
Tie-Yan Liu
70
136
0
10 Jun 2021
VT-SSum: A Benchmark Dataset for Video Transcript Segmentation and Summarization
Tengchao Lv
Lei Cui
M. Vasilijevic
Furu Wei
63
7
0
10 Jun 2021
Supervising the Transfer of Reasoning Patterns in VQA
Corentin Kervadec
Christian Wolf
G. Antipov
M. Baccouche
Madiha Nadri Wolf
79
11
0
10 Jun 2021
AUGNLG: Few-shot Natural Language Generation using Self-trained Data Augmentation
Xinnuo Xu
Guoyin Wang
Young-Bum Kim
Sungjin Lee
70
32
0
10 Jun 2021
Shades of BLEU, Flavours of Success: The Case of MultiWOZ
Tomáš Nekvinda
Ondřej Dušek
79
59
0
10 Jun 2021
How Robust are Model Rankings: A Leaderboard Customization Approach for Equitable Evaluation
Swaroop Mishra
Anjana Arunkumar
89
26
0
10 Jun 2021
Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models
Tyler A. Chang
Yifan Xu
Weijian Xu
Zhuowen Tu
ViT
59
15
0
10 Jun 2021
Semantic-aware Binary Code Representation with BERT
Hyungjoon Koo
Soyeon Park
Daejin Choi
Taesoo Kim
64
24
0
10 Jun 2021
Variational Information Bottleneck for Effective Low-Resource Fine-Tuning
Rabeeh Karimi Mahabadi
Yonatan Belinkov
James Henderson
DRL
76
76
0
10 Jun 2021
Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses
Richard Antonello
Javier S. Turek
Vy A. Vo
Alexander G. Huth
83
43
0
09 Jun 2021
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
Mandela Patrick
Dylan Campbell
Yuki M. Asano
Ishan Misra
Florian Metze
Christoph Feichtenhofer
Andrea Vedaldi
João F. Henriques
114
282
0
09 Jun 2021