1810.04805
Cited By
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 23,670 papers shown
Title
DESCGEN: A Distantly Supervised Dataset for Generating Abstractive Entity Descriptions
Weijia Shi
Mandar Joshi
Luke Zettlemoyer
3DV
62
3
0
09 Jun 2021
End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering
Devendra Singh Sachan
Siva Reddy
William L. Hamilton
Chris Dyer
Dani Yogatama
OOD
RALM
101
170
0
09 Jun 2021
Grover's Algorithm for Question Answering
Adriana D. Correia
M. Moortgat
H. Stoof
87
5
0
09 Jun 2021
Generative Models as a Data Source for Multiview Representation Learning
Ali Jahanian
Xavier Puig
Yonglong Tian
Phillip Isola
101
129
0
09 Jun 2021
URLTran: Improving Phishing URL Detection Using Transformers
Pranav Maneriker
Jack W. Stokes
Edir Garcia Lazo
Diana Carutasu
Farid Tajaddodianfar
A. Gururajan
60
64
0
09 Jun 2021
Bayesian Attention Belief Networks
Shujian Zhang
Xinjie Fan
Bo Chen
Mingyuan Zhou
BDL
114
32
0
09 Jun 2021
XBNet : An Extremely Boosted Neural Network
Tushar Sarkar
LMTD
31
22
0
09 Jun 2021
Do Transformers Really Perform Bad for Graph Representation?
Chengxuan Ying
Tianle Cai
Shengjie Luo
Shuxin Zheng
Guolin Ke
Di He
Yanming Shen
Tie-Yan Liu
GNN
117
445
0
09 Jun 2021
Key Information Extraction From Documents: Evaluation And Generator
Oliver Bensch
Mirela C. Popa
Constantin Spille
42
14
0
09 Jun 2021
Pretrained Encoders are All You Need
Mina Khan
P. Srivatsa
Advait Rane
Shriram Chenniappa
Rishabh Anand
Sherjil Ozair
Pattie Maes
SSL
VLM
83
6
0
09 Jun 2021
Order-Agnostic Cross Entropy for Non-Autoregressive Machine Translation
Cunxiao Du
Zhaopeng Tu
Jing Jiang
83
88
0
09 Jun 2021
Salient Object Ranking with Position-Preserved Attention
Haoyang Fang
Daoxin Zhang
Yi Zhang
Minghao Chen
Jiawei Li
Yao Hu
Deng Cai
Xiaofei He
71
21
0
09 Jun 2021
Crosslingual Embeddings are Essential in UNMT for Distant Languages: An English to IndoAryan Case Study
Tamali Banerjee
V. Rudra Murthy
P. Bhattacharyya
81
9
0
09 Jun 2021
Psycholinguistic Tripartite Graph Network for Personality Detection
Tao Yang
Feifan Yang
Haolan Ouyang
Xiaojun Quan
65
23
0
09 Jun 2021
Phraseformer: Multimodal Key-phrase Extraction using Transformer and Graph Embedding
Narjes Nikzad Khasmakhi
M. Feizi-Derakhshi
M. Asgari-Chenaghlu
M. Balafar
Ali Reza Feizi Derakhshi
Taymaz Rahkar-Farshi
Majid Ramezani
Zoleikha Jahanbakhsh-Nagadeh
E. Zafarani-Moattar
Mehrdad Ranjbar-Khadivi
60
23
0
09 Jun 2021
Neural Supervised Domain Adaptation by Augmenting Pre-trained Models with Random Units
Sara Meftah
N. Semmar
Y. Tamaazousti
H. Essafi
F. Sadat
67
3
0
09 Jun 2021
Reliable Adversarial Distillation with Unreliable Teachers
Jianing Zhu
Jiangchao Yao
Bo Han
Jingfeng Zhang
Tongliang Liu
Gang Niu
Jingren Zhou
Jianliang Xu
Hongxia Yang
AAML
104
66
0
09 Jun 2021
Self-supervision of Feature Transformation for Further Improving Supervised Learning
Zilin Ding
Yuhang Yang
Xuan Cheng
Xiaomin Wang
Ming-Yuan Liu
SSL
34
2
0
09 Jun 2021
Automatic Sexism Detection with Multilingual Transformer Models
Mina Schütz
Jaqueline Boeck
Daria Liakhovets
D. Slijepcevic
Armin Kirchknopf
Manuel Hecht
Johannes Bogensperger
S. Schlarb
Alexander Schindler
Matthias Zeppelzauer
40
29
0
09 Jun 2021
Joint System-Wise Optimization for Pipeline Goal-Oriented Dialog System
Zichuan Lin
Jing Huang
Bowen Zhou
Xiaodong He
Tengyu Ma
OffRL
43
3
0
09 Jun 2021
Probing Multilingual Language Models for Discourse
Murathan Kurfali
Robert Östling
76
17
0
09 Jun 2021
Catchphrase: Automatic Detection of Cultural References
Nir Sweed
Dafna Shahaf
55
4
0
09 Jun 2021
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
157
1,223
0
09 Jun 2021
Pretraining Representations for Data-Efficient Reinforcement Learning
Max Schwarzer
Nitarshan Rajkumar
Michael Noukhovitch
Ankesh Anand
Laurent Charlin
Devon Hjelm
Philip Bachman
Aaron Courville
OffRL
115
118
0
09 Jun 2021
On Sample Based Explanation Methods for NLP: Efficiency, Faithfulness, and Semantic Evaluation
Wei Zhang
Ziming Huang
Yada Zhu
Guangnan Ye
Xiaodong Cui
Fan Zhang
123
18
0
09 Jun 2021
Self-Supervised Graph Learning with Hyperbolic Embedding for Temporal Health Event Prediction
Chang Lu
Chandan K. Reddy
Yue Ning
62
31
0
09 Jun 2021
Investigating sanity checks for saliency maps with image and text classification
Narine Kokhlikyan
Vivek Miglani
B. Alsallakh
Miguel Martin
Orion Reblitz-Richardson
FAtt
23
13
0
08 Jun 2021
Compacter: Efficient Low-Rank Hypercomplex Adapter Layers
Rabeeh Karimi Mahabadi
James Henderson
Sebastian Ruder
MoE
157
495
0
08 Jun 2021
VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation
Linjie Li
Jie Lei
Zhe Gan
Licheng Yu
Yen-Chun Chen
...
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
Lijuan Wang
Zicheng Liu
VLM
123
103
0
08 Jun 2021
On the Lack of Robust Interpretability of Neural Text Classifiers
Muhammad Bilal Zafar
Michele Donini
Dylan Slack
Cédric Archambeau
Sanjiv Ranjan Das
K. Kenthapadi
AAML
70
21
0
08 Jun 2021
Self-Supervised Learning with Data Augmentations Provably Isolates Content from Style
Julius von Kügelgen
Yash Sharma
Luigi Gresele
Wieland Brendel
Bernhard Schölkopf
M. Besserve
Francesco Locatello
141
317
0
08 Jun 2021
Neural Extractive Search
Shauli Ravfogel
Hillel Taub-Tabib
Yoav Goldberg
43
3
0
08 Jun 2021
TIMEDIAL: Temporal Commonsense Reasoning in Dialog
Lianhui Qin
Aditya Gupta
Shyam Upadhyay
Luheng He
Yejin Choi
Manaal Faruqui
LRM
102
72
0
08 Jun 2021
BERT Learns to Teach: Knowledge Distillation with Meta Learning
Wangchunshu Zhou
Canwen Xu
Julian McAuley
134
87
0
08 Jun 2021
XtremeDistilTransformers: Task Transfer for Task-agnostic Distillation
Subhabrata Mukherjee
Ahmed Hassan Awadallah
Jianfeng Gao
59
22
0
08 Jun 2021
Scaling Vision Transformers
Xiaohua Zhai
Alexander Kolesnikov
N. Houlsby
Lucas Beyer
ViT
178
1,099
0
08 Jun 2021
A Survey of Transformers
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
211
1,150
0
08 Jun 2021
Learning from Multiple Noisy Partial Labelers
Peilin Yu
Tiffany Ding
Stephen H. Bach
NoLa
71
22
0
08 Jun 2021
Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks
Rabeeh Karimi Mahabadi
Sebastian Ruder
Mostafa Dehghani
James Henderson
MoE
86
314
0
08 Jun 2021
Muddling Label Regularization: Deep Learning for Tabular Datasets
Karim Lounici
Katia Méziani
Benjamin Riu
87
6
0
08 Jun 2021
Back2Future: Leveraging Backfill Dynamics for Improving Real-time Predictions in Future
Harshavardhan Kamarthi
Alexander Rodríguez
B. Prakash
AI4TS
94
14
0
08 Jun 2021
SynthRef: Generation of Synthetic Referring Expressions for Object Segmentation
Ioannis V. Kazakos
Carles Ventura
Míriam Bellver
Carina Silberer
Xavier Giró-i-Nieto
DiffM
31
2
0
08 Jun 2021
Using a New Nonlinear Gradient Method for Solving Large Scale Convex Optimization Problems with an Application on Arabic Medical Text
Jaafar Hammoud
A. Eisa
N. Dobrenko
N. Gusarova
25
1
0
08 Jun 2021
Speech BERT Embedding For Improving Prosody in Neural TTS
Liping Chen
Yan Deng
Xi Wang
Frank Soong
Lei He
92
23
0
08 Jun 2021
A Unified Generative Framework for Aspect-Based Sentiment Analysis
Hang Yan
Junqi Dai
Tuo Ji
Xipeng Qiu
Zheng Zhang
89
285
0
08 Jun 2021
Staircase Attention for Recurrent Processing of Sequences
Da Ju
Stephen Roller
Sainbayar Sukhbaatar
Jason Weston
87
11
0
08 Jun 2021
Giving Commands to a Self-Driving Car: How to Deal with Uncertain Situations?
Thierry Deruyttere
Victor Milewski
Marie-Francine Moens
72
15
0
08 Jun 2021
Dynamic Sparse Training for Deep Reinforcement Learning
Ghada Sokar
Elena Mocanu
Decebal Constantin Mocanu
Mykola Pechenizkiy
Peter Stone
111
60
0
08 Jun 2021
Incorporating NODE with Pre-trained Neural Differential Operator for Learning Dynamics
Shiqi Gong
Qi Meng
Yue Wang
Lijun Wu
Wei Chen
Zhi-Ming Ma
Tie-Yan Liu
59
4
0
08 Jun 2021
PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning
Tao Yu
Cuiling Lan
Wenjun Zeng
Mingxiao Feng
Zhizheng Zhang
Zhibo Chen
OffRL
101
46
0
08 Jun 2021