1810.04805
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 23,688 papers shown
Title
Dual-view Molecule Pre-training
Jinhua Zhu
Yingce Xia
Tao Qin
Wen-gang Zhou
Houqiang Li
Tie-Yan Liu
AI4CE
116
52
0
17 Jun 2021
Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning
Colin Wei
Sang Michael Xie
Tengyu Ma
156
100
0
17 Jun 2021
Long-Short Temporal Contrastive Learning of Video Transformers
Jue Wang
Gedas Bertasius
Du Tran
Lorenzo Torresani
VLM
ViT
153
50
0
17 Jun 2021
Improving DNN Fault Tolerance using Weight Pruning and Differential Crossbar Mapping for ReRAM-based Edge AI
Geng Yuan
Zhiheng Liao
Xiaolong Ma
Yuxuan Cai
Zhenglun Kong
...
Hongwu Peng
Ning Liu
Ao Ren
Jinhui Wang
Yanzhi Wang
AAML
108
33
0
16 Jun 2021
A Simple Fix to Mahalanobis Distance for Improving Near-OOD Detection
Jie Jessie Ren
Stanislav Fort
J. Liu
Abhijit Guha Roy
Shreyas Padhy
Balaji Lakshminarayanan
UQCV
200
227
0
16 Jun 2021
QuantumFed: A Federated Learning Framework for Collaborative Quantum Training
Qi Xia
Qun Li
FedML
113
31
0
16 Jun 2021
Specializing Multilingual Language Models: An Empirical Study
Ethan C. Chau
Noah A. Smith
139
27
0
16 Jun 2021
Disentangling Online Chats with DAG-Structured LSTMs
D. Pappadopulo
Lisa Bauer
M. Farina
Ozan Irsoy
Joey Tianyi Zhou
66
5
0
16 Jun 2021
Invertible Attention
Jiajun Zha
Yiran Zhong
Jing Zhang
Leonid Sigal
Liang Zheng
82
7
0
16 Jun 2021
Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better
Gaurav Menghani
VLM
MedIm
120
391
0
16 Jun 2021
Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data
Haoming Jiang
Danqing Zhang
Tianyu Cao
Bing Yin
T. Zhao
NoLa
82
46
0
16 Jun 2021
A Fair and Comprehensive Comparison of Multimodal Tweet Sentiment Analysis Methods
Gullal Singh Cheema
Sherzod Hakimov
Eric Müller-Budack
Ralph Ewerth
68
20
0
16 Jun 2021
Eigen Analysis of Self-Attention and its Reconstruction from Partial Computation
Srinadh Bhojanapalli
Ayan Chakrabarti
Himanshu Jain
Sanjiv Kumar
Michal Lukasik
Andreas Veit
70
8
0
16 Jun 2021
To Raise or Not To Raise: The Autonomous Learning Rate Question
Xiaomeng Dong
Tao Tan
Michael Potter
Yun-Chan Tsai
Gaurav Kumar
V. R. Saripalli
Theodore Trafalis
OOD
50
2
0
16 Jun 2021
Input Invex Neural Network
Suman Sapkota
Binod Bhattarai
50
4
0
16 Jun 2021
Memorization and Generalization in Neural Code Intelligence Models
Md Rafiqul Islam Rabin
Aftab Hussain
Mohammad Amin Alipour
Vincent J. Hellendoorn
TDI
87
43
0
16 Jun 2021
On the proper role of linguistically-oriented deep net analysis in linguistic theorizing
Marco Baroni
135
53
0
16 Jun 2021
Alzheimer's Disease Detection from Spontaneous Speech through Combining Linguistic Complexity and (Dis)Fluency Features with Pretrained Language Models
Yu Qiao
Xuefeng Yin
Daniel Wiechmann
E. Kerz
75
23
0
16 Jun 2021
Eider: Empowering Document-level Relation Extraction with Efficient Evidence Extraction and Inference-stage Fusion
Yiqing Xie
Jiaming Shen
Sha Li
Yuning Mao
Jiawei Han
87
69
0
16 Jun 2021
Semantic sentence similarity: size does not always matter
Danny Merkx
S. Frank
M. Ernestus
46
6
0
16 Jun 2021
From Discourse to Narrative: Knowledge Projection for Event Relation Extraction
Jialong Tang
Hongyu Lin
M. Liao
Yaojie Lu
Xianpei Han
Le Sun
Weijian Xie
Jin Xu
67
24
0
16 Jun 2021
Out-of-Scope Intent Detection with Self-Supervision and Discriminative Training
Li-Ming Zhan
Haowen Liang
Bo Liu
Lu Fan
Xiao-Ming Wu
Albert Y. S. Lam
OODD
63
80
0
16 Jun 2021
Coreference-Aware Dialogue Summarization
Zhengyuan Liu
Ke Shi
Nancy F. Chen
82
60
0
16 Jun 2021
Scene Transformer: A unified architecture for predicting multiple agent trajectories
Jiquan Ngiam
Benjamin Caine
Vijay Vasudevan
Zhengdong Zhang
H. Chiang
...
Ashish Venugopal
David J. Weiss
Benjamin Sapp
Zhifeng Chen
Jonathon Shlens
131
168
0
15 Jun 2021
Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors
Junayed Mahmud
Fahim Faisal
Raihan Islam Arnob
Antonios Anastasopoulos
Kevin Moran
127
20
0
15 Jun 2021
Adversarial Attacks on Deep Models for Financial Transaction Records
I. Fursov
Matvey Morozov
N. Kaploukhaya
Elizaveta Kovtun
Rodrigo Rivera-Castro
Gleb Gusev
Dmitrii Babaev
Ivan Kireev
Alexey Zaytsev
Evgeny Burnaev
AAML
85
38
0
15 Jun 2021
Development of Quantized DNN Library for Exact Hardware Emulation
M. Kiyama
Motoki Amagasaki
M. Iida
MQ
35
0
0
15 Jun 2021
First Place Solution of KDD Cup 2021 & OGB Large-Scale Challenge Graph Prediction Track
Chengxuan Ying
Mingqi Yang
Shuxin Zheng
Guolin Ke
Shengjie Luo
Tianle Cai
Chenglin Wu
Yuxin Wang
Yanming Shen
Di He
49
11
0
15 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
411
2,858
0
15 Jun 2021
Interpretable Self-supervised Multi-task Learning for COVID-19 Information Retrieval and Extraction
Nima Ebadi
Peyman Najafirad
41
0
0
15 Jun 2021
PairConnect: A Compute-Efficient MLP Alternative to Attention
Zhaozhuo Xu
Minghao Yan
Junyan Zhang
Anshumali Shrivastava
52
1
0
15 Jun 2021
Consistency Regularization for Cross-Lingual Fine-Tuning
Bo Zheng
Li Dong
Shaohan Huang
Wenhui Wang
Zewen Chi
Saksham Singhal
Wanxiang Che
Ting Liu
Xia Song
Furu Wei
62
58
0
15 Jun 2021
Question Answering Infused Pre-training of General-Purpose Contextualized Representations
Robin Jia
M. Lewis
Luke Zettlemoyer
82
29
0
15 Jun 2021
Direction is what you need: Improving Word Embedding Compression in Large Language Models
Klaudia Bałazy
Mohammadreza Banaei
R. Lebret
Jacek Tabor
Karl Aberer
55
7
0
15 Jun 2021
Evaluating Modules in Graph Contrastive Learning
Ganqu Cui
Y. Du
Cheng Yang
Jie Zhou
Liang Xu
Xing Zhou
Lifeng Wang
Zhiyuan Liu
59
4
0
15 Jun 2021
CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark
Ningyu Zhang
Mosha Chen
Zhen Bi
Xiaozhuan Liang
Lei Li
...
Jun Yan
Hongying Zan
Kunli Zhang
Buzhou Tang
Qingcai Chen
LM&MA
ELM
93
193
0
15 Jun 2021
Multivariate Business Process Representation Learning utilizing Gramian Angular Fields and Convolutional Neural Networks
P. Pfeiffer
Johannes Lahann
Peter Fettke
SSL
66
17
0
15 Jun 2021
Zero-shot Node Classification with Decomposed Graph Prototype Network
Zheng Wang
Jialong Wang
Yuchen Guo
Zhiguo Gong
74
37
0
15 Jun 2021
Unsupervised Abstractive Opinion Summarization by Generating Sentences with Tree-Structured Topic Guidance
Masaru Isonuma
Junichiro Mori
Danushka Bollegala
Ichiro Sakata
61
27
0
15 Jun 2021
Simple GNN Regularisation for 3D Molecular Property Prediction & Beyond
Jonathan Godwin
Michael Schaarschmidt
Alex Gaunt
Alvaro Sanchez-Gonzalez
Yulia Rubanova
Petar Veličković
J. Kirkpatrick
Peter W. Battaglia
109
60
0
15 Jun 2021
Incorporating Word Sense Disambiguation in Neural Language Models
Jan Philip Wahle
Terry Ruas
Norman Meuschke
Bela Gipp
67
11
0
15 Jun 2021
Deriving Word Vectors from Contextualized Language Models using Topic-Aware Mention Selection
Yixiao Wang
Zied Bouraoui
Luis Espinosa Anke
Steven Schockaert
46
4
0
15 Jun 2021
Modeling morphology with Linear Discriminative Learning: considerations and design choices
Maria Heitmeier
Yu-Ying Chuang
R. Harald Baayen
86
20
0
15 Jun 2021
BERT Embeddings for Automatic Readability Assessment
Joseph Marvin Imperial
67
39
0
15 Jun 2021
An Automated Quality Evaluation Framework of Psychotherapy Conversations with Local Quality Estimates
Zhuohao Chen
Nikolaos Flemotomos
Karan Singla
Torrey A. Creed
David C. Atkins
Shrikanth Narayanan
68
5
0
15 Jun 2021
Vision-Language Navigation with Random Environmental Mixup
Chong Liu
Fengda Zhu
Xiaojun Chang
Xiaodan Liang
Zongyuan Ge
Yi-Dong Shen
LM&Ro
150
88
0
15 Jun 2021
Teacher-Student MixIT for Unsupervised and Semi-supervised Speech Separation
Jisi Zhang
Catalin Zorila
R. Doddipatla
Jon Barker
56
22
0
15 Jun 2021
Assessing the Use of Prosody in Constituency Parsing of Imperfect Transcripts
Trang Tran
Mari Ostendorf
53
5
0
14 Jun 2021
Can BERT Dig It? -- Named Entity Recognition for Information Retrieval in the Archaeology Domain
Alex Brandsen
Suzan Verberne
K. Lambers
M. Wansleeben
65
38
0
14 Jun 2021
Improving Paraphrase Detection with the Adversarial Paraphrasing Task
Animesh Nighojkar
John Licato
70
39
0
14 Jun 2021