1810.04805
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 23,688 papers shown
Title
OadTR: Online Action Detection with Transformers
Xiang Wang
Shiwei Zhang
Zhiwu Qing
Yuanjie Shao
Zhe Zuo
Changxin Gao
Nong Sang
OffRL
ViT
110
117
0
21 Jun 2021
Learning to Rank Question Answer Pairs with Bilateral Contrastive Data Augmentation
Yang Deng
Wenxuan Zhang
W. Lam
32
3
0
21 Jun 2021
Interventional Video Grounding with Dual Contrastive Learning
Guoshun Nan
Rui Qiao
Yao Xiao
Jun Liu
Sicong Leng
H. Zhang
Wei Lu
98
145
0
21 Jun 2021
Trust It or Not: Confidence-Guided Automatic Radiology Report Generation
Yixin Wang
Zihao Lin
Zhe Xu
Haoyu Dong
Jiang Tian
Jie Luo
Zhongchao Shi
Yang Zhang
Jianping Fan
Zhiqiang He
UQCV
MedIm
122
12
0
21 Jun 2021
ArgFuse: A Weakly-Supervised Framework for Document-Level Event Argument Aggregation
Debanjana Kar
S. Sarkar
Pawan Goyal
36
3
0
21 Jun 2021
CIL: Contrastive Instance Learning Framework for Distantly Supervised Relation Extraction
Tao Chen
Haizhou Shi
Siliang Tang
Zhigang Chen
Leilei Gan
Yueting Zhuang
61
56
0
21 Jun 2021
Controllable Context-aware Conversational Speech Synthesis
Jian Cong
Shan Yang
Na Hu
Guangzhi Li
Lei Xie
Jane Polak Scowcroft
75
30
0
21 Jun 2021
Out of Context: A New Clue for Context Modeling of Aspect-based Sentiment Analysis
Bowen Xing
Ivor W. Tsang
48
11
0
21 Jun 2021
ROPE: Reading Order Equivariant Positional Encoding for Graph-based Document Information Extraction
Chen-Yu Lee
Chun-Liang Li
Chu Wang
Renshen Wang
Yasuhisa Fujii
Siyang Qin
Ashok Popat
Tomas Pfister
57
26
0
21 Jun 2021
OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation
Jongmin Lee
Wonseok Jeon
Byung-Jun Lee
J. Pineau
Kee-Eung Kim
OffRL
194
101
0
21 Jun 2021
Context-Aware Legal Citation Recommendation using Deep Learning
Zihan Huang
Charles Low
Mengqiu Teng
Hongyi Zhang
Daniel E. Ho
M. Krass
Matthias Grabmair
AILaw
HAI
73
39
0
20 Jun 2021
CPM-2: Large-scale Cost-effective Pre-trained Language Models
Zhengyan Zhang
Yuxian Gu
Xu Han
Shengqi Chen
Chaojun Xiao
...
Minlie Huang
Wentao Han
Yang Liu
Xiaoyan Zhu
Maosong Sun
MoE
105
88
0
20 Jun 2021
Tag, Copy or Predict: A Unified Weakly-Supervised Learning Framework for Visual Information Extraction using Sequences
Jiapeng Wang
Tianwei Wang
Guozhi Tang
Lianwen Jin
Weihong Ma
Kai Ding
Yichao Huang
86
12
0
20 Jun 2021
More than Encoder: Introducing Transformer Decoder to Upsample
Yijiang Li
Wentian Cai
Ying Gao
Chengming Li
Xiping Hu
ViT
MedIm
87
55
0
20 Jun 2021
Do Encoder Representations of Generative Dialogue Models Encode Sufficient Information about the Task ?
Prasanna Parthasarathi
J. Pineau
Sarath Chandar
67
2
0
20 Jun 2021
TweeNLP: A Twitter Exploration Portal for Natural Language Processing
Viraj Shah
Shruti Singh
M. Singh
24
2
0
19 Jun 2021
JointGT: Graph-Text Joint Representation Learning for Text Generation from Knowledge Graphs
Pei Ke
Haozhe Ji
Yuanyuan Ran
Xin Cui
Liwei Wang
Linfeng Song
Xiaoyan Zhu
Minlie Huang
127
97
0
19 Jun 2021
Enhancing Question Generation with Commonsense Knowledge
Xin Jia
Hao Wang
D. Yin
Hao Sun
54
6
0
19 Jun 2021
QFCNN: Quantum Fourier Convolutional Neural Network
Feihong Shen
Jun Liu
78
6
0
19 Jun 2021
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
Andreas Steiner
Alexander Kolesnikov
Xiaohua Zhai
Ross Wightman
Jakob Uszkoreit
Lucas Beyer
ViT
176
639
0
18 Jun 2021
Distributed Deep Learning in Open Collaborations
Michael Diskin
Alexey Bukhtiyarov
Max Ryabinin
Lucile Saulnier
Quentin Lhoest
...
Denis Mazur
Ilia Kobelev
Yacine Jernite
Thomas Wolf
Gennady Pekhimenko
FedML
137
59
0
18 Jun 2021
BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
Elad Ben-Zaken
Shauli Ravfogel
Yoav Goldberg
313
1,250
0
18 Jun 2021
Self-supervised Incremental Deep Graph Learning for Ethereum Phishing Scam Detection
Shucheng Li
Fengyuan Xu
Runchuan Wang
Sheng Zhong
GNN
72
21
0
18 Jun 2021
All You Can Embed: Natural Language based Vehicle Retrieval with Spatio-Temporal Transformers
Carmelo Scribano
D. Sapienza
Giorgia Franchini
M. Verucchi
Marko Bertogna
58
4
0
18 Jun 2021
Graph Context Encoder: Graph Feature Inpainting for Graph Generation and Self-supervised Pretraining
Oriel Frigo
Rémy Brossard
David Dehaene
55
1
0
18 Jun 2021
Label prompt for multi-label text classification
Rui Song
Xingbing Chen
Zelong Liu
Haining An
Zhiqi Zhang
Xiaoguang Wang
Hao Xu
VLM
60
4
0
18 Jun 2021
Weakly Supervised Pre-Training for Multi-Hop Retriever
Yeon Seonwoo
Sang-Woo Lee
Ji-Hoon Kim
Jung-Woo Ha
Alice Oh
RALM
82
8
0
18 Jun 2021
Graph-based Joint Pandemic Concern and Relation Extraction on Twitter
Jingli Shi
Weihua Li
Sira Yongchareon
Yi Yang
Quan-wei Bai
43
8
0
18 Jun 2021
Bad Characters: Imperceptible NLP Attacks
Nicholas Boucher
Ilia Shumailov
Ross J. Anderson
Nicolas Papernot
AAML
SILM
116
107
0
18 Jun 2021
PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction
Heng Zheng
Rui Wen
Xi Chen
Yifan Yang
Yunyan Zhang
Ziheng Zhang
Ningyu Zhang
Bin Qin
Ming Xu
Yefeng Zheng
103
215
0
18 Jun 2021
Anomaly Detection in Dynamic Graphs via Transformer
Yixin Liu
Shirui Pan
Yu Guang Wang
Fei Xiong
Liang Wang
Qingfeng Chen
V. C. Lee
76
98
0
18 Jun 2021
Deep reinforcement learning with automated label extraction from clinical reports accurately classifies 3D MRI brain volumes
J. Stember
H. Shalu
52
23
0
17 Jun 2021
LNN-EL: A Neuro-Symbolic Approach to Short-text Entity Linking
Hang Jiang
Sairam Gurajada
Qiuhao Lu
S. Neelam
Lucian Popa
Prithviraj Sen
Yunyao Li
Alexander G. Gray
55
25
0
17 Jun 2021
Multi-Task Learning and Adapted Knowledge Models for Emotion-Cause Extraction
Elsbeth Turcan
Shuai Wang
Rishita Anubhai
Kasturi Bhattacharjee
Yaser Al-Onaizan
Smaranda Muresan
76
37
0
17 Jun 2021
Efficient Self-supervised Vision Transformers for Representation Learning
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
116
214
0
17 Jun 2021
An Information Retrieval Approach to Building Datasets for Hate Speech Detection
Md. Mustafizur Rahman
Dinesh Balakrishnan
Dhiraj Murthy
Mucahid Kutlu
Matthew Lease
72
24
0
17 Jun 2021
Multi-mode Transformer Transducer with Stochastic Future Context
Kwangyoun Kim
Felix Wu
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
71
10
0
17 Jun 2021
Scientific Language Models for Biomedical Knowledge Base Completion: An Empirical Study
Rahul Nadkarni
David Wadden
Iz Beltagy
Noah A. Smith
Hannaneh Hajishirzi
Tom Hope
88
28
0
17 Jun 2021
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
887
10,661
0
17 Jun 2021
Multi-head or Single-head? An Empirical Comparison for Transformer Training
Liyuan Liu
Jialu Liu
Jiawei Han
71
33
0
17 Jun 2021
Learning Knowledge Graph-based World Models of Textual Environments
Prithviraj Ammanabrolu
Mark O. Riedl
3DV
102
32
0
17 Jun 2021
Classifying vaccine sentiment tweets by modelling domain-specific representation and commonsense knowledge into context-aware attentive GRU
Usman Naseem
Matloob Khushi
Jinman Kim
A. Dunn
54
12
0
17 Jun 2021
Modeling Worlds in Text
Prithviraj Ammanabrolu
Mark O. Riedl
VGen
LM&Ro
63
14
0
17 Jun 2021
Large-Scale Chemical Language Representations Capture Molecular Structure and Properties
Jerret Ross
Brian M. Belgodere
Vijil Chenthamarakshan
Inkit Padhi
Youssef Mroueh
Payel Das
AI4CE
91
305
0
17 Jun 2021
DravidianCodeMix: Sentiment Analysis and Offensive Language Identification Dataset for Dravidian Languages in Code-Mixed Text
Bharathi Raja Chakravarthi
R. Priyadharshini
Vigneshwaran Muralidaran
Navya Jose
Shardul Suryawanshi
E. Sherly
John P. McCrae
64
107
0
17 Jun 2021
DocNLI: A Large-scale Dataset for Document-level Natural Language Inference
Wenpeng Yin
Dragomir R. Radev
Caiming Xiong
HILM
88
98
0
17 Jun 2021
Unsupervised Path Representation Learning with Curriculum Negative Sampling
Sean Bin Yang
Chenjuan Guo
Jilin Hu
Jiangtao Tang
Bin Yang
SSL
64
52
0
17 Jun 2021
X-FACT: A New Benchmark Dataset for Multilingual Fact Checking
Ashim Gupta
Vivek Srikumar
HILM
105
100
0
17 Jun 2021
Denoising Distantly Supervised Named Entity Recognition via a Hypergeometric Probabilistic Model
Wenkai Zhang
Hongyu Lin
Xianpei Han
Le Sun
Huidan Liu
Zhicheng Wei
N. Yuan
84
13
0
17 Jun 2021
Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases
Boxi Cao
Hongyu Lin
Xianpei Han
Le Sun
Lingyong Yan
M. Liao
Tong Xue
Jin Xu
82
136
0
17 Jun 2021