1810.04805
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 23,641 papers shown
Title
BERTGEN: Multi-task Generation through BERT
Faidon Mitzalis
Ozan Caglayan
Pranava Madhyastha
Lucia Specia
VLM
48
7
0
07 Jun 2021
Relative Importance in Sentence Processing
Nora Hollenstein
Lisa Beinborn
FAtt
82
32
0
07 Jun 2021
Never guess what I heard... Rumor Detection in Finnish News: a Dataset and a Baseline
Mika Hämäläinen
Khalid Alnajjar
N. Partanen
Jack Rueter
52
7
0
07 Jun 2021
LAWDR: Language-Agnostic Weighted Document Representations from Pre-trained Models
Hongyu Gong
Vishrav Chaudhary
Yuqing Tang
Francisco Guzmán
37
3
0
07 Jun 2021
A Globally Normalized Neural Model for Semantic Parsing
Chenyang Huang
Wei Yang
Yanshuai Cao
Osmar Zaïane
Lili Mou
75
3
0
07 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
221
342
0
07 Jun 2021
SelfDoc: Self-Supervised Document Representation Learning
Peizhao Li
Jiuxiang Gu
Jason Kuen
Vlad I. Morariu
Handong Zhao
R. Jain
Varun Manjunatha
Hongfu Liu
ViT
SSL
87
162
0
07 Jun 2021
Video Instance Segmentation using Inter-Frame Communication Transformers
Sukjun Hwang
Miran Heo
Seoung Wug Oh
Seon Joo Kim
ViT
136
139
0
07 Jun 2021
Meta-learning for downstream aware and agnostic pretraining
Hongyin Luo
Shuyan Dong
Yung-Sung Chuang
Shang-Wen Li
67
0
0
06 Jun 2021
Tabular Data: Deep Learning is Not All You Need
Ravid Shwartz-Ziv
Amitai Armon
LMTD
205
1,304
0
06 Jun 2021
Let's be explicit about that: Distant supervision for implicit discourse relation classification via connective prediction
Murathan Kurfali
Robert Östling
50
19
0
06 Jun 2021
Transient Chaos in BERT
Katsuma Inoue
Soh Ohara
Yasuo Kuniyoshi
Kohei Nakajima
72
3
0
06 Jun 2021
On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation
Ruidan He
Linlin Liu
Hai Ye
Qingyu Tan
Bosheng Ding
Liying Cheng
Jia-Wei Low
Lidong Bing
Luo Si
65
205
0
06 Jun 2021
Oriented Object Detection with Transformer
Teli Ma
Mingyuan Mao
Honghui Zheng
Peng Gao
Xiaodi Wang
Shumin Han
Errui Ding
Baochang Zhang
David Doermann
ViT
59
44
0
06 Jun 2021
Enhancing Label Correlation Feedback in Multi-Label Text Classification via Multi-Task Learning
Ximing Zhang
Qian-Wen Zhang
Zhao Yan
Ruifang Liu
Yunbo Cao
104
49
0
06 Jun 2021
TabularNet: A Neural Network Architecture for Understanding Semantic Structures of Tabular Data
Lun Du
Fei Gao
Xu Chen
Ran Jia
Junshan Wang
Jiang Zhang
Shi Han
Dongmei Zhang
LMTD
76
81
0
06 Jun 2021
Referring Transformer: A One-step Approach to Multi-task Visual Grounding
Muchen Li
Leonid Sigal
ObjD
119
197
0
06 Jun 2021
Semantic-Enhanced Explainable Finetuning for Open-Domain Dialogues
Yinhe Zheng
Yida Wang
Pei Ke
Zhenyu Yang
Minlie Huang
78
4
0
06 Jun 2021
How Did This Get Funded?! Automatically Identifying Quirky Scientific Achievements
Chen Shani
Nadav Borenstein
Dafna Shahaf
55
4
0
06 Jun 2021
Empowering Language Understanding with Counterfactual Reasoning
Fuli Feng
Jizhi Zhang
Xiangnan He
Hanwang Zhang
Tat-Seng Chua
LRM
77
34
0
06 Jun 2021
Emotion-aware Chat Machine: Automatic Emotional Response Generation for Human-like Emotional Interaction
Wei Wei
Jiayi Liu
Xian-Ling Mao
G. Guo
Feida Zhu
Pan Zhou
Yuchong Hu
102
56
0
06 Jun 2021
Embracing Ambiguity: Shifting the Training Target of NLI Models
Johannes Mario Meissner
Napat Thumwanit
Saku Sugawara
Akiko Aizawa
77
22
0
06 Jun 2021
Exploring the Limits of Out-of-Distribution Detection
Stanislav Fort
Jie Jessie Ren
Balaji Lakshminarayanan
113
342
0
06 Jun 2021
Causal Abstractions of Neural Networks
Atticus Geiger
Hanson Lu
Thomas Icard
Christopher Potts
NAI
CML
80
246
0
06 Jun 2021
Enhancing Taxonomy Completion with Concept Generation via Fusing Relational Representations
Qingkai Zeng
Jinfeng Lin
Wenhao Yu
J. Cleland-Huang
Meng Jiang
71
46
0
05 Jun 2021
Meta-Learning with Variational Semantic Memory for Word Sense Disambiguation
Yingjun Du
Nithin Holla
Xiantong Zhen
Cees G. M. Snoek
Ekaterina Shutova
80
9
0
05 Jun 2021
BERTnesia: Investigating the capture and forgetting of knowledge in BERT
Jonas Wallat
Jaspreet Singh
Avishek Anand
CLL
KELM
148
60
0
05 Jun 2021
Patch Slimming for Efficient Vision Transformers
Yehui Tang
Kai Han
Yunhe Wang
Chang Xu
Jianyuan Guo
Chao Xu
Dacheng Tao
ViT
124
168
0
05 Jun 2021
MergeDistill: Merging Pre-trained Language Models using Distillation
Simran Khanuja
Melvin Johnson
Partha P. Talukdar
84
16
0
05 Jun 2021
Lifelong Learning of Hate Speech Classification on Social Media
Jing Qian
Hong Wang
Mai Elsherief
Xifeng Yan
CLL
57
24
0
05 Jun 2021
Learnable Fourier Features for Multi-Dimensional Spatial Positional Encoding
Yang Li
Si Si
Gang Li
Cho-Jui Hsieh
Samy Bengio
105
96
0
05 Jun 2021
Weakly-Supervised Methods for Suicide Risk Assessment: Role of Related Domains
Chenghao Yang
Yudong Zhang
Smaranda Muresan
AI4MH
65
5
0
05 Jun 2021
Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis-Hastings
Kartik Goyal
Chris Dyer
Taylor Berg-Kirkpatrick
178
51
0
04 Jun 2021
MultiOpEd: A Corpus of Multi-Perspective News Editorials
Siyi Liu
Sihao Chen
Xander Uyttendaele
Dan Roth
41
13
0
04 Jun 2021
The R-U-A-Robot Dataset: Helping Avoid Chatbot Deception by Detecting User Questions About Human or Non-Human Identity
David Gros
Yu Li
Zhou Yu
DeLMO
74
20
0
04 Jun 2021
RegionViT: Regional-to-Local Attention for Vision Transformers
Chun-Fu Chen
Yikang Shen
Quanfu Fan
ViT
148
200
0
04 Jun 2021
Associating Objects with Transformers for Video Object Segmentation
Zongxin Yang
Yunchao Wei
Yi Yang
142
298
0
04 Jun 2021
MERLOT: Multimodal Neural Script Knowledge Models
Rowan Zellers
Ximing Lu
Jack Hessel
Youngjae Yu
J. S. Park
Jize Cao
Ali Farhadi
Yejin Choi
VLM
LRM
110
384
0
04 Jun 2021
Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning
Jannik Kossen
Neil Band
Clare Lyle
Aidan Gomez
Tom Rainforth
Y. Gal
OOD
3DPC
133
142
0
04 Jun 2021
BERT-Based Sentiment Analysis: A Software Engineering Perspective
Himanshu Batra
Narinder Singh Punn
S. K. Sonbhadra
Sonali Agarwal
110
36
0
04 Jun 2021
Neural semi-Markov CRF for Monolingual Word Alignment
Wuwei Lan
Chao Jiang
Wei Xu
BDL
75
18
0
04 Jun 2021
Great Service! Fine-grained Parsing of Implicit Arguments
Ruixiang Cui
Daniel Hershcovich
68
2
0
04 Jun 2021
Do Syntactic Probes Probe Syntax? Experiments with Jabberwocky Probing
Rowan Hall Maudslay
Ryan Cotterell
95
34
0
04 Jun 2021
CLIP: A Dataset for Extracting Action Items for Physicians from Hospital Discharge Notes
J. Mullenbach
Yada Pruksachatkun
Sean Adler
Jennifer Seale
Jordan Swartz
T. McKelvey
Hui Dai
Yi Yang
David Sontag
76
16
0
04 Jun 2021
The Image Local Autoregressive Transformer
Chenjie Cao
Yue Hong
Xiang Li
Chengrong Wang
C. Xu
Xiangyang Xue
Yanwei Fu
82
13
0
04 Jun 2021
You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient
Shaokun Zhang
Xiawu Zheng
Chenyi Yang
Yuchao Li
Yan Wang
Yong Li
Mengdi Wang
Shen Li
Jun Yang
Rongrong Ji
MQ
97
23
0
04 Jun 2021
Entity Concept-enhanced Few-shot Relation Extraction
Shan Yang
Yongfei Zhang
Guanglin Niu
Qinghua Zhao
Shiliang Pu
78
69
0
04 Jun 2021
Prediction or Comparison: Toward Interpretable Qualitative Reasoning
Mucheng Ren
Heyan Huang
Yang Gao
LRM
36
0
0
04 Jun 2021
Annotation Curricula to Implicitly Train Non-Expert Annotators
Ji-Ung Lee
Jan-Christoph Klie
Iryna Gurevych
76
11
0
04 Jun 2021
Learning Slice-Aware Representations with Mixture of Attentions
Cheng Wang
Sungjin Lee
Sunghyun Park
Han Li
Young-Bum Kim
R. Sarikaya
71
2
0
04 Jun 2021