1810.04805
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 23,708 papers shown
Title
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA
Zewen Chi
Shaohan Huang
Li Dong
Shuming Ma
Bo Zheng
...
Payal Bajaj
Xia Song
Xian-Ling Mao
Heyan Huang
Furu Wei
123
121
0
30 Jun 2021
ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information
Zijun Sun
Xiaoya Li
Xiaofei Sun
Yuxian Meng
Xiang Ao
Qing He
Leilei Gan
Jiwei Li
SSeg
152
191
0
30 Jun 2021
A Generative Model for Raw Audio Using Transformer Architectures
Prateek Verma
C. Chafe
89
29
0
30 Jun 2021
AutoLAW: Augmented Legal Reasoning through Legal Precedent Prediction
Robert Mahari
ELM
AILaw
30
18
0
30 Jun 2021
ResViT: Residual vision transformers for multi-modal medical image synthesis
Onat Dalmaz
Mahmut Yurt
Tolga Çukur
ViT
MedIm
107
354
0
30 Jun 2021
Zero-Shot Estimation of Base Models' Weights in Ensemble of Machine Reading Comprehension Systems for Robust Generalization
Razieh Baradaran
Hossein Amirkhani
54
1
0
30 Jun 2021
Improving the Efficiency of Transformers for Resource-Constrained Devices
Hamid Tabani
Ajay Balasubramaniam
Shabbir Marzban
Elahe Arani
Bahram Zonooz
98
23
0
30 Jun 2021
What can linear interpolation of neural network loss landscapes tell us?
Tiffany J. Vlaar
Jonathan Frankle
MoMe
78
28
0
30 Jun 2021
On joint training with interfaces for spoken language understanding
A. Raju
Milind Rao
Gautam Tiwari
Pranav Dheram
Bryan Anderson
Zhe Zhang
Chul Lee
Bach Bui
Ariya Rastrow
VLM
55
11
0
30 Jun 2021
Whose Opinions Matter? Perspective-aware Models to Identify Opinions of Hate Speech Victims in Abusive Language Detection
S. Akhtar
Valerio Basile
V. Patti
72
61
0
30 Jun 2021
Incorporating Domain Knowledge for Extractive Summarization of Legal Case Documents
Paheli Bhattacharya
Soham Poddar
Koustav Rudra
Kripabandhu Ghosh
Saptarshi Ghosh
ELM
AILaw
77
70
0
30 Jun 2021
HySPA: Hybrid Span Generation for Scalable Text-to-Graph Extraction
Liliang Ren
Chenkai Sun
Heng Ji
Julia Hockenmaier
80
14
0
30 Jun 2021
Looking Outside the Window: Wide-Context Transformer for the Semantic Segmentation of High-Resolution Remote Sensing Images
L. Ding
Dong Lin
Shaofu Lin
Jing Zhang
Xiaojie Cui
Yuebin Wang
Hao Tang
Lorenzo Bruzzone
ViT
154
101
0
29 Jun 2021
Unified Questioner Transformer for Descriptive Question Generation in Goal-Oriented Visual Dialogue
Shoya Matsumori
Kosuke Shingyouchi
Yukikoko Abe
Yosuke Fukuchi
K. Sugiura
M. Imai
99
16
0
29 Jun 2021
Distributed Matrix Tiling Using A Hypergraph Labeling Formulation
Avah Banerjee
Guoli Ding
Maxwell Reeser
16
0
0
29 Jun 2021
On the Interaction of Belief Bias and Explanations
Ana Valeria González
Anna Rogers
Anders Søgaard
FAtt
87
19
0
29 Jun 2021
New Arabic Medical Dataset for Diseases Classification
Jaafar Hammoud
A. Vatian
N. Dobrenko
N. Vedernikov
A. Shalyto
N. Gusarova
OOD
73
6
0
29 Jun 2021
Exploring the Efficacy of Automatically Generated Counterfactuals for Sentiment Analysis
Linyi Yang
Jiazheng Li
Padraig Cunningham
Yue Zhang
Barry Smyth
Ruihai Dong
92
48
0
29 Jun 2021
Learning from Miscellaneous Other-Class Words for Few-shot Named Entity Recognition
Meihan Tong
Shuai Wang
Bin Xu
Yixin Cao
Minghui Liu
Lei Hou
Juan-Zi Li
111
53
0
29 Jun 2021
SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption
Dara Bahri
Heinrich Jiang
Yi Tay
Donald Metzler
SSL
79
178
0
29 Jun 2021
Neural Machine Translation for Low-Resource Languages: A Survey
Surangika Ranathunga
E. Lee
Marjana Prifti Skenduli
Ravi Shekhar
Mehreen Alam
Rishemjit Kaur
129
251
0
29 Jun 2021
Time-Aware Language Models as Temporal Knowledge Bases
Bhuwan Dhingra
Jeremy R. Cole
Julian Martin Eisenschlos
D. Gillick
Jacob Eisenstein
William W. Cohen
KELM
152
282
0
29 Jun 2021
Automatic Construction of Enterprise Knowledge Base
Junyi Chai
Yujie He
H. Hashemi
Bing Li
Daraksha Parveen
Ranganath Kondapally
Wenjin Xu
45
5
0
29 Jun 2021
Latent Execution for Neural Program Synthesis
Xinyun Chen
Basel Alomair
Yuandong Tian
NAI
127
53
0
29 Jun 2021
Benchmarking Knowledge-driven Zero-shot Learning
Yuxia Geng
Jiaoyan Chen
Zhuang Xiang
Zhuo Chen
Jeff Z. Pan
Juan Li
Zonggang Yuan
Huajun Chen
VLM
103
19
0
29 Jun 2021
On component interactions in two-stage recommender systems
Jiri Hron
K. Krauth
Michael I. Jordan
Niki Kilbertus
CML
LRM
76
31
0
28 Jun 2021
Early Convolutions Help Transformers See Better
Tete Xiao
Mannat Singh
Eric Mintun
Trevor Darrell
Piotr Dollár
Ross B. Girshick
120
778
0
28 Jun 2021
Training Massive Deep Neural Networks in a Smart Contract: A New Hope
Yin Yang
32
2
0
28 Jun 2021
TENT: Tensorized Encoder Transformer for Temperature Forecasting
Onur Bilgin
Paweł Mąka
Thomas Vergutz
S. Mehrkanoon
AI4TS
71
13
0
28 Jun 2021
Knowledge Transfer by Discriminative Pre-training for Academic Performance Prediction
Byungsoo Kim
Hangyeol Yu
Dongmin Shin
Youngduck Choi
39
1
0
28 Jun 2021
Enhancing the Generalization for Intent Classification and Out-of-Domain Detection in SLU
Yilin Shen
Yen-Chang Hsu
Avik Ray
Hongxia Jin
62
41
0
28 Jun 2021
RadGraph: Extracting Clinical Entities and Relations from Radiology Reports
Saahil Jain
Ashwin Agrawal
A. Saporta
Steven QH Truong
D. Duong
...
Yuhao Zhang
M. Lungren
A. Ng
C. Langlotz
Pranav Rajpurkar
MedIm
104
214
0
28 Jun 2021
R-Drop: Regularized Dropout for Neural Networks
Xiaobo Liang
Lijun Wu
Juntao Li
Yue Wang
Qi Meng
Tao Qin
Wei Chen
Hao Fei
Tie-Yan Liu
90
442
0
28 Jun 2021
Traditional Machine Learning and Deep Learning Models for Argumentation Mining in Russian Texts
Irina Fishcheva
Valeriya Goloviznina
Evgeny Kotelnikov
59
9
0
28 Jun 2021
Current Landscape of the Russian Sentiment Corpora
Evgeny Kotelnikov
76
4
0
28 Jun 2021
Modelling Monotonic and Non-Monotonic Attribute Dependencies with Embeddings: A Theoretical Analysis
Steven Schockaert
51
2
0
28 Jun 2021
Political Ideology and Polarization of Policy Positions: A Multi-dimensional Approach
Barea M. Sinno
Bernardo Oviedo
Katherine Atwell
Malihe Alikhani
Junjie Li
50
3
0
28 Jun 2021
A Span-Based Model for Joint Overlapped and Discontinuous Named Entity Recognition
Fei Li
Zhichao Lin
Meishan Zhang
Donghong Ji
40
98
0
28 Jun 2021
Word2Box: Capturing Set-Theoretic Semantics of Words using Box Embeddings
S. Dasgupta
Michael Boratko
Siddhartha Mishra
Shriya Atmakuri
Dhruvesh Patel
Xiang Lorraine Li
Andrew McCallum
NAI
76
21
0
28 Jun 2021
A Closer Look at How Fine-tuning Changes BERT
Yichu Zhou
Vivek Srikumar
84
68
0
27 Jun 2021
Deep Learning for Technical Document Classification
Shuo Jiang
Jie Hu
C. Magee
Jianxi Luo
95
44
0
27 Jun 2021
Pairing Conceptual Modeling with Machine Learning
W. Maass
V. Storey
HAI
77
36
0
27 Jun 2021
Open, Sesame! Introducing Access Control to Voice Services
Dominika Woszczyk
Alvin Lee
Soteris Demetriou
AAML
28
1
0
27 Jun 2021
Persian Causality Corpus (PerCause) and the Causality Detection Benchmark
Zeinab Rahimi
M. Shamsfard
31
1
0
27 Jun 2021
A Cascade Dual-Decoder Model for Joint Entity and Relation Extraction
Jian Cheng
Tian Zhang
Shuang Zhang
Huimin Ren
Guo-Ding Yu
Xiliang Zhang
Shangce Gao
Lianbo Ma
74
16
0
27 Jun 2021
Analyzing Research Trends in Inorganic Materials Literature Using NLP
Fusataka Kuniyoshi
Jun Ozawa
Makoto Miwa
47
7
0
27 Jun 2021
Post-Training Quantization for Vision Transformer
Zhenhua Liu
Yunhe Wang
Kai Han
Siwei Ma
Wen Gao
ViT
MQ
143
348
0
27 Jun 2021
Visual Conceptual Blending with Large-scale Language and Vision Models
Songwei Ge
Devi Parikh
VLM
DiffM
69
14
0
27 Jun 2021
Time-Series Representation Learning via Temporal and Contextual Contrasting
Emadeldeen Eldele
Mohamed Ragab
Zhenghua Chen
Min-man Wu
C. Kwoh
Xiaoli Li
Cuntai Guan
AI4TS
104
518
0
26 Jun 2021
Interflow: Aggregating Multi-layer Feature Mappings with Attention Mechanism
Zhicheng Cai
37
1
0
26 Jun 2021