BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM SSL SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,708 papers shown
Title
Czert -- Czech BERT-like Model for Language Representation
Jakub Sido
O. Pražák
P. Přibáň
Jan Pasek
Michal Seják
Miloslav Konopík
79
44
0
24 Mar 2021
AutoMix: Unveiling the Power of Mixup for Stronger Classifiers
Zicheng Liu
Siyuan Li
Di Wu
Jianzhu Guo
Zhiyuan Chen
Lirong Wu
Stan Z. Li
121
78
0
24 Mar 2021
Can Vision Transformers Learn without Natural Images?
Kodai Nakashima
Hirokatsu Kataoka
Asato Matsumoto
K. Iwata
Nakamasa Inoue
ViT
57
34
0
24 Mar 2021
deGraphCS: Embedding Variable-based Flow Graph for Neural Code Search
Chen Zeng
Yue Yu
Shanshan Li
Xin Xia
Zhiming Wang
Mingyang Geng
Linxiao Bai
Wei Dong
Xiangke Liao
GNN
87
38
0
24 Mar 2021
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark
Nicholas Lourie
Ronan Le Bras
Chandra Bhagavatula
Yejin Choi
LRM
115
140
0
24 Mar 2021
Relation-aware Instance Refinement for Weakly Supervised Visual Grounding
Yongfei Liu
Bo Wan
Lin Ma
Xuming He
ObjD
94
57
0
24 Mar 2021
VLGrammar: Grounded Grammar Induction of Vision and Language
Yining Hong
Qing Li
Song-Chun Zhu
Siyuan Huang
VLM
89
25
0
24 Mar 2021
Multi-view 3D Reconstruction with Transformer
Dan Wang
Xinrui Cui
Xun Chen
Zhengxia Zou
Tianyang Shi
Septimiu Salcudean
Z. J. Wang
Rabab Ward
ViT
79
90
0
24 Mar 2021
NaturalProofs: Mathematical Theorem Proving in Natural Language
Sean Welleck
Jiacheng Liu
Ronan Le Bras
Hannaneh Hajishirzi
Yejin Choi
Kyunghyun Cho
AIMat
98
69
0
24 Mar 2021
Supporting Clustering with Contrastive Learning
Dejiao Zhang
Feng Nan
Xiaokai Wei
Shang-Wen Li
Henghui Zhu
Kathleen McKeown
Ramesh Nallapati
Andrew O. Arnold
Bing Xiang
SSL
117
204
0
24 Mar 2021
Scene-Intuitive Agent for Remote Embodied Visual Grounding
Xiangru Lin
Guanbin Li
Yizhou Yu
LM&Ro
80
53
0
24 Mar 2021
Region Similarity Representation Learning
Tete Xiao
Colorado Reed
Xiaolong Wang
Kurt Keutzer
Trevor Darrell
VLM SSL
101
118
0
24 Mar 2021
Complex Factoid Question Answering with a Free-Text Knowledge Graph
Chen Zhao
Chenyan Xiong
Xin Qian
Jordan L. Boyd-Graber
82
38
0
23 Mar 2021
The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures
Sushant Singh
A. Mahmood
AI4TS
124
96
0
23 Mar 2021
Variable Name Recovery in Decompiled Binary Code using Constrained Masked Language Modeling
Pratyay Banerjee
Kuntal Kumar Pal
Fish Wang
Chitta Baral
57
13
0
23 Mar 2021
Self-Supervised Pretraining Improves Self-Supervised Pretraining
Colorado Reed
Xiangyu Yue
Aniruddha Nrusimha
Sayna Ebrahimi
Vivek Vijaykumar
...
Shanghang Zhang
Devin Guillory
Sean L. Metzger
Kurt Keutzer
Trevor Darrell
142
108
0
23 Mar 2021
A Pseudo-Metric between Probability Distributions based on Depth-Trimmed Regions
Guillaume Staerman
Pavlo Mozharovskyi
Pierre Colombo
Stéphan Clémençon
Florence d'Alché-Buc
OOD
582
17
0
23 Mar 2021
QuestEval: Summarization Asks for Fact-based Evaluation
Thomas Scialom
Paul-Alexis Dray
Patrick Gallinari
Sylvain Lamprier
Benjamin Piwowarski
Jacopo Staiano
Alex Jinpeng Wang
HILM
82
277
0
23 Mar 2021
How to decay your learning rate
Aitor Lewkowycz
114
24
0
23 Mar 2021
Self-supervised representation learning from 12-lead ECG data
Temesgen Mehari
Nils Strodthoff
SSL
102
143
0
23 Mar 2021
Multilingual Autoregressive Entity Linking
Nicola De Cao
Ledell Yu Wu
Kashyap Popat
Mikel Artetxe
Naman Goyal
Mikhail Plekhanov
Luke Zettlemoyer
Nicola Cancedda
Sebastian Riedel
Fabio Petroni
LRM
101
88
0
23 Mar 2021
Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection
Jan Philip Wahle
Terry Ruas
Norman Meuschke
Bela Gipp
121
34
0
23 Mar 2021
Redefining Absent Keyphrases and their Effect on Retrieval Effectiveness
Florian Boudin
Ygor Gallina
65
16
0
23 Mar 2021
Modeling the Severity of Complaints in Social Media
Mali Jin
Nikolaos Aletras
44
20
0
23 Mar 2021
Detecting Hate Speech with GPT-3
Ke-Li Chiu
Annie Collins
Rohan Alexander
AILaw
112
114
0
23 Mar 2021
Exercise? I thought you said 'Extra Fries': Leveraging Sentence Demarcations and Multi-hop Attention for Meme Affect Analysis
Shraman Pramanick
Md. Shad Akhtar
Tanmoy Chakraborty
85
17
0
23 Mar 2021
TMR: Evaluating NER Recall on Tough Mentions
Jingxuan Tu
Constantine Lignos
61
4
0
23 Mar 2021
SelfExplain: A Self-Explaining Architecture for Neural Text Classifiers
Dheeraj Rajagopal
Vidhisha Balachandran
Eduard H. Hovy
Yulia Tsvetkov
MILM SSL FAtt AI4TS
100
68
0
23 Mar 2021
Multi-Modal Answer Validation for Knowledge-Based VQA
Jialin Wu
Jiasen Lu
Ashish Sabharwal
Roozbeh Mottaghi
186
146
0
23 Mar 2021
Instance-level Image Retrieval using Reranking Transformers
Fuwen Tan
Jiangbo Yuan
Vicente Ordonez
ViT
176
93
0
22 Mar 2021
Tiny Transformers for Environmental Sound Classification at the Edge
David Elliott
Carlos E. Otero
Steven Wyatt
Evan Martino
83
16
0
22 Mar 2021
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets
Julia Kreutzer
Isaac Caswell
Lisa Wang
Ahsan Wahab
D. Esch
...
Duygu Ataman
Orevaoghene Ahia
Oghenefego Ahia
Sweta Agrawal
Mofetoluwa Adeyemi
87
280
0
22 Mar 2021
Open Domain Question Answering over Tables via Dense Retrieval
Jonathan Herzig
Thomas Müller
Syrine Krichene
Julian Martin Eisenschlos
LMTD VLM RALM
123
105
0
22 Mar 2021
Improving and Simplifying Pattern Exploiting Training
Derek Tam
Rakesh R Menon
Joey Tianyi Zhou
Shashank Srivastava
Colin Raffel
89
151
0
22 Mar 2021
BERT: A Review of Applications in Natural Language Processing and Understanding
M. V. Koroteev
VLM
136
226
0
22 Mar 2021
Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for Improved Cross-Modal Retrieval
Gregor Geigle
Jonas Pfeiffer
Nils Reimers
Ivan Vulić
Iryna Gurevych
116
60
0
22 Mar 2021
Identifying Machine-Paraphrased Plagiarism
Jan Philip Wahle
Terry Ruas
Tomáš Foltýnek
Norman Meuschke
Bela Gipp
93
32
0
22 Mar 2021
DeepViT: Towards Deeper Vision Transformer
Daquan Zhou
Bingyi Kang
Xiaojie Jin
Linjie Yang
Xiaochen Lian
Zihang Jiang
Qibin Hou
Jiashi Feng
ViT
179
525
0
22 Mar 2021
MasakhaNER: Named Entity Recognition for African Languages
David Ifeoluwa Adelani
Jade Z. Abbott
Graham Neubig
Daniel D'souza
Julia Kreutzer
...
T. Diop
A. Diallo
Adewale Akinfaderin
T. Marengereke
Salomey Osei
126
195
0
22 Mar 2021
Learning physical properties of anomalous random walks using graph neural networks
Hippolyte Verdier
M. Duval
François Laurent
Alhassan Cassé
Christian L. Vestergaard
Jean-Baptiste Masson
71
25
0
22 Mar 2021
Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking
Ning Wang
Wen-gang Zhou
Jie Wang
Houqiang Li
ViT
110
535
0
22 Mar 2021
Prototypical Representation Learning for Relation Extraction
Ning Ding
Xiaobin Wang
Yao Fu
Guangwei Xu
Rui Wang
Pengjun Xie
Ying Shen
Fei Huang
Haitao Zheng
Rui Zhang
66
60
0
22 Mar 2021
Multimodal Motion Prediction with Stacked Transformers
Yicheng Liu
Jinghuai Zhang
Liangji Fang
Qinhong Jiang
Bolei Zhou
112
365
0
22 Mar 2021
Grey-box Adversarial Attack And Defence For Sentiment Classification
Ying Xu
Xu Zhong
Antonio Jimeno Yepes
Jey Han Lau
VLM AAML
70
54
0
22 Mar 2021
A Large-scale Dataset for Hate Speech Detection on Vietnamese Social Media Texts
Son T. Luu
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
82
35
0
22 Mar 2021
Paying Attention to Activation Maps in Camera Pose Regression
Yoli Shavit
Ron Ferens
Y. Keller
ViT
78
13
0
21 Mar 2021
Exploiting Method Names to Improve Code Summarization: A Deliberation Multi-Task Learning Approach
Rui Xie
Wei Ye
Jinan Sun
Shikun Zhang
65
28
0
21 Mar 2021
L3CubeMahaSent: A Marathi Tweet-based Sentiment Analysis Dataset
Atharva Kulkarni
Meet Mandhane
Manali Likhitkar
G. Kshirsagar
Raviraj Joshi
97
56
0
21 Mar 2021
MaAST: Map Attention with Semantic Transformers for Efficient Visual Navigation
Zachary Seymour
Kowshik Thopalli
Niluthpol Chowdhury Mithun
Han-Pang Chiu
S. Samarasekera
Rakesh Kumar
3DPC
69
18
0
21 Mar 2021
ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques
Yuanxin Liu
Zheng Lin
Fengcheng Yuan
VLM MQ
83
20
0
21 Mar 2021