BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,708 papers shown
Title
An Efficient DP-SGD Mechanism for Large Scale NLP Models
Christophe Dupuy
Radhika Arava
Rahul Gupta
Anna Rumshisky
SyDa
99
36
0
14 Jul 2021
DeepMutants: Training neural bug detectors with contextual mutations
Cedric Richter
Heike Wehrheim
98
3
0
14 Jul 2021
Self-Supervised Multi-Modal Alignment for Whole Body Medical Imaging
Rhydian Windsor
A. Jamaludin
T. Kadir
Andrew Zisserman
74
16
0
14 Jul 2021
Transformer with Peak Suppression and Knowledge Guidance for Fine-grained Image Recognition
Xinda Liu
Lili Wang
Xiaoguang Han
ViT
104
70
0
14 Jul 2021
Serialized Multi-Layer Multi-Head Attention for Neural Speaker Embedding
Hongning Zhu
Kong Aik Lee
Haizhou Li
78
15
0
14 Jul 2021
From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text
Ishan Tarunesh
Syamantak Kumar
Preethi Jyothi
98
46
0
14 Jul 2021
Model-Parallel Model Selection for Deep Learning Systems
Kabir Nagrecha
94
17
0
14 Jul 2021
Using BERT Encoding to Tackle the Mad-lib Attack in SMS Spam Detection
S. R. Galeano
79
18
0
13 Jul 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Joey Tianyi Zhou
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP, VLM, MLLM
280
412
0
13 Jul 2021
A Dialogue-based Information Extraction System for Medical Insurance Assessment
Shuang Peng
Mengdi Zhou
Minghui Yang
Haitao Mi
Shaosheng Cao
Zujie Wen
Teng Xu
Hongbin Wang
Lei Liu
57
4
0
13 Jul 2021
FairyTailor: A Multimodal Generative Framework for Storytelling
Eden Bensaid
Mauro Martino
Benjamin Hoover
Hendrik Strobelt
LRM
77
20
0
13 Jul 2021
Visual Parser: Representing Part-whole Hierarchies with Transformers
Shuyang Sun
Xiaoyu Yue
S. Bai
Philip Torr
128
27
0
13 Jul 2021
Combiner: Full Attention Transformer with Sparse Computation Cost
Hongyu Ren
H. Dai
Zihang Dai
Mengjiao Yang
J. Leskovec
Dale Schuurmans
Bo Dai
180
80
0
12 Jul 2021
SoftHebb: Bayesian Inference in Unsupervised Hebbian Soft Winner-Take-All Networks
Timoleon Moraitis
Dmitry Toichkin
Adrien Journé
Yansong Chua
Qinghai Guo
AAML, BDL
176
29
0
12 Jul 2021
Revisiting Uncertainty-based Query Strategies for Active Learning with Transformers
Christopher Schröder
A. Niekler
Martin Potthast
102
81
0
12 Jul 2021
Accenture at CheckThat! 2021: Interesting claim identification and ranking with contextually sensitive lexical training data augmentation
Evan Williams
Paul Rodrigues
Sieu Tran
227
20
0
12 Jul 2021
Codified audio language modeling learns useful representations for music information retrieval
Rodrigo Castellon
Chris Donahue
Percy Liang
148
91
0
12 Jul 2021
End-to-end Multi-modal Video Temporal Grounding
Yi-Wen Chen
Yi-Hsuan Tsai
Ming-Hsuan Yang
78
51
0
12 Jul 2021
A Persistent Spatial Semantic Representation for High-level Natural Language Instruction Execution
Valts Blukis
Chris Paxton
Dieter Fox
Animesh Garg
Yoav Artzi
LM&Ro
279
139
0
12 Jul 2021
Denoising User-aware Memory Network for Recommendation
Zhiwei Bian
Shaojun Zhou
Hao Xu
Qihong Yang
Zhenqi Sun
Junjie Tang
Guiquan Liu
Kaikui Liu
Xiaolong Li
HAI, AI4TS
108
36
0
12 Jul 2021
Trustworthy AI: A Computational Perspective
Haochen Liu
Yiqi Wang
Wenqi Fan
Xiaorui Liu
Yaxin Li
Shaili Jain
Yunhao Liu
Anil K. Jain
Jiliang Tang
FaML
201
213
0
12 Jul 2021
CoBERL: Contrastive BERT for Reinforcement Learning
Andrea Banino
Adrià Puigdomènech Badia
Jacob Walker
Tim Scholtes
Jovana Mitrović
Charles Blundell
OffRL
88
36
0
12 Jul 2021
MECT: Multi-Metadata Embedding based Cross-Transformer for Chinese Named Entity Recognition
Shuang Wu
Xiaoning Song
Zhenhua Feng
84
117
0
12 Jul 2021
Hate versus Politics: Detection of Hate against Policy makers in Italian tweets
Armend Duzha
Cristiano Casadei
Michael Tosi
Fabio Celli
71
6
0
12 Jul 2021
Zero-shot Visual Question Answering using Knowledge Graph
Zhuo Chen
Jiaoyan Chen
Yuxia Geng
Jeff Z. Pan
Zonggang Yuan
Huajun Chen
87
71
0
12 Jul 2021
BERT-like Pre-training for Symbolic Piano Music Classification Tasks
Yi-Hui Chou
I-Chun Chen
Chin-Jui Chang
Joann Ching
Yi-Hsuan Yang
102
25
0
12 Jul 2021
Split, embed and merge: An accurate table structure recognizer
Zhenrong Zhang
Jianshu Zhang
Jun Du
LMTD
187
62
0
12 Jul 2021
Sliding Spectrum Decomposition for Diversified Recommendation
Yanhua Huang
Weikun Wang
Lei Zhang
Ruiwen Xu
73
43
0
12 Jul 2021
Delta Sampling R-BERT for limited data and low-light action recognition
Sanchit Hira
Ritwik Das
Abhinav Modi
D. Pakhomov
111
17
0
12 Jul 2021
Legal Judgment Prediction with Multi-Stage Case Representation Learning in the Real Court Setting
Luyao Ma
Yating Zhang
Tianyi Wang
Xiaozhong Liu
Wei Ye
Changlong Sun
Shikun Zhang
ELM, AILaw
99
59
0
12 Jul 2021
TransClaw U-Net: Claw U-Net with Transformers for Medical Image Segmentation
Yao Chang
Menghan Hu
Zhai Guangtao
Xiao-Ping Zhang
MedIm, ViT
129
100
0
12 Jul 2021
A Systematic Literature Review of Automated ICD Coding and Classification Systems using Discharge Summaries
R. Kaur
J. A. Ginige
O. Obst
74
23
0
12 Jul 2021
Transformers with multi-modal features and post-fusion context for e-commerce session-based recommendation
Gabriel de Souza P. Moreira
Sara Rabhi
Ronay Ak
Md Yasin Kabir
Even Oldridge
64
28
0
11 Jul 2021
"A Virus Has No Religion": Analyzing Islamophobia on Twitter During the COVID-19 Outbreak
Mohit Chandra
Manvith Reddy
Shradha Sehgal
Saurabh Gupta
Arun Balaji Buduru
Ponnurangam Kumaraguru
79
42
0
11 Jul 2021
Improving Low-resource Reading Comprehension via Cross-lingual Transposition Rethinking
Gaochen Wu
Bin Xu
Yuxin Qin
Fei Kong
Bangchang Liu
Hongwen Zhao
Dejie Chang
134
3
0
11 Jul 2021
Noise Stability Regularization for Improving BERT Fine-tuning
Hang Hua
Xingjian Li
Dejing Dou
Chengzhong Xu
Jiebo Luo
79
45
0
10 Jul 2021
Variational Information Bottleneck for Effective Low-resource Audio Classification
Shijing Si
Jianzong Wang
Huiming Sun
Jianhan Wu
Chuan Zhang
Xiaoyang Qu
Ning Cheng
Lei Chen
Jing Xiao
38
13
0
10 Jul 2021
Local-to-Global Self-Attention in Vision Transformers
Jinpeng Li
Yichao Yan
Tianran Ouyang
Xiaokang Yang
Ling Shao
ViT
64
29
0
10 Jul 2021
Layer-wise Analysis of a Self-supervised Speech Representation Model
Ankita Pasad
Ju-Chieh Chou
Karen Livescu
SSL
156
308
0
10 Jul 2021
Transformer-Based Behavioral Representation Learning Enables Transfer Learning for Mobile Sensing in Small Datasets
Michael Merrill
Tim Althoff
AI4TS, MU, MedIm
58
5
0
09 Jul 2021
An Initial Investigation of Non-Native Spoken Question-Answering
V. Raina
Mark Gales
66
1
0
09 Jul 2021
Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization
John Miller
Rohan Taori
Aditi Raghunathan
Shiori Sagawa
Pang Wei Koh
Vaishaal Shankar
Percy Liang
Y. Carmon
Ludwig Schmidt
OODD, OOD
173
278
0
09 Jul 2021
ViTGAN: Training GANs with Vision Transformers
Kwonjoon Lee
Huiwen Chang
Lu Jiang
Han Zhang
Zhuowen Tu
Ce Liu
ViT
101
186
0
09 Jul 2021
Can Deep Neural Networks Predict Data Correlations from Column Names?
Immanuel Trummer
81
8
0
09 Jul 2021
Redescription Model Mining
Felix I. Stamm
Martin Becker
M. Strohmaier
Florian Lemmerich
26
1
0
09 Jul 2021
Form2Seq : A Framework for Higher-Order Form Structure Extraction
Milan Aggarwal
Hiresh Gupta
Mausoom Sarkar
Balaji Krishnamurthy
3DV
68
24
0
09 Jul 2021
Bib2Auth: Deep Learning Approach for Author Disambiguation using Bibliographic Data
Zeyd Boukhers
N. Bahubali
Abinaya Chandrasekaran
Adarsh Anand
Soniya Manchenahalli Gnanendra Prasad
Sriram Aralappa
55
2
0
09 Jul 2021
Benchmarking for Biomedical Natural Language Processing Tasks with a Domain Specific ALBERT
Usman Naseem
A. Dunn
Matloob Khushi
Jinman Kim
OOD, LM&MA, AI4MH
94
43
0
09 Jul 2021
ABD-Net: Attention Based Decomposition Network for 3D Point Cloud Decomposition
Siddharth Katageri
S. V. Kudari
Akshaykumar Gunari
R. Tabib
U. Mudenagudi
3DPC
60
5
0
09 Jul 2021
UniRE: A Unified Label Space for Entity Relation Extraction
Yijun Wang
Changzhi Sun
Yuanbin Wu
Hao Zhou
Lei Li
Junchi Yan
82
116
0
09 Jul 2021