ResearchTrend.AI

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM SSL SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,688 papers shown
Beyond Nyströmformer -- Approximation of self-attention by Spectral Shifting
Madhusudan Verma
64
1
0
09 Mar 2021
BERTese: Learning to Speak to BERT
Adi Haviv
Jonathan Berant
Amir Globerson
132
124
0
09 Mar 2021
Deep Learning for Android Malware Defenses: a Systematic Literature Review
Yue Liu
Chakkrit Tantithamthavorn
Li Li
Yepang Liu
AAML
88
81
0
09 Mar 2021
Pretrained Transformers as Universal Computation Engines
Kevin Lu
Aditya Grover
Pieter Abbeel
Igor Mordatch
92
221
0
09 Mar 2021
Self-supervised Regularization for Text Classification
Meng Zhou
Zechen Li
P. Xie
62
16
0
09 Mar 2021
Iterative Shrinking for Referring Expression Grounding Using Deep Reinforcement Learning
Mingjie Sun
Jimin Xiao
Eng Gee Lim
ObjD
84
35
0
09 Mar 2021
Improving Document-Level Sentiment Classification Using Importance of Sentences
Gihyeon Choi
Shinhyeok Oh
H. Kim
66
27
0
09 Mar 2021
AfriVEC: Word Embedding Models for African Languages. Case Study of Fon and Nobiin
Bonaventure F. P. Dossou
Mohammed Sabry
87
3
0
08 Mar 2021
Few-Shot Learning of an Interleaved Text Summarization Model by Pretraining with Synthetic Data
Sanjeev Kumar Karn
Francine Chen
Yan-Ying Chen
Ulli Waltinger
Hinrich Schütze
AI4TS
42
8
0
08 Mar 2021
Instabilities of Offline RL with Pre-Trained Neural Representation
Ruosong Wang
Yifan Wu
Ruslan Salakhutdinov
Sham Kakade
OffRL
165
42
0
08 Mar 2021
Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models
Sam Bond-Taylor
Adam Leach
Yang Long
Chris G. Willcocks
VLM TPM
200
511
0
08 Mar 2021
Large Pre-trained Language Models Contain Human-like Biases of What is Right and Wrong to Do
P. Schramowski
Cigdem Turan
Nico Andersen
Constantin Rothkopf
Kristian Kersting
120
298
0
08 Mar 2021
Meta-Learning with MAML on Trees
Jezabel R. Garcia
Federica Freddi
F. Liao
Jamie McGowan
Tim Nieradzik
Da-shan Shiu
Ye Tian
A. Bernacchia
49
4
0
08 Mar 2021
Reverse Differentiation via Predictive Coding
Tommaso Salvatori
Yuhang Song
Thomas Lukasiewicz
Rafal Bogacz
Zhenghua Xu
PINN
85
28
0
08 Mar 2021
Anomaly Detection Based on Selection and Weighting in Latent Space
Yiwen Liao
Alexander Bartler
Binh Yang
DRL
65
13
0
08 Mar 2021
Behavior From the Void: Unsupervised Active Pre-Training
Hao Liu
Pieter Abbeel
VLM SSL
146
207
0
08 Mar 2021
"Sharks are not the threat humans are": Argument Component Segmentation in School Student Essays
Tariq Alhindi
Debanjan Ghosh
46
14
0
08 Mar 2021
Split Computing and Early Exiting for Deep Learning Applications: Survey and Research Challenges
Yoshitomo Matsubara
Marco Levorato
Francesco Restuccia
139
215
0
08 Mar 2021
SCNN: Swarm Characteristic Neural Network
Nguyen Ha Thanh
Le-Minh Nguyen
30
0
0
08 Mar 2021
LogBERT: Log Anomaly Detection via BERT
Haixuan Guo
Shuhan Yuan
Xintao Wu
95
225
0
07 Mar 2021
Improving Text-to-SQL with Schema Dependency Learning
Binyuan Hui
Xiang Shi
Ruiying Geng
Binhua Li
Yongbin Li
Jian Sun
Xiao-Dan Zhu
91
40
0
07 Mar 2021
Automatic Difficulty Classification of Arabic Sentences
Nouran Khallaf
S. Sharoff
34
13
0
07 Mar 2021
Syntax-BERT: Improving Pre-trained Transformers with Syntax Trees
Jiangang Bai
Yujing Wang
Yiren Chen
Yaming Yang
Jing Bai
Jiahao Yu
Yunhai Tong
88
104
0
07 Mar 2021
Robust Point Cloud Registration Framework Based on Deep Graph Matching
Kexue Fu
Shaolei Liu
Xiaoyuan Luo
Manning Wang
3DPC
82
212
0
07 Mar 2021
Changing the Narrative Perspective: From Deictic to Anaphoric Point of View
Mike Chen
Razvan Bunescu
38
6
0
06 Mar 2021
Extracting Semantic Process Information from the Natural Language in Event Logs
Adrian Rebmann
Han van der Aa
21
23
0
06 Mar 2021
Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision
Andrew Shin
Masato Ishii
T. Narihira
142
39
0
06 Mar 2021
Measuring Mathematical Problem Solving With the MATH Dataset
Dan Hendrycks
Collin Burns
Saurav Kadavath
Akul Arora
Steven Basart
Eric Tang
Basel Alomair
Jacob Steinhardt
ReLM FaML
262
2,415
0
05 Mar 2021
OperA: Attention-Regularized Transformers for Surgical Phase Recognition
Tobias Czempiel
Magdalini Paschali
D. Ostler
S. T. Kim
Benjamin Busam
Nassir Navab
MedIm
112
89
0
05 Mar 2021
Overcoming Poor Word Embeddings with Word Definitions
Christopher Malon
36
3
0
05 Mar 2021
AnswerQuest: A System for Generating Question-Answer Items from Multi-Paragraph Documents
Melissa Roemmele
Deep Sidhpura
Steve DeNeefe
Ling Tsou
RALM
41
6
0
05 Mar 2021
MalBERT: Using Transformers for Cybersecurity and Malicious Software Detection
Abir Rahali
M. Akhloufi
94
30
0
05 Mar 2021
A Convolutional Architecture for 3D Model Embedding
Arniel Labrada
B. Bustos
I. Sipiran
3DPC 3DV
32
5
0
05 Mar 2021
Fine-tuning Pretrained Multilingual BERT Model for Indonesian Aspect-based Sentiment Analysis
Annisa Nurul Azhar
M. L. Khodra
46
31
0
05 Mar 2021
A framework for fostering transparency in shared artificial intelligence models by increasing visibility of contributions
I. Barclay
Harrison Taylor
Alun D. Preece
Ian J. Taylor
D. Verma
Geeth de Mel
67
13
0
05 Mar 2021
WordBias: An Interactive Visual Tool for Discovering Intersectional Biases Encoded in Word Embeddings
Bhavya Ghai
Md. Naimul Hoque
Klaus Mueller
84
26
0
05 Mar 2021
Cycle Self-Training for Domain Adaptation
Hong Liu
Jianmin Wang
Mingsheng Long
141
180
0
05 Mar 2021
Can Pretext-Based Self-Supervised Learning Be Boosted by Downstream Data? A Theoretical Analysis
Jiaye Teng
Weiran Huang
Haowei He
SSL
87
12
0
05 Mar 2021
Dual Pointer Network for Fast Extraction of Multiple Relations in a Sentence
Seongsik Park
H. Kim
56
8
0
05 Mar 2021
Causal Attention for Vision-Language Tasks
Xu Yang
Hanwang Zhang
Guojun Qi
Jianfei Cai
CML
105
158
0
05 Mar 2021
Enhanced Aspect-Based Sentiment Analysis Models with Progressive Self-supervised Attention Learning
Jinsong Su
Jialong Tang
Hui Jiang
Ziyao Lu
Yubin Ge
Linfeng Song
Deyi Xiong
Le Sun
Jiebo Luo
67
50
0
05 Mar 2021
Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth
Yihe Dong
Jean-Baptiste Cordonnier
Andreas Loukas
184
388
0
05 Mar 2021
Neural model robustness for skill routing in large-scale conversational AI systems: A design choice exploration
Han Li
Sunghyun Park
Aswarth Abhilash Dara
Jinseok Nam
Sungjin Lee
Young-Bum Kim
Spyros Matsoukas
R. Sarikaya
73
9
0
04 Mar 2021
A Systematic Evaluation of Transfer Learning and Pseudo-labeling with BERT-based Ranking Models
Iurii Mokrii
Leonid Boytsov
Pavel Braslavski
90
26
0
04 Mar 2021
Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices
Max Ryabinin
Eduard A. Gorbunov
Vsevolod Plokhotnyuk
Gennady Pekhimenko
135
36
0
04 Mar 2021
Perceiver: General Perception with Iterative Attention
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM ViT MDE
218
1,029
0
04 Mar 2021
A Survey on Spoken Language Understanding: Recent Advances and New Frontiers
Libo Qin
Tianbao Xie
Wanxiang Che
Ting Liu
VLM
96
97
0
04 Mar 2021
CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation
Yutong Xie
Jianpeng Zhang
Chunhua Shen
Yong-quan Xia
ViT MedIm
103
504
0
04 Mar 2021
Hardware Acceleration of Fully Quantized BERT for Efficient Natural Language Processing
Zejian Liu
Gang Li
Jian Cheng
MQ
73
61
0
04 Mar 2021
Contrastive learning of strong-mixing continuous-time stochastic processes
Bingbin Liu
Pradeep Ravikumar
Andrej Risteski
SSL
59
3
0
03 Mar 2021