BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM SSL SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,699 papers shown
MDMMT: Multidomain Multimodal Transformer for Video Retrieval
Maksim Dzabraev
M. Kalashnikov
Stepan Alekseevich Komkov
Aleksandr Petiushko
101
134
0
19 Mar 2021
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
Stéphane d'Ascoli
Hugo Touvron
Matthew L. Leavitt
Ari S. Morcos
Giulio Biroli
Levent Sagun
ViT
147
836
0
19 Mar 2021
Interpretable Deep Learning: Interpretation, Interpretability, Trustworthiness, and Beyond
Xuhong Li
Haoyi Xiong
Xingjian Li
Xuanyu Wu
Xiao Zhang
Ji Liu
Jiang Bian
Dejing Dou
AAML FaML XAI HAI
84
344
0
19 Mar 2021
Masked Conditional Random Fields for Sequence Labeling
Tianwen Wei
Jianwei Qi
Shenghuang He
Songtao Sun
53
19
0
19 Mar 2021
Cost-effective Deployment of BERT Models in Serverless Environment
Katarína Benesová
Andrej Svec
Marek Suppa
63
2
0
19 Mar 2021
API2Com: On the Improvement of Automatically Generated Code Comments Using API Documentations
Ramin Shahbazi
Rishab Sharma
Fatemeh H. Fard
92
26
0
19 Mar 2021
GPNAS: A Neural Network Architecture Search Framework Based on Graphical Predictor
Dige Ai
Hong Zhang
AI4CE
53
0
0
19 Mar 2021
Scalable Vision Transformers with Hierarchical Pooling
Zizheng Pan
Bohan Zhuang
Jing Liu
Haoyu He
Jianfei Cai
ViT
95
130
0
19 Mar 2021
Boosting Adversarial Transferability through Enhanced Momentum
Xiaosen Wang
Jiadong Lin
Han Hu
Jingdong Wang
Kun He
AAML
119
77
0
19 Mar 2021
Extractive Summarization of Call Transcripts
Pratik K. Biswas
Aleksandr Iakubovich
57
10
0
19 Mar 2021
UNETR: Transformers for 3D Medical Image Segmentation
Ali Hatamizadeh
Yucheng Tang
Vishwesh Nath
Dong Yang
Andriy Myronenko
Bennett Landman
H. Roth
Daguang Xu
ViT MedIm
210
1,639
0
18 Mar 2021
Refining Language Models with Compositional Explanations
Huihan Yao
Ying Chen
Qinyuan Ye
Xisen Jin
Xiang Ren
91
36
0
18 Mar 2021
GPT Understands, Too
Xiao Liu
Yanan Zheng
Zhengxiao Du
Ming Ding
Yujie Qian
Zhilin Yang
Jie Tang
VLM
210
1,188
0
18 Mar 2021
GLM: General Language Model Pretraining with Autoregressive Blank Infilling
Zhengxiao Du
Yujie Qian
Xiao Liu
Ming Ding
J. Qiu
Zhilin Yang
Jie Tang
BDL AI4CE
174
1,568
0
18 Mar 2021
Structure Inducing Pre-Training
Matthew B. A. McDermott
Brendan Yap
Peter Szolovits
Marinka Zitnik
110
21
0
18 Mar 2021
Contextual Biasing of Language Models for Speech Recognition in Goal-Oriented Conversational Agents
Ashish Shenoy
S. Bodapati
Katrin Kirchhoff
45
5
0
18 Mar 2021
Modeling the Second Player in Distributionally Robust Optimization
Paul Michel
Tatsunori Hashimoto
Graham Neubig
83
33
0
18 Mar 2021
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning
Mandela Patrick
Yuki M. Asano
Bernie Huang
Ishan Misra
Florian Metze
Joao Henriques
Andrea Vedaldi
AI4TS
104
35
0
18 Mar 2021
Let-Mi: An Arabic Levantine Twitter Dataset for Misogynistic Language
Hala Mulki
Bilal Ghanem
87
41
0
18 Mar 2021
On Semantic Similarity in Video Retrieval
Michael Wray
Hazel Doughty
Dima Damen
99
69
0
18 Mar 2021
Constructive and Toxic Speech Detection for Open-domain Social Media Comments in Vietnamese
Luan Thanh Nguyen
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
91
22
0
18 Mar 2021
Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!
Xuanli He
Lingjuan Lyu
Xingliang Yuan
Lichao Sun
MIACV SILM
102
96
0
18 Mar 2021
You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module
Chenfeng Xu
Bohan Zhai
Bichen Wu
Tian Li
Wei Zhan
Peter Vajda
Kurt Keutzer
Masayoshi Tomizuka
ViT
94
23
0
18 Mar 2021
Hierarchical Attention-based Age Estimation and Bias Estimation
Shakediel Hiba
Y. Keller
CVBM
82
11
0
17 Mar 2021
Self-Supervised Learning of Audio Representations from Permutations with Differentiable Ranking
Andrew N. Carr
Quentin Berthet
Mathieu Blondel
O. Teboul
Neil Zeghidour
SSL
80
25
0
17 Mar 2021
Large-Scale Zero-Shot Image Classification from Rich and Diverse Textual Descriptions
Sebastian Bujwid
Josephine Sullivan
VLM
141
29
0
17 Mar 2021
Set-to-Sequence Methods in Machine Learning: a Review
Mateusz Jurewicz
Leon Derczynski
BDL
65
10
0
17 Mar 2021
UniParma at SemEval-2021 Task 5: Toxic Spans Detection Using CharacterBERT and Bag-of-Words Model
Akbar Karimi
L. Rossi
Andrea Prati
124
4
0
17 Mar 2021
SILT: Efficient transformer training for inter-lingual inference
Javier Huertas-Tato
Alejandro Martín
David Camacho
61
11
0
17 Mar 2021
Code Word Detection in Fraud Investigations using a Deep-Learning Approach
Y. Zee
J. Scholtes
Marcel Westerhoud
Julien Rossi
8
0
0
17 Mar 2021
On the Role of Images for Analyzing Claims in Social Media
Gullal Singh Cheema
Sherzod Hakimov
Eric Müller-Budack
Ralph Ewerth
116
10
0
17 Mar 2021
Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots
Samson Tan
Shafiq Joty
AAML
98
36
0
17 Mar 2021
ENCONTER: Entity Constrained Progressive Sequence Generation via Insertion-based Transformer
Lee-Hsun Hsieh
Yang-Yin Lee
Ee-Peng Lim
129
2
0
17 Mar 2021
Towards Few-Shot Fact-Checking via Perplexity
Nayeon Lee
Yejin Bang
Andrea Madotto
Madian Khabsa
Pascale Fung
AAML
54
93
0
17 Mar 2021
Dialogue History Matters! Personalized Response Selection in Multi-turn Retrieval-based Chatbots
Juntao Li
Chang Liu
Chongyang Tao
Zhangming Chan
Dongyan Zhao
Min Zhang
Rui Yan
57
25
0
17 Mar 2021
Investigating Monolingual and Multilingual BERT Models for Vietnamese Aspect Category Detection
D. Thin
Lac Si Le
V. Hoang
Ngan Luu-Thuy Nguyen
77
10
0
17 Mar 2021
OGB-LSC: A Large-Scale Challenge for Machine Learning on Graphs
Weihua Hu
Matthias Fey
Hongyu Ren
Maho Nakata
Yuxiao Dong
J. Leskovec
AI4CE
148
415
0
17 Mar 2021
Structural Adapters in Pretrained Language Models for AMR-to-text Generation
Leonardo F. R. Ribeiro
Yue Zhang
Iryna Gurevych
100
72
0
16 Mar 2021
Hebbian Semi-Supervised Learning in a Sample Efficiency Setting
Gabriele Lagani
Fabrizio Falchi
Claudio Gennaro
Giuseppe Amato
SSL
79
22
0
16 Mar 2021
Reweighting Augmented Samples by Minimizing the Maximal Expected Loss
Mingyang Yi
Lu Hou
Lifeng Shang
Xin Jiang
Qun Liu
Zhi-Ming Ma
135
20
0
16 Mar 2021
KGSynNet: A Novel Entity Synonyms Discovery Framework with Knowledge Graph
Yiying Yang
Xi Yin
Haiqing Yang
Xingjian Fei
Hao Peng
Kaijie Zhou
Kunfeng Lai
Jianping Shen
80
14
0
16 Mar 2021
LabelGit: A Dataset for Software Repositories Classification using Attributed Dependency Graphs
Cezar Sas
A. Capiluppi
31
2
0
16 Mar 2021
Learned Gradient Compression for Distributed Deep Learning
L. Abrahamyan
Yiming Chen
Giannis Bekoulis
Nikos Deligiannis
108
46
0
16 Mar 2021
Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models
Po-Yao (Bernie) Huang
Mandela Patrick
Junjie Hu
Graham Neubig
Florian Metze
Alexander G. Hauptmann
MLLM VLM
113
57
0
16 Mar 2021
Robustly Optimized and Distilled Training for Natural Language Understanding
Haytham ElFadeel
Stanislav Peshterliev
VLM OffRL
37
1
0
16 Mar 2021
Predicting Opioid Use Disorder from Longitudinal Healthcare Data using Multi-stream Transformer
S. Fouladvand
J. Talbert
L. Dwoskin
H. Bush
A. Meadows
Lars E. Peterson
Ramakanth Kavuluru
Jin Chen
91
4
0
16 Mar 2021
LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval
Siqi Sun
Yen-Chun Chen
Linjie Li
Shuohang Wang
Yuwei Fang
Jingjing Liu
VLM
89
84
0
16 Mar 2021
dictNN: A Dictionary-Enhanced CNN Approach for Classifying Hate Speech on Twitter
Maximilian Kupi
M. Bodnar
Nikolas Schmidt
Carlos Eduardo Posada
61
4
0
16 Mar 2021
Embedding Code Contexts for Cryptographic API Suggestion: New Methodologies and Comparisons
Ya Xiao
Salman Ahmed
Wen-Kai Song
Xinyang Ge
Bimal Viswanath
D. Yao
40
4
0
15 Mar 2021
The Effect of Domain and Diacritics in Yorùbá-English Neural Machine Translation
David Ifeoluwa Adelani
Dana Ruiter
Jesujoba Oluwadara Alabi
Damilola Adebonojo
Adesina Ayeni
Mofetoluwa Adeyemi
Ayodele Awokoya
C. España-Bonet
100
42
0
15 Mar 2021