ResearchTrend.AI

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM SSL SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,688 papers shown
Dual-view Molecule Pre-training
Jinhua Zhu
Yingce Xia
Tao Qin
Wen-gang Zhou
Houqiang Li
Tie-Yan Liu
AI4CE
116
52
0
17 Jun 2021
Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning
Colin Wei
Sang Michael Xie
Tengyu Ma
156
100
0
17 Jun 2021
Long-Short Temporal Contrastive Learning of Video Transformers
Jue Wang
Gedas Bertasius
Du Tran
Lorenzo Torresani
VLM ViT
153
50
0
17 Jun 2021
Improving DNN Fault Tolerance using Weight Pruning and Differential Crossbar Mapping for ReRAM-based Edge AI
Geng Yuan
Zhiheng Liao
Xiaolong Ma
Yuxuan Cai
Zhenglun Kong
...
Hongwu Peng
Ning Liu
Ao Ren
Jinhui Wang
Yanzhi Wang
AAML
108
33
0
16 Jun 2021
A Simple Fix to Mahalanobis Distance for Improving Near-OOD Detection
Jie Jessie Ren
Stanislav Fort
J. Liu
Abhijit Guha Roy
Shreyas Padhy
Balaji Lakshminarayanan
UQCV
200
227
0
16 Jun 2021
QuantumFed: A Federated Learning Framework for Collaborative Quantum Training
Qi Xia
Qun Li
FedML
113
31
0
16 Jun 2021
Specializing Multilingual Language Models: An Empirical Study
Ethan C. Chau
Noah A. Smith
139
27
0
16 Jun 2021
Disentangling Online Chats with DAG-Structured LSTMs
D. Pappadopulo
Lisa Bauer
M. Farina
Ozan Irsoy
Joey Tianyi Zhou
66
5
0
16 Jun 2021
Invertible Attention
Jiajun Zha
Yiran Zhong
Jing Zhang
Leonid Sigal
Liang Zheng
82
7
0
16 Jun 2021
Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better
Gaurav Menghani
VLM MedIm
120
391
0
16 Jun 2021
Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data
Haoming Jiang
Danqing Zhang
Tianyu Cao
Bing Yin
T. Zhao
NoLa
82
46
0
16 Jun 2021
A Fair and Comprehensive Comparison of Multimodal Tweet Sentiment Analysis Methods
Gullal Singh Cheema
Sherzod Hakimov
Eric Müller-Budack
Ralph Ewerth
68
20
0
16 Jun 2021
Eigen Analysis of Self-Attention and its Reconstruction from Partial Computation
Srinadh Bhojanapalli
Ayan Chakrabarti
Himanshu Jain
Sanjiv Kumar
Michal Lukasik
Andreas Veit
70
8
0
16 Jun 2021
To Raise or Not To Raise: The Autonomous Learning Rate Question
Xiaomeng Dong
Tao Tan
Michael Potter
Yun-Chan Tsai
Gaurav Kumar
V. R. Saripalli
Theodore Trafalis
OOD
50
2
0
16 Jun 2021
Input Invex Neural Network
Suman Sapkota
Binod Bhattarai
50
4
0
16 Jun 2021
Memorization and Generalization in Neural Code Intelligence Models
Md Rafiqul Islam Rabin
Aftab Hussain
Mohammad Amin Alipour
Vincent J. Hellendoorn
TDI
87
43
0
16 Jun 2021
On the proper role of linguistically-oriented deep net analysis in linguistic theorizing
Marco Baroni
135
53
0
16 Jun 2021
Alzheimer's Disease Detection from Spontaneous Speech through Combining Linguistic Complexity and (Dis)Fluency Features with Pretrained Language Models
Yu Qiao
Xuefeng Yin
Daniel Wiechmann
E. Kerz
75
23
0
16 Jun 2021
Eider: Empowering Document-level Relation Extraction with Efficient Evidence Extraction and Inference-stage Fusion
Yiqing Xie
Jiaming Shen
Sha Li
Yuning Mao
Jiawei Han
87
69
0
16 Jun 2021
Semantic sentence similarity: size does not always matter
Danny Merkx
S. Frank
M. Ernestus
46
6
0
16 Jun 2021
From Discourse to Narrative: Knowledge Projection for Event Relation Extraction
Jialong Tang
Hongyu Lin
M. Liao
Yaojie Lu
Xianpei Han
Le Sun
Weijian Xie
Jin Xu
67
24
0
16 Jun 2021
Out-of-Scope Intent Detection with Self-Supervision and Discriminative Training
Li-Ming Zhan
Haowen Liang
Bo Liu
Lu Fan
Xiao-Ming Wu
Albert Y. S. Lam
OODD
63
80
0
16 Jun 2021
Coreference-Aware Dialogue Summarization
Zhengyuan Liu
Ke Shi
Nancy F. Chen
82
60
0
16 Jun 2021
Scene Transformer: A unified architecture for predicting multiple agent trajectories
Jiquan Ngiam
Benjamin Caine
Vijay Vasudevan
Zhengdong Zhang
H. Chiang
...
Ashish Venugopal
David J. Weiss
Benjamin Sapp
Zhifeng Chen
Jonathon Shlens
131
168
0
15 Jun 2021
Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors
Junayed Mahmud
Fahim Faisal
Raihan Islam Arnob
Antonios Anastasopoulos
Kevin Moran
127
20
0
15 Jun 2021
Adversarial Attacks on Deep Models for Financial Transaction Records
I. Fursov
Matvey Morozov
N. Kaploukhaya
Elizaveta Kovtun
Rodrigo Rivera-Castro
Gleb Gusev
Dmitrii Babaev
Ivan Kireev
Alexey Zaytsev
Evgeny Burnaev
AAML
85
38
0
15 Jun 2021
Development of Quantized DNN Library for Exact Hardware Emulation
M. Kiyama
Motoki Amagasaki
M. Iida
MQ
35
0
0
15 Jun 2021
First Place Solution of KDD Cup 2021 & OGB Large-Scale Challenge Graph Prediction Track
Chengxuan Ying
Mingqi Yang
Shuxin Zheng
Guolin Ke
Shengjie Luo
Tianle Cai
Chenglin Wu
Yuxin Wang
Yanming Shen
Di He
49
11
0
15 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
411
2,858
0
15 Jun 2021
Interpretable Self-supervised Multi-task Learning for COVID-19 Information Retrieval and Extraction
Nima Ebadi
Peyman Najafirad
41
0
0
15 Jun 2021
PairConnect: A Compute-Efficient MLP Alternative to Attention
Zhaozhuo Xu
Minghao Yan
Junyan Zhang
Anshumali Shrivastava
52
1
0
15 Jun 2021
Consistency Regularization for Cross-Lingual Fine-Tuning
Bo Zheng
Li Dong
Shaohan Huang
Wenhui Wang
Zewen Chi
Saksham Singhal
Wanxiang Che
Ting Liu
Xia Song
Furu Wei
62
58
0
15 Jun 2021
Question Answering Infused Pre-training of General-Purpose Contextualized Representations
Robin Jia
M. Lewis
Luke Zettlemoyer
82
29
0
15 Jun 2021
Direction is what you need: Improving Word Embedding Compression in Large Language Models
Klaudia Bałazy
Mohammadreza Banaei
R. Lebret
Jacek Tabor
Karl Aberer
55
7
0
15 Jun 2021
Evaluating Modules in Graph Contrastive Learning
Ganqu Cui
Y. Du
Cheng Yang
Jie Zhou
Liang Xu
Xing Zhou
Lifeng Wang
Zhiyuan Liu
59
4
0
15 Jun 2021
CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark
Ningyu Zhang
Mosha Chen
Zhen Bi
Xiaozhuan Liang
Lei Li
...
Jun Yan
Hongying Zan
Kunli Zhang
Buzhou Tang
Qingcai Chen
LM&MA ELM
93
193
0
15 Jun 2021
Multivariate Business Process Representation Learning utilizing Gramian Angular Fields and Convolutional Neural Networks
P. Pfeiffer
Johannes Lahann
Peter Fettke
SSL
66
17
0
15 Jun 2021
Zero-shot Node Classification with Decomposed Graph Prototype Network
Zheng Wang
Jialong Wang
Yuchen Guo
Zhiguo Gong
74
37
0
15 Jun 2021
Unsupervised Abstractive Opinion Summarization by Generating Sentences with Tree-Structured Topic Guidance
Masaru Isonuma
Junichiro Mori
Danushka Bollegala
Ichiro Sakata
61
27
0
15 Jun 2021
Simple GNN Regularisation for 3D Molecular Property Prediction & Beyond
Jonathan Godwin
Michael Schaarschmidt
Alex Gaunt
Alvaro Sanchez-Gonzalez
Yulia Rubanova
Petar Veličković
J. Kirkpatrick
Peter W. Battaglia
109
60
0
15 Jun 2021
Incorporating Word Sense Disambiguation in Neural Language Models
Jan Philip Wahle
Terry Ruas
Norman Meuschke
Bela Gipp
67
11
0
15 Jun 2021
Deriving Word Vectors from Contextualized Language Models using Topic-Aware Mention Selection
Yixiao Wang
Zied Bouraoui
Luis Espinosa Anke
Steven Schockaert
46
4
0
15 Jun 2021
Modeling morphology with Linear Discriminative Learning: considerations and design choices
Maria Heitmeier
Yu-Ying Chuang
R. Harald Baayen
86
20
0
15 Jun 2021
BERT Embeddings for Automatic Readability Assessment
Joseph Marvin Imperial
67
39
0
15 Jun 2021
An Automated Quality Evaluation Framework of Psychotherapy Conversations with Local Quality Estimates
Zhuohao Chen
Nikolaos Flemotomos
Karan Singla
Torrey A. Creed
David C. Atkins
Shrikanth Narayanan
68
5
0
15 Jun 2021
Vision-Language Navigation with Random Environmental Mixup
Chong Liu
Fengda Zhu
Xiaojun Chang
Xiaodan Liang
Zongyuan Ge
Yi-Dong Shen
LM&Ro
150
88
0
15 Jun 2021
Teacher-Student MixIT for Unsupervised and Semi-supervised Speech Separation
Jisi Zhang
Catalin Zorila
R. Doddipatla
Jon Barker
56
22
0
15 Jun 2021
Assessing the Use of Prosody in Constituency Parsing of Imperfect Transcripts
Trang Tran
Mari Ostendorf
53
5
0
14 Jun 2021
Can BERT Dig It? -- Named Entity Recognition for Information Retrieval in the Archaeology Domain
Alex Brandsen
Suzan Verberne
K. Lambers
M. Wansleeben
65
38
0
14 Jun 2021
Improving Paraphrase Detection with the Adversarial Paraphrasing Task
Animesh Nighojkar
John Licato
70
39
0
14 Jun 2021