ResearchTrend.AI


BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,641 papers shown
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning
Jun Chen
Han Guo
Kai Yi
Boyang Albert Li
Mohamed Elhoseiny
VLM
166
227
0
20 Feb 2021
On Calibration and Out-of-domain Generalization
Yoav Wald
Amir Feder
D. Greenfeld
Uri Shalit
OODD
151
158
0
20 Feb 2021
Evolving Attention with Residual Convolutions
Yujing Wang
Yaming Yang
Jiangang Bai
Mingliang Zhang
Jing Bai
Jiahao Yu
Ce Zhang
Gao Huang
Yunhai Tong
ViT
112
34
0
20 Feb 2021
Predicting times of waiting on red signals using BERT
Witold Szejgis
Anna Warno
P. Góra
32
1
0
20 Feb 2021
Multilingual Answer Sentence Reranking via Automatically Translated Data
Thuy Vu
Alessandro Moschitti
66
5
0
20 Feb 2021
Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction
Benfeng Xu
Quan Wang
Yajuan Lyu
Yong Zhu
Zhendong Mao
109
170
0
20 Feb 2021
Hard-Attention for Scalable Image Classification
Athanasios Papadopoulos
Pawel Korus
N. Memon
125
25
0
20 Feb 2021
Latent Variable Sequential Set Transformers For Joint Multi-Agent Motion Prediction
Roger Girgis
Florian Golemo
Felipe Codevilla
Martin Weiss
Jim Aldon D’Souza
Samira Ebrahimi Kahou
Felix Heide
C. Pal
94
133
0
19 Feb 2021
Hate-Alert@DravidianLangTech-EACL2021: Ensembling strategies for Transformer-based Offensive language Detection
Debjoy Saha
Naman Paharia
Debajit Chakraborty
Punyajoy Saha
Animesh Mukherjee
39
38
0
19 Feb 2021
Analyzing Curriculum Learning for Sentiment Analysis along Task Difficulty, Pacing and Visualization Axes
Anvesh Rao Vijjini
Kaveri Anuranjana
R. Mamidi
70
3
0
19 Feb 2021
Towards Emotion Recognition in Hindi-English Code-Mixed Data: A Transformer Based Approach
Anshul Wadhawan
Akshita Aggarwal
66
32
0
19 Feb 2021
Back to Prior Knowledge: Joint Event Causality Extraction via Convolutional Semantic Infusion
Zijian Wang
Hao Wang
Xiangfeng Luo
Jianqi Gao
25
4
0
19 Feb 2021
End-to-End Neural Systems for Automatic Children Speech Recognition: An Empirical Study
Prashanth Gurunath Shivakumar
Shrikanth Narayanan
58
54
0
19 Feb 2021
Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input
Brooke Stephenson
Thomas Hueber
Laurent Girin
Laurent Besacier
91
10
0
19 Feb 2021
An Empirical Study on Measuring the Similarity of Sentential Arguments with Language Model Domain Adaptation
Yujin Baek
Sang-gyu Seo
36
0
0
19 Feb 2021
Scaling Creative Inspiration with Fine-Grained Functional Aspects of Ideas
Tom Hope
Ronen Tamari
Hyeonsu B Kang
Daniel Hershcovich
Joel Chan
A. Kittur
Dafna Shahaf
42
19
0
19 Feb 2021
Symplectic Adjoint Method for Exact Gradient of Neural ODE with Minimal Memory
Takashi Matsubara
Yuto Miyatake
Takaharu Yaguchi
71
23
0
19 Feb 2021
WebRED: Effective Pretraining And Finetuning For Relation Extraction On The Web
Róbert Ormándi
Mohammad Saleh
Erin Winter
Vinay Rao
51
11
0
18 Feb 2021
MUDES: Multilingual Detection of Offensive Spans
Tharindu Ranasinghe
Marcos Zampieri
83
41
0
18 Feb 2021
A Systematic Review of Natural Language Processing Applied to Radiology Reports
Arlene Casey
Emma Davidson
Michael Poon
Hang Dong
Daniel Duma
...
Víctor Suárez-Paniagua
Richard Tobin
William Whiteley
Honghan Wu
Beatrice Alex
AI4CE
41
150
0
18 Feb 2021
Deep Learning for Suicide and Depression Identification with Unsupervised Label Correction
Ayaan Haque
V. Reddi
Tyler Giallanza
NoLa
62
60
0
18 Feb 2021
Meta-Transfer Learning for Low-Resource Abstractive Summarization
Yi-Syuan Chen
Hong-Han Shuai
CLL, OffRL
103
39
0
18 Feb 2021
Training Large-Scale News Recommenders with Pretrained Language Models in the Loop
Shitao Xiao
Zheng Liu
Yingxia Shao
Tao Di
Xing Xie
VLM, AIFin
186
42
0
18 Feb 2021
Composable Generative Models
Johan Leduc
Nicolas Grislain
SyDa
84
4
0
18 Feb 2021
Less is More: Pre-train a Strong Text Encoder for Dense Retrieval Using a Weak Decoder
Shuqi Lu
Di He
Chenyan Xiong
Guolin Ke
Waleed Malik
Zhicheng Dou
Paul N. Bennett
Tie-Yan Liu
Arnold Overwijk
RALM
124
11
0
18 Feb 2021
Entity-level Factual Consistency of Abstractive Text Summarization
Feng Nan
Ramesh Nallapati
Zhiguo Wang
Cicero Nogueira dos Santos
Henghui Zhu
Dejiao Zhang
Kathleen McKeown
Bing Xiang
HILM
205
161
0
18 Feb 2021
Quiz-Style Question Generation for News Stories
Á. Lelkes
Vinh Q. Tran
Cong Yu
84
42
0
18 Feb 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
573
1,143
0
17 Feb 2021
Towards generalisable hate speech detection: a review on obstacles and solutions
Wenjie Yin
A. Zubiaga
189
169
0
17 Feb 2021
SciDr at SDU-2020: IDEAS -- Identifying and Disambiguating Everyday Acronyms for Scientific Domain
Aadarsh Singh
Priyanshu Kumar
54
9
0
17 Feb 2021
Decoding EEG Brain Activity for Multi-Modal Natural Language Processing
Nora Hollenstein
Cédric Renggli
B. Glaus
Maria Barrett
M. Troendle
N. Langer
Ce Zhang
137
35
0
17 Feb 2021
Open-Retrieval Conversational Machine Reading
Yifan Gao
Jingjing Li
Chien-Sheng Wu
Michael R. Lyu
Irwin King
124
17
0
17 Feb 2021
First Target and Opinion then Polarity: Enhancing Target-opinion Correlation for Aspect Sentiment Triplet Extraction
Lianzhe Huang
Peiyi Wang
Sujian Li
Tianyu Liu
Xiaodong Zhang
Zhicong Cheng
D. Yin
Houfeng Wang
238
28
0
17 Feb 2021
TCN: Table Convolutional Network for Web Table Interpretation
Daheng Wang
Prashant Shiralkar
Colin Lockard
Binxuan Huang
Xin Luna Dong
Meng Jiang
LMTD
69
55
0
17 Feb 2021
Transferability of Neural Network Clinical De-identification Systems
Kahyun Lee
Nicholas J. Dobbins
Bridget T. McInnes
Meliha Yetisgen
Özlem Uzuner
OOD
61
5
0
17 Feb 2021
A Context-Enhanced De-identification System
Kahyun Lee
M. Kayaalp
Sam Henry
Özlem Uzuner
68
3
0
17 Feb 2021
Highly Fast Text Segmentation With Pairwise Markov Chains
E. Azeraf
E. Monfrini
Emmanuel Vignon
W. Pieczynski
56
5
0
17 Feb 2021
COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Yu Meng
Chenyan Xiong
Payal Bajaj
Saurabh Tiwary
Paul N. Bennett
Jiawei Han
Xia Song
184
206
0
16 Feb 2021
IntSGD: Adaptive Floatless Compression of Stochastic Gradients
Konstantin Mishchenko
Bokun Wang
D. Kovalev
Peter Richtárik
107
15
0
16 Feb 2021
Conversations Gone Alright: Quantifying and Predicting Prosocial Outcomes in Online Conversations
Jiajun Bao
J. Wu
Yiming Zhang
Eshwar Chandrasekharan
David Jurgens
117
49
0
16 Feb 2021
Boosting Low-Resource Biomedical QA via Entity-Aware Masking Strategies
Gabriele Pergola
E. Kochkina
Lin Gui
Maria Liakata
Yulan He
145
32
0
16 Feb 2021
NoiseQA: Challenge Set Evaluation for User-Centric Question Answering
Abhilasha Ravichander
Siddharth Dalmia
Maria Ryskina
Florian Metze
Eduard H. Hovy
A. Black
ELM
59
32
0
16 Feb 2021
Dataset Condensation with Differentiable Siamese Augmentation
Bo Zhao
Hakan Bilen
DD
295
305
0
16 Feb 2021
Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet
M. O. Topal
Anil Bas
Imke van Heerden
LLMAG, AI4CE
73
91
0
16 Feb 2021
Improving speech recognition models with small samples for air traffic control systems
Yi Lin
Qin Li
Bo Yang
Zhen Yan
Huachun Tan
Zhengmao Chen
104
32
0
16 Feb 2021
TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models
Zhuohan Li
Siyuan Zhuang
Shiyuan Guo
Danyang Zhuo
Hao Zhang
Basel Alomair
Ion Stoica
MoE
104
125
0
16 Feb 2021
FEWS: Large-Scale, Low-Shot Word Sense Disambiguation with the Dictionary
Terra Blevins
Mandar Joshi
Luke Zettlemoyer
90
21
0
16 Feb 2021
Training Larger Networks for Deep Reinforcement Learning
Keita Ota
Devesh K. Jha
Asako Kanezaki
OffRL
97
40
0
16 Feb 2021
Few-Shot Graph Learning for Molecular Property Prediction
Zhichun Guo
Chuxu Zhang
Wenhao Yu
John E. Herr
Olaf Wiest
Meng Jiang
Nitesh Chawla
AI4CE
177
176
0
16 Feb 2021
Within-Document Event Coreference with BERT-Based Contextualized Representations
Shafiuddin Rehan Ahmed
James H. Martin
20
0
0
15 Feb 2021