BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM
    SSL
    SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 18,599 papers shown
Adma: A Flexible Loss Function for Neural Networks
A. Shrivastava
22
1
0
23 Jul 2020
Clustering of Social Media Messages for Humanitarian Aid Response during Crisis
Swati Padhee
T. K. Saha
Joel R. Tetreault
A. Jaimes
22
6
0
23 Jul 2020
Multi-Task Curriculum Framework for Open-Set Semi-Supervised Learning
Qing Yu
Daiki Ikami
Go Irie
Kiyoharu Aizawa
23
128
0
22 Jul 2020
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
436
596
0
21 Jul 2020
CovidDeep: SARS-CoV-2/COVID-19 Test Based on Wearable Medical Sensors and Efficient Neural Networks
Shayan Hassantabar
Novati Stefano
Vishweshwar Ghanakota
A. Ferrari
G. Nicola
R. Bruno
I. Marino
Kenza Hamidouche
N. Jha
23
69
0
20 Jul 2020
Conformer-Kernel with Query Term Independence for Document Retrieval
Bhaskar Mitra
Sebastian Hofstatter
Hamed Zamani
Nick Craswell
27
21
0
20 Jul 2020
A Comprehensive Evaluation of Multi-task Learning and Multi-task Pre-training on EHR Time-series Data
Matthew B. A. McDermott
Bret A. Nestor
Evan Kim
Wancong Zhang
Anna Goldenberg
Peter Szolovits
Marzyeh Ghassemi
AI4TS
22
16
0
20 Jul 2020
Length-Controllable Image Captioning
Chaorui Deng
Ning Ding
Mingkui Tan
Qi Wu
VLM
33
56
0
19 Jul 2020
Deep Learning Based Brain Tumor Segmentation: A Survey
Zhihua Liu
Lei Tong
Zheheng Jiang
Long Chen
Feixiang Zhou
Qianni Zhang
Xiangrong Zhang
Ling Li
Huiyu Zhou
3DV
31
228
0
18 Jul 2020
Compositional Generalization in Semantic Parsing: Pre-training vs. Specialized Architectures
Daniel Furrer
Marc van Zee
Nathan Scales
Nathanael Scharli
CoGe
26
113
0
17 Jul 2020
Knowledge-Based Video Question Answering with Unsupervised Scene Descriptions
Noa Garcia
Yuta Nakashima
28
32
0
17 Jul 2020
Modern Hopfield Networks and Attention for Immune Repertoire Classification
Michael Widrich
Bernhard Schafl
Hubert Ramsauer
Milena Pavlović
Lukas Gruber
...
Johannes Brandstetter
G. K. Sandve
Victor Greiff
Sepp Hochreiter
Günter Klambauer
193
117
0
16 Jul 2020
FTRANS: Energy-Efficient Acceleration of Transformers using FPGA
Bingbing Li
Santosh Pandey
Haowen Fang
Yanjun Lyv
Ji Li
Jieyang Chen
Mimi Xie
Lipeng Wan
Hang Liu
Caiwen Ding
AI4CE
16
170
0
16 Jul 2020
Hopfield Networks is All You Need
Hubert Ramsauer
Bernhard Schafl
Johannes Lehner
Philipp Seidl
Michael Widrich
...
David P. Kreil
Michael K Kopp
Günter Klambauer
Johannes Brandstetter
Sepp Hochreiter
26
416
0
16 Jul 2020
Hierarchical Interaction Networks with Rethinking Mechanism for Document-level Sentiment Analysis
Lingwei Wei
Dou Hu
Wei Zhou
Xuehai Tang
Xiaodan Zhang
Xin Wang
Jizhong Han
Songlin Hu
46
11
0
16 Jul 2020
Investigating Pretrained Language Models for Graph-to-Text Generation
Leonardo F. R. Ribeiro
Martin Schmitt
Hinrich Schütze
Iryna Gurevych
19
216
0
16 Jul 2020
SLK-NER: Exploiting Second-order Lexicon Knowledge for Chinese NER
Dou Hu
Lingwei Wei
21
25
0
16 Jul 2020
Deep Learning in Protein Structural Modeling and Design
Wenhao Gao
S. Mahajan
Jeremias Sulam
Jeffrey J. Gray
42
159
0
16 Jul 2020
Self-supervised Auxiliary Learning with Meta-paths for Heterogeneous Graphs
Dasol Hwang
Jinyoung Park
Sunyoung Kwon
KyungHyun Kim
Jung-Woo Ha
Hyunwoo J. Kim
44
67
0
16 Jul 2020
Learning from Noisy Labels with Deep Neural Networks: A Survey
Hwanjun Song
Minseok Kim
Dongmin Park
Yooju Shin
Jae-Gil Lee
NoLa
26
965
0
16 Jul 2020
LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning
Jian Liu
Leyang Cui
Hanmeng Liu
Dandan Huang
Yile Wang
Yue Zhang
RALM
25
342
0
16 Jul 2020
Attention-Based Query Expansion Learning
Albert Gordo
Filip Radenovic
Tamara L. Berg
27
32
0
15 Jul 2020
Fighting the COVID-19 Infodemic in Social Media: A Holistic Perspective and a Call to Arms
Firoj Alam
Fahim Dalvi
Shaden Shaar
Nadir Durrani
Hamdy Mubarak
...
Giovanni Da San Martino
Ahmed Abdelali
Hassan Sajjad
Kareem Darwish
Preslav Nakov
34
102
0
15 Jul 2020
Fine-Tune Longformer for Jointly Predicting Rumor Stance and Veracity
Anant Khandelwal
30
22
0
15 Jul 2020
UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data
Qianhui Wu
Zijia Lin
Börje F. Karlsson
Biqing Huang
Jian-Guang Lou
21
46
0
15 Jul 2020
Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks
Pavel Blinov
Manvel Avetisian
V. Kokh
Dmitry Umerenkov
Alexander Tuzhilin
37
19
0
15 Jul 2020
Emoji Prediction: Extensions and Benchmarking
Weicheng Ma
Ruibo Liu
Lili Wang
Soroush Vosoughi
19
19
0
14 Jul 2020
Deep learning models for representing out-of-vocabulary words
Johannes V. Lochter
Renato M. Silva
Tiago A. Almeida
22
15
0
14 Jul 2020
Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning
Shauharda Khadka
Estelle Aflalo
Mattias Marder
Avrech Ben-David
Santiago Miret
Shie Mannor
Tamir Hazan
Hanlin Tang
Somdeb Majumdar
GNN
32
11
0
14 Jul 2020
CoreGen: Contextualized Code Representation Learning for Commit Message Generation
L. Nie
Cuiyun Gao
Zhicong Zhong
Wai Lam
Yang Liu
Zenglin Xu
29
46
0
14 Jul 2020
Compare and Reweight: Distinctive Image Captioning Using Similar Images Sets
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
37
45
0
14 Jul 2020
An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models
Lifu Tu
Garima Lalwani
Spandana Gella
He He
LRM
33
184
0
14 Jul 2020
Can neural networks acquire a structural bias from raw linguistic data?
Alex Warstadt
Samuel R. Bowman
AI4CE
20
53
0
14 Jul 2020
T-Basis: a Compact Representation for Neural Networks
Anton Obukhov
M. Rakhuba
Stamatios Georgoulis
Menelaos Kanakis
Dengxin Dai
Luc Van Gool
41
27
0
13 Jul 2020
Learning Reasoning Strategies in End-to-End Differentiable Proving
Pasquale Minervini
Sebastian Riedel
Pontus Stenetorp
Edward Grefenstette
Tim Rocktaschel
LRM
45
96
0
13 Jul 2020
Reducing Language Biases in Visual Question Answering with Visually-Grounded Question Encoder
K. Gouthaman
Anurag Mittal
52
78
0
13 Jul 2020
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech
Andy T. Liu
Shang-Wen Li
Hung-yi Lee
SSL
67
356
0
12 Jul 2020
Stance Detection in Web and Social Media: A Comparative Study
Shalmoli Ghosh
Prajwal Singhania
Siddharth Singh
Koustav Rudra
Saptarshi Ghosh
11
76
0
12 Jul 2020
Is Machine Learning Speaking my Language? A Critical Look at the NLP-Pipeline Across 8 Human Languages
Esma Wali
Yan Chen
Christopher Mahoney
Thomas Middleton
M. Babaeianjelodar
Mariama Njie
Jeanna Neefe Matthews
19
9
0
11 Jul 2020
Transformer-XL Based Music Generation with Multiple Sequences of Time-valued Notes
Xianchao Wu
Chengyuan Wang
Qinying Lei
22
19
0
11 Jul 2020
Neural Knowledge Extraction From Cloud Service Incidents
Manish Shetty
Chetan Bansal
Sumit Kumar
Nikitha Rao
Nachiappan Nagappan
Thomas Zimmermann
31
17
0
10 Jul 2020
One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control
Wenlong Huang
Igor Mordatch
Deepak Pathak
51
167
0
09 Jul 2020
Generalized Few-Shot Video Classification with Video Retrieval and Feature Generation
Yongqin Xian
Bruno Korbar
Matthijs Douze
Lorenzo Torresani
Bernt Schiele
Zeynep Akata
VGen
18
18
0
09 Jul 2020
Learning Speech Representations from Raw Audio by Joint Audiovisual Self-Supervision
Abhinav Shukla
Stavros Petridis
Maja Pantic
SSL
32
16
0
08 Jul 2020
Remix: Rebalanced Mixup
Hsin-Ping Chou
Shih-Chieh Chang
Jia-Yu Pan
Wei Wei
Da-Cheng Juan
41
232
0
08 Jul 2020
Targeting the Benchmark: On Methodology in Current Natural Language Processing Research
David Schlangen
33
57
0
07 Jul 2020
Continual BERT: Continual Learning for Adaptive Extractive Summarization of COVID-19 Literature
Jongjin Park
CLL
33
15
0
07 Jul 2020
DS-Sync: Addressing Network Bottlenecks with Divide-and-Shuffle Synchronization for Distributed DNN Training
Weiyan Wang
Cengguang Zhang
Liu Yang
Kai Chen
Kun Tan
34
12
0
07 Jul 2020
Deep Contextual Embeddings for Address Classification in E-commerce
Shreyas Mangalgi
Lakshya Kumar
Ravindra Babu Tallamraju
25
8
0
06 Jul 2020
DART: Open-Domain Structured Data Record to Text Generation
Linyong Nan
Dragomir R. Radev
Rui Zhang
Amrit Rau
Abhinand Sivaprasad
...
Y. Tan
Xi Lin
Caiming Xiong
R. Socher
Nazneen Rajani
17
199
0
06 Jul 2020