ResearchTrend.AI

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM, SSL, SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,688 papers shown
SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification
Sai Muralidhar Jayanthi
Akshat Gupta
VLM
63
31
0
01 Feb 2021
Measuring and Improving Consistency in Pretrained Language Models
Yanai Elazar
Nora Kassner
Shauli Ravfogel
Abhilasha Ravichander
Eduard H. Hovy
Hinrich Schütze
Yoav Goldberg
HILM
359
371
0
01 Feb 2021
Revisiting the Prepositional-Phrase Attachment Problem Using Explicit Commonsense Knowledge
Yida Xin
H. Lieberman
Peter Chin
KELM
61
1
0
01 Feb 2021
Text-to-hashtag Generation using Seq2seq Learning
A. Camargo
Wesley Carvalho
Felipe Peressim
Alan Barzilay
Marcelo Finger
55
1
0
01 Feb 2021
Civil Rephrases Of Toxic Texts With Self-Supervised Transformers
Leo Laugier
John Pavlopoulos
Jeffrey Scott Sorensen
Lucas Dixon
101
48
0
01 Feb 2021
Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained Language Models
Nora Kassner
Philipp Dufter
Hinrich Schütze
105
141
0
01 Feb 2021
Automatic Expansion of Domain-Specific Affective Models for Web Intelligence Applications
A. Weichselbraun
Jakob Steixner
Adrian Brasoveanu
A. Scharl
Max C. Göbel
L. Nixon
58
11
0
01 Feb 2021
Many Hands Make Light Work: Using Essay Traits to Automatically Score Essays
Rahul Kumar
Sandeep Albert Mathias
S. Saha
P. Bhattacharyya
75
30
0
01 Feb 2021
Video Transformer Network
Daniel Neimark
Omri Bar
Maya Zohar
Dotan Asselmann
ViT
313
434
0
01 Feb 2021
Hierarchical Ranking for Answer Selection
Hang Gao
Mengting Hu
Renhong Cheng
Tiegang Gao
30
1
0
01 Feb 2021
Short Text Clustering with Transformers
Leonid Pugachev
Andrey Kravchenko
VLM
40
11
0
31 Jan 2021
Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers
Lisa Anne Hendricks
John F. J. Mellor
R. Schneider
Jean-Baptiste Alayrac
Aida Nematzadeh
150
117
0
31 Jan 2021
A Runtime-Based Computational Performance Predictor for Deep Neural Network Training
Geoffrey X. Yu
Yubo Gao
P. Golikov
Gennady Pekhimenko
3DH
69
68
0
31 Jan 2021
TruthBot: An Automated Conversational Tool for Intent Learning, Curated Information Presenting, and Fake News Alerting
Ankur Gupta
Yash Varun
Prarthana Das
Nithya Muttineni
Parth Srivastava
Hamim Zafar
Tanmoy Chakraborty
Swaprava Nath
39
7
0
31 Jan 2021
Extending Neural Keyword Extraction with TF-IDF tagset matching
Boshko Koloski
Senja Pollak
Blaž Škrlj
Matej Martinc
33
10
0
31 Jan 2021
Adversarial Contrastive Pre-training for Protein Sequences
Matthew B. A. McDermott
Brendan Yap
Harry Hsu
Di Jin
Peter Szolovits
AAML
92
10
0
31 Jan 2021
Introduction of a novel word embedding approach based on technology labels extracted from patent data
M. Standke
Abdullah Kiwan
Annalena Lange
Silvan Berg
30
0
0
31 Jan 2021
An Empirical Study on the Generalization Power of Neural Representations Learned via Visual Guessing Games
Alessandro Suglia
Yonatan Bisk
Ioannis Konstas
Antonio Vergari
E. Bastianelli
Andrea Vanzo
Oliver Lemon
40
8
0
31 Jan 2021
The distance between the weights of the neural network is meaningful
Liqun Yang
Yijun Yang
Yao Wang
Zhenyu Yang
Wei Zeng
46
0
0
31 Jan 2021
Classification Models for Partially Ordered Sequences
Stephanie Ger
Diego Klabjan
J. Utke
31
0
0
31 Jan 2021
Speech Recognition by Simply Fine-tuning BERT
Wen-Chin Huang
Chia-Hua Wu
Shang-Bao Luo
Kuan-Yu Chen
Hsin-Min Wang
Tomoki Toda
126
28
0
30 Jan 2021
EmpathBERT: A BERT-based Framework for Demographic-aware Empathy Prediction
Bhanu Prakash Reddy Guda
Aparna Garimella
Niyati Chhaya
71
34
0
30 Jan 2021
Adversarially learning disentangled speech representations for robust multi-factor voice conversion
Jie Wang
Jingbei Li
Xintao Zhao
Zhiyong Wu
Shiyin Kang
Helen Meng
DRL
129
29
0
30 Jan 2021
Can We Automate Scientific Reviewing?
Weizhe Yuan
Pengfei Liu
Graham Neubig
168
90
0
30 Jan 2021
NLPBK at VLSP-2020 shared task: Compose transformer pretrained models for Reliable Intelligence Identification on Social network
Thanh C. Nguyen
V. Nguyen
ViT
60
4
0
29 Jan 2021
CD2CR: Co-reference Resolution Across Documents and Domains
James Ravenscroft
Arie Cattan
A. Clare
Ido Dagan
Maria Liakata
139
8
0
29 Jan 2021
Does injecting linguistic structure into language models lead to better alignment with brain recordings?
Mostafa Abdou
Ana Valeria González
Mariya Toneva
Daniel Hershcovich
Anders Søgaard
75
16
0
29 Jan 2021
Peeler: Profiling Kernel-Level Events to Detect Ransomware
Muhammad Ejaz Ahmed
Hyoungshick Kim
S. Çamtepe
Surya Nepal
36
28
0
29 Jan 2021
Combining pre-trained language models and structured knowledge
Pedro Colon-Hernandez
Catherine Havasi
Jason B. Alonso
Matthew Huggins
C. Breazeal
KELM
93
48
0
28 Jan 2021
Self-Attention Meta-Learner for Continual Learning
Ghada Sokar
Decebal Constantin Mocanu
Mykola Pechenizkiy
CLL
53
11
0
28 Jan 2021
A Neural Few-Shot Text Classification Reality Check
Thomas Dopierre
Christophe Gravier
Wilfried Logerais
VLM
61
19
0
28 Jan 2021
VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs
Xudong Lin
Gedas Bertasius
Jue Wang
Shih-Fu Chang
Devi Parikh
Lorenzo Torresani
VGen
104
67
0
28 Jan 2021
BENDR: using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data
Demetres Kostas
Stephane Aroca-Ouellette
Frank Rudzicz
SSL
120
210
0
28 Jan 2021
A transformer based approach for fighting COVID-19 fake news
S. M. S. Shifath
Mohammad Faiyaz Khan
Md. Saiful Islam
MedIm
66
23
0
28 Jan 2021
BERTaú: Itaú BERT for digital customer service
Paulo Finardi
José Dié Viegas
Gustavo T. Ferreira
Alex F. Mansano
Vinicius Fernandes Caridá
64
11
0
28 Jan 2021
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
Li-xin Yuan
Yunpeng Chen
Tao Wang
Weihao Yu
Yujun Shi
Zihang Jiang
Francis E. H. Tay
Jiashi Feng
Shuicheng Yan
ViT
231
1,957
0
28 Jan 2021
Disembodied Machine Learning: On the Illusion of Objectivity in NLP
Zeerak Talat
Smarika Lulz
Joachim Bingel
Isabelle Augenstein
172
51
0
28 Jan 2021
Attention Guided Dialogue State Tracking with Sparse Supervision
Shuailong Liang
Lahari Poddar
Gyuri Szarvas
67
4
0
28 Jan 2021
Identifying COVID-19 Fake News in Social Media
Tathagata Raha
Vijayasaradhi Indurthi
Aayush Upadhyaya
Jeevesh Kataria
Pramud Bommakanti
Vikram Keswani
Vasudeva Varma
GNN, MedIm
61
12
0
28 Jan 2021
LESA: Linguistic Encapsulation and Semantic Amalgamation Based Generalised Claim Detection from Online Content
Shreya Gupta
Parantak Singh
Megha Sundriyal
Md. Shad Akhtar
Tanmoy Chakraborty
162
27
0
28 Jan 2021
Explaining Natural Language Processing Classifiers with Occlusion and Language Modeling
David Harbecke
AAML
57
2
0
28 Jan 2021
Does Typological Blinding Impede Cross-Lingual Sharing?
Johannes Bjerva
Isabelle Augenstein
83
17
0
28 Jan 2021
POD-DL-ROM: enhancing deep learning-based reduced order models for nonlinear parametrized PDEs by proper orthogonal decomposition
S. Fresca
Andrea Manzoni
AI4CE
74
220
0
28 Jan 2021
DRAG: Director-Generator Language Modelling Framework for Non-Parallel Author Stylized Rewriting
Hrituraj Singh
Gaurav Verma
Aparna Garimella
Balaji Vasan Srinivasan
DiffM
41
6
0
28 Jan 2021
Weakly Supervised Neuro-Symbolic Module Networks for Numerical Reasoning
Amrita Saha
Shafiq Joty
Guosheng Lin
NAI, AIMat, LRM
59
20
0
28 Jan 2021
BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation
Jwala Dhamala
Tony Sun
Varun Kumar
Satyapriya Krishna
Yada Pruksachatkun
Kai-Wei Chang
Rahul Gupta
113
403
0
27 Jan 2021
Knowledge-driven Natural Language Understanding of English Text and its Applications
Kinjal Basu
S. Varanasi
Farhad Shakerin
Joaquín Arias
G. Gupta
59
26
0
27 Jan 2021
CNN with large memory layers
R. Karimov
Yury Malkov
Karim Iskakov
Victor Lempitsky
58
0
0
27 Jan 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Nayeon Lee
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
417
999
0
27 Jan 2021
Scheduled Sampling in Vision-Language Pretraining with Decoupled Encoder-Decoder Network
Yehao Li
Yingwei Pan
Ting Yao
Jingwen Chen
Tao Mei
VLM
95
53
0
27 Jan 2021