ResearchTrend.AI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,639 papers shown
Title
Typing Errors in Factual Knowledge Graphs: Severity and Possible Ways Out
Peiran Yao
Denilson Barbosa
145
6
0
03 Feb 2021
Confusion2vec 2.0: Enriching Ambiguous Spoken Language Representations with Subwords
Prashanth Gurunath Shivakumar
P. Georgiou
Shrikanth Narayanan
31
1
0
03 Feb 2021
DiSCoL: Toward Engaging Dialogue Systems through Conversational Line Guided Response Generation
Sarik Ghazarian
Zixi Liu
Tuhin Chakrabarty
Xuezhe Ma
Aram Galstyan
Nanyun Peng
54
11
0
03 Feb 2021
Learning to Select External Knowledge with Multi-Scale Negative Sampling
H. He
Hua Lu
Siqi Bao
Fan Wang
Hua Wu
Zhengyu Niu
Haifeng Wang
65
32
0
03 Feb 2021
Focusing Knowledge-based Graph Argument Mining via Topic Modeling
Patricia B. Abels
Zahra Ahmadi
Sophie Burkhardt
Benjamin Schiller
Iryna Gurevych
Stefan Kramer
119
6
0
03 Feb 2021
Top-down Discourse Parsing via Sequence Labelling
Fajri Koto
Jey Han Lau
Timothy Baldwin
74
28
0
03 Feb 2021
Trusted Multi-View Classification
Zongbo Han
Changqing Zhang
Huazhu Fu
Qiufeng Wang
EDL
86
174
0
03 Feb 2021
General-Purpose Speech Representation Learning through a Self-Supervised Multi-Granularity Framework
Yucheng Zhao
Dacheng Yin
Chong Luo
Zhiyuan Zhao
Chuanxin Tang
Wenjun Zeng
Zhengjun Zha
SSL
59
6
0
03 Feb 2021
HeBERT & HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition
Avihay Chriqui
I. Yahav
78
37
0
03 Feb 2021
Relaxed Transformer Decoders for Direct Action Proposal Generation
Jing Tan
Jiaqi Tang
Limin Wang
Gangshan Wu
ViT
156
182
0
03 Feb 2021
Memorization vs. Generalization: Quantifying Data Leakage in NLP Performance Evaluation
Aparna Elangovan
Jiayuan He
Karin Verspoor
TDI, FedML
218
95
0
03 Feb 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann
Tosin Adewumi
Karmanya Aggarwal
Pawan Sasanka Ammanamanchi
Aremu Anuoluwapo
...
Nishant Subramani
Wei Xu
Diyi Yang
Akhila Yerukola
Jiawei Zhou
VLM
325
285
0
02 Feb 2021
Clickbait Headline Detection in Indonesian News Sites using Multilingual Bidirectional Encoder Representations from Transformers (M-BERT)
M. N. Fakhruzzaman
S. Z. Jannah
R. A. Ningrum
Indah Fahmiyah
18
13
0
02 Feb 2021
MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers
Krishna Pillutla
Swabha Swayamdipta
Rowan Zellers
John Thickstun
Sean Welleck
Yejin Choi
Zaïd Harchaoui
163
364
0
02 Feb 2021
AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning
Yuhan Liu
Saurabh Agarwal
Shivaram Venkataraman
OffRL
80
56
0
02 Feb 2021
Neural Data Augmentation via Example Extrapolation
Kenton Lee
Kelvin Guu
Luheng He
Timothy Dozat
Hyung Won Chung
78
72
0
02 Feb 2021
Scaling Laws for Transfer
Danny Hernandez
Jared Kaplan
T. Henighan
Sam McCandlish
100
251
0
02 Feb 2021
MultiTalk: A Highly-Branching Dialog Testbed for Diverse Conversations
Yao Dou
Maxwell Forbes
Ari Holtzman
Yejin Choi
63
8
0
02 Feb 2021
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation
Yuan Gong
Yu-An Chung
James R. Glass
VLM
202
147
0
02 Feb 2021
Generative Spoken Language Modeling from Raw Audio
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
310
366
0
01 Feb 2021
Improving Distantly-Supervised Relation Extraction through BERT-based Label & Instance Embeddings
Despina Christou
Grigorios Tsoumakas
80
39
0
01 Feb 2021
SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification
Sai Muralidhar Jayanthi
Akshat Gupta
VLM
63
31
0
01 Feb 2021
Measuring and Improving Consistency in Pretrained Language Models
Yanai Elazar
Nora Kassner
Shauli Ravfogel
Abhilasha Ravichander
Eduard H. Hovy
Hinrich Schütze
Yoav Goldberg
HILM
350
371
0
01 Feb 2021
Revisiting the Prepositional-Phrase Attachment Problem Using Explicit Commonsense Knowledge
Yida Xin
H. Lieberman
Peter Chin
KELM
61
1
0
01 Feb 2021
Text-to-hashtag Generation using Seq2seq Learning
A. Camargo
Wesley Carvalho
Felipe Peressim
Alan Barzilay
Marcelo Finger
53
1
0
01 Feb 2021
Civil Rephrases Of Toxic Texts With Self-Supervised Transformers
Leo Laugier
John Pavlopoulos
Jeffrey Scott Sorensen
Lucas Dixon
101
48
0
01 Feb 2021
Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained Language Models
Nora Kassner
Philipp Dufter
Hinrich Schütze
105
141
0
01 Feb 2021
Automatic Expansion of Domain-Specific Affective Models for Web Intelligence Applications
A. Weichselbraun
Jakob Steixner
Adrian Brasoveanu
A. Scharl
Max C. Göbel
L. Nixon
58
11
0
01 Feb 2021
Many Hands Make Light Work: Using Essay Traits to Automatically Score Essays
Rahul Kumar
Sandeep Albert Mathias
S. Saha
P. Bhattacharyya
75
30
0
01 Feb 2021
Video Transformer Network
Daniel Neimark
Omri Bar
Maya Zohar
Dotan Asselmann
ViT
313
434
0
01 Feb 2021
Hierarchical Ranking for Answer Selection
Hang Gao
Mengting Hu
Renhong Cheng
Tiegang Gao
30
1
0
01 Feb 2021
Short Text Clustering with Transformers
Leonid Pugachev
Andrey Kravchenko
VLM
40
11
0
31 Jan 2021
Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers
Lisa Anne Hendricks
John F. J. Mellor
R. Schneider
Jean-Baptiste Alayrac
Aida Nematzadeh
150
117
0
31 Jan 2021
A Runtime-Based Computational Performance Predictor for Deep Neural Network Training
Geoffrey X. Yu
Yubo Gao
P. Golikov
Gennady Pekhimenko
3DH
69
68
0
31 Jan 2021
TruthBot: An Automated Conversational Tool for Intent Learning, Curated Information Presenting, and Fake News Alerting
Ankur Gupta
Yash Varun
Prarthana Das
Nithya Muttineni
Parth Srivastava
Hamim Zafar
Tanmoy Chakraborty
Swaprava Nath
39
7
0
31 Jan 2021
Extending Neural Keyword Extraction with TF-IDF tagset matching
Boshko Koloski
Senja Pollak
Blaž Škrlj
Matej Martinc
33
10
0
31 Jan 2021
Adversarial Contrastive Pre-training for Protein Sequences
Matthew B. A. McDermott
Brendan Yap
Harry Hsu
Di Jin
Peter Szolovits
AAML
92
10
0
31 Jan 2021
Introduction of a novel word embedding approach based on technology labels extracted from patent data
M. Standke
Abdullah Kiwan
Annalena Lange
Silvan Berg
30
0
0
31 Jan 2021
An Empirical Study on the Generalization Power of Neural Representations Learned via Visual Guessing Games
Alessandro Suglia
Yonatan Bisk
Ioannis Konstas
Antonio Vergari
E. Bastianelli
Andrea Vanzo
Oliver Lemon
40
8
0
31 Jan 2021
The distance between the weights of the neural network is meaningful
Liqun Yang
Yijun Yang
Yao Wang
Zhenyu Yang
Wei Zeng
46
0
0
31 Jan 2021
Classification Models for Partially Ordered Sequences
Stephanie Ger
Diego Klabjan
J. Utke
31
0
0
31 Jan 2021
Speech Recognition by Simply Fine-tuning BERT
Wen-Chin Huang
Chia-Hua Wu
Shang-Bao Luo
Kuan-Yu Chen
Hsin-Min Wang
Tomoki Toda
126
28
0
30 Jan 2021
EmpathBERT: A BERT-based Framework for Demographic-aware Empathy Prediction
Bhanu Prakash Reddy Guda
Aparna Garimella
Niyati Chhaya
71
34
0
30 Jan 2021
Adversarially learning disentangled speech representations for robust multi-factor voice conversion
Jie Wang
Jingbei Li
Xintao Zhao
Zhiyong Wu
Shiyin Kang
Helen Meng
DRL
129
29
0
30 Jan 2021
Can We Automate Scientific Reviewing?
Weizhe Yuan
Pengfei Liu
Graham Neubig
168
90
0
30 Jan 2021
NLPBK at VLSP-2020 shared task: Compose transformer pretrained models for Reliable Intelligence Identification on Social network
Thanh C. Nguyen
V. Nguyen
ViT
60
4
0
29 Jan 2021
CD2CR: Co-reference Resolution Across Documents and Domains
James Ravenscroft
Arie Cattan
A. Clare
Ido Dagan
Maria Liakata
139
8
0
29 Jan 2021
Does injecting linguistic structure into language models lead to better alignment with brain recordings?
Mostafa Abdou
Ana Valeria González
Mariya Toneva
Daniel Hershcovich
Anders Søgaard
72
16
0
29 Jan 2021
Peeler: Profiling Kernel-Level Events to Detect Ransomware
Muhammad Ejaz Ahmed
Hyoungshick Kim
S. Çamtepe
Surya Nepal
36
28
0
29 Jan 2021
Combining pre-trained language models and structured knowledge
Pedro Colon-Hernandez
Catherine Havasi
Jason B. Alonso
Matthew Huggins
C. Breazeal
KELM
93
48
0
28 Jan 2021