ResearchTrend.AI

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,653 papers shown
Automated essay scoring using efficient transformer-based language models
C. Ormerod
Akanksha Malhotra
Amir Jafari
51
31
0
25 Feb 2021
Investigating the Limitations of Transformers with Simple Arithmetic Tasks
Rodrigo Nogueira
Zhiying Jiang
Jimmy J. Li
LRM
133
130
0
25 Feb 2021
Are pre-trained text representations useful for multilingual and multi-dimensional language proficiency modeling?
Taraka Rama
Sowmya Vajjala
38
6
0
25 Feb 2021
Self-Tuning for Data-Efficient Deep Learning
Ximei Wang
Jing Gao
Mingsheng Long
Jianmin Wang
BDL
91
71
0
25 Feb 2021
SparseBERT: Rethinking the Importance Analysis in Self-attention
Han Shi
Jiahui Gao
Xiaozhe Ren
Hang Xu
Xiaodan Liang
Zhenguo Li
James T. Kwok
101
54
0
25 Feb 2021
Spanish Biomedical and Clinical Language Embeddings
Asier Gutiérrez-Fandiño
Jordi Armengol-Estapé
C. Carrino
Ona de Gibert
Aitor Gonzalez-Agirre
Marta Villegas
41
5
0
25 Feb 2021
MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Nobukatsu Hojo
73
60
0
25 Feb 2021
ZJUKLAB at SemEval-2021 Task 4: Negative Augmentation with Language Model for Reading Comprehension of Abstract Meaning
Xin Xie
Xiangnan Chen
Xiang Chen
Yong Wang
Ningyu Zhang
Shumin Deng
Huajun Chen
68
2
0
25 Feb 2021
LazyFormer: Self Attention with Lazy Update
Chengxuan Ying
Guolin Ke
Di He
Tie-Yan Liu
81
16
0
25 Feb 2021
Sentiment Analysis of Persian-English Code-mixed Texts
Nazanin Sabri
Ali Edalat
B. Bahrak
64
22
0
25 Feb 2021
LET: Linguistic Knowledge Enhanced Graph Transformer for Chinese Short Text Matching
Boer Lyu
Lu Chen
Su Zhu
Kai Yu
111
49
0
25 Feb 2021
Automatic Story Generation: Challenges and Attempts
Amal Alabdulkarim
Siyan Li
Xiangyu Peng
68
51
0
25 Feb 2021
How to represent part-whole hierarchies in a neural network
Geoffrey E. Hinton
OCL, MoE
104
205
0
25 Feb 2021
Directional Bias Amplification
Angelina Wang
Olga Russakovsky
79
70
0
24 Feb 2021
The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 Cough, COVID-19 Speech, Escalation & Primates
Björn W. Schuller
A. Batliner
Christian Bergler
Cecilia Mascolo
Jing Han
...
Pietro Cicuta
L. Rothkrantz
J. Zwerts
Jelle Treep
Casper S. Kaandorp
106
113
0
24 Feb 2021
RoBERTa-wwm-ext Fine-Tuning for Chinese Text Classification
Zhuo Xu
48
16
0
24 Feb 2021
Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov
337
458
0
24 Feb 2021
Task-Specific Pre-Training and Cross Lingual Transfer for Code-Switched Data
Akshat Gupta
Sai Krishna Rallabandi
A. Black
72
13
0
24 Feb 2021
Beyond Fine-Tuning: Transferring Behavior in Reinforcement Learning
Victor Campos
Pablo Sprechmann
Steven Hansen
André Barreto
Steven Kapturowski
Alex Vitvitskyi
Adria Puigdomenech Badia
Charles Blundell
OffRL, OnRL
85
26
0
24 Feb 2021
Pre-Training on Dynamic Graph Neural Networks
Ke-Jia Chen
Jiajun Zhang
Linpu Jiang
Yunyun Wang
Yuxuan Dai
AI4CE
83
15
0
24 Feb 2021
Re-Evaluating GermEval17 Using German Pre-Trained Language Models
Yi Men
A. Corvonato
C. Heumann
VLM
74
6
0
24 Feb 2021
A Framework for Integrating Gesture Generation Models into Interactive Conversational Agents
Rajmund Nagy
Taras Kucherenko
Birger Moell
André Pereira
Hedvig Kjellström
Ulysses Bernardet
119
12
0
24 Feb 2021
Trajectory-Based Meta-Learning for Out-Of-Vocabulary Word Embedding Learning
Gordon Buck
Andreas Vlachos
61
1
0
24 Feb 2021
From Universal Language Model to Downstream Task: Improving RoBERTa-Based Vietnamese Hate Speech Detection
Quang Huu Pham
Viet-Anh Nguyen
Linh Bao Doan
Ngoc N. Tran
Ta Minh Thanh
34
11
0
24 Feb 2021
Augmenting Part-of-speech Tagging with Syntactic Information for Vietnamese and Chinese
Duc-Vu Nguyen
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
58
0
0
24 Feb 2021
OneStop QAMaker: Extract Question-Answer Pairs from Text in a One-Stop Approach
Shaobo Cui
Xintong Bao
Xinxing Zu
Yangyang Guo
Zhongzhou Zhao
Ji Zhang
Haiqing Chen
RALM
55
15
0
24 Feb 2021
Hopeful_Men@LT-EDI-EACL2021: Hope Speech Detection Using Indic Transliteration and Transformers
I. S. Upadhyay
E. Nikhil
Anshul Wadhawan
R. Mamidi
54
14
0
24 Feb 2021
Do Transformer Modifications Transfer Across Implementations and Applications?
Sharan Narang
Hyung Won Chung
Yi Tay
W. Fedus
Thibault Févry
...
Wei Li
Nan Ding
Jake Marcus
Adam Roberts
Colin Raffel
112
128
0
23 Feb 2021
Neural ranking models for document retrieval
M. Trabelsi
Zhiyu Zoey Chen
Brian D. Davison
J. Heflin
FedML
85
29
0
23 Feb 2021
Automated Quality Assessment of Cognitive Behavioral Therapy Sessions Through Highly Contextualized Language Representations
Nikolaos Flemotomos
Víctor R. Martínez
Zhuohao Chen
Torrey A. Creed
David C. Atkins
Shrikanth Narayanan
67
31
0
23 Feb 2021
Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models
Harold Ott
Jasmin Bogatinovski
Alexander Acker
S. Nedelkoski
O. Kao
23
31
0
23 Feb 2021
V2W-BERT: A Framework for Effective Hierarchical Multiclass Classification of Software Vulnerabilities
Siddhartha Shankar Das
Edoardo Serra
M. Halappanavar
A. Pothen
E. Al-Shaer
51
50
0
23 Feb 2021
Controllable and Diverse Text Generation in E-commerce
Huajie Shao
Jianing Wang
Haohong Lin
Xuezhou Zhang
Aston Zhang
Heng Ji
Tarek Abdelzaher
103
30
0
23 Feb 2021
Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks
Xinyang Zhang
Chenwei Zhang
Xin Luna Dong
Jingbo Shang
Jiawei Han
65
18
0
23 Feb 2021
VisualCheXbert: Addressing the Discrepancy Between Radiology Report Labels and Image Labels
Saahil Jain
Akshay Smit
Steven QH Truong
C. Nguyen
Minh-Thanh Huynh
Mudit Jain
Victoria A Young
A. Ng
M. Lungren
Pranav Rajpurkar
MedIm
95
31
0
23 Feb 2021
Parallelizing Legendre Memory Unit Training
Narsimha Chilkuri
C. Eliasmith
94
39
0
22 Feb 2021
Automated Evaluation Of Psychotherapy Skills Using Speech And Language Technologies
Nikolaos Flemotomos
Víctor R. Martínez
Zhuohao Chen
Karan Singla
V. Ardulov
...
S. P. Lord
Tad Hirsch
Zac E. Imel
David C. Atkins
Shrikanth Narayanan
75
47
0
22 Feb 2021
Linear Transformers Are Secretly Fast Weight Programmers
Imanol Schlag
Kazuki Irie
Jürgen Schmidhuber
155
252
0
22 Feb 2021
Domain Adaptation in Dialogue Systems using Transfer and Meta-Learning
Rui Ribeiro
A. Abad
J. Lopes
OffRL
35
1
0
22 Feb 2021
Probing Multimodal Embeddings for Linguistic Properties: the Visual-Semantic Case
Adam Dahlgren Lindström
Suna Bensch
Johanna Björklund
F. Drewes
61
20
0
22 Feb 2021
Generating Human Readable Transcript for Automatic Speech Recognition with Pre-trained Language Model
Junwei Liao
Yu Shi
Ming Gong
Linjun Shou
Sefik Emre Eskimez
Liyang Lu
Hong Qu
Michael Zeng
36
9
0
22 Feb 2021
Towards Causal Representation Learning
Bernhard Schölkopf
Francesco Locatello
Stefan Bauer
Nan Rosemary Ke
Nal Kalchbrenner
Anirudh Goyal
Yoshua Bengio
OOD, CML, AI4CE
166
323
0
22 Feb 2021
Position Information in Transformers: An Overview
Philipp Dufter
Martin Schmitt
Hinrich Schütze
107
149
0
22 Feb 2021
LogME: Practical Assessment of Pre-trained Models for Transfer Learning
Kaichao You
Yong Liu
Jianmin Wang
Mingsheng Long
107
190
0
22 Feb 2021
Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines
M. Y. Jaradeh
Kuldeep Singh
M. Stocker
A. Both
Sören Auer
66
7
0
22 Feb 2021
RUBERT: A Bilingual Roman Urdu BERT Using Cross Lingual Transfer Learning
Usama Khalid
M. O. Beg
Muhammad Umair Arshad
64
11
0
22 Feb 2021
Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks
Tingyu Xia
Yue Wang
Yuan Tian
Yi-Ju Chang
65
51
0
22 Feb 2021
Joint Intent Detection And Slot Filling Based on Continual Learning Model
Yanfei Hui
Jianzong Wang
Ning Cheng
Fengying Yu
Tianbo Wu
Jing Xiao
42
15
0
22 Feb 2021
Revisiting Classification Perspective on Scene Text Recognition
Hongxiang Cai
Jun Sun
Yichao Xiong
81
10
0
22 Feb 2021
Subword Pooling Makes a Difference
Judit Ács
Ákos Kádár
András Kornai
55
30
0
22 Feb 2021