ResearchTrend.AI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,688 papers shown
Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment
Zewen Chi
Li Dong
Bo Zheng
Shaohan Huang
Xian-Ling Mao
Heyan Huang
Furu Wei
121
70
0
11 Jun 2021
Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution
Fanchao Qi
Yuan Yao
Sophia Xu
Zhiyuan Liu
Maosong Sun
SILM
77
132
0
11 Jun 2021
Dynamic Language Models for Continuously Evolving Content
Spurthi Amba Hombaiah
Tao Chen
Mingyang Zhang
Michael Bendersky
Marc Najork
CLL, KELM
103
38
0
11 Jun 2021
Going Beyond Linear Transformers with Recurrent Fast Weight Programmers
Kazuki Irie
Imanol Schlag
Róbert Csordás
Jürgen Schmidhuber
118
64
0
11 Jun 2021
EPICURE Ensemble Pretrained Models for Extracting Cancer Mutations from Literature
Jiarun Cao
E. M. Veen
Niels Peek
A. Renehan
Sophia Ananiadou
15
11
0
11 Jun 2021
AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation
Mingxiang Chen
Z. Chang
Hao Lu
Bitao Yang
Zhuang Li
Liufang Guo
Zhecheng Wang
SSL
27
10
0
11 Jun 2021
FedNLP: An interpretable NLP System to Decode Federal Reserve Communications
Jean Lee
Hoyoul Luis Youn
Nicholas Stevens
Josiah Poon
S. Han
55
10
0
11 Jun 2021
Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-based Multi-modal Context Modeling
Jingbei Li
Yi Meng
Chenyi Li
Zhiyong Wu
Helen Meng
Chao Weng
Jane Polak Scowcroft
93
24
0
11 Jun 2021
Sprachsynthese -- State-of-the-Art in englischer und deutscher Sprache
René Peinl
50
0
0
11 Jun 2021
CONDA: a CONtextual Dual-Annotated dataset for in-game toxicity understanding and detection
H. Weld
Guanghao Huang
Jean Lee
Tongshu Zhang
Kunze Wang
Xinghong Guo
Siqu Long
Josiah Poon
S. Han
97
18
0
11 Jun 2021
MlTr: Multi-label Classification with Transformer
Xingyi Cheng
Hezheng Lin
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Nian Shi
Honglin Liu
ViT
53
50
0
11 Jun 2021
BoB: BERT Over BERT for Training Persona-based Dialogue Models from Limited Personalized Data
Haoyu Song
Yan Wang
Kaiyan Zhang
Weinan Zhang
Ting Liu
65
123
0
11 Jun 2021
Assessing Political Prudence of Open-domain Chatbots
Yejin Bang
Nayeon Lee
Etsuko Ishii
Andrea Madotto
Pascale Fung
70
25
0
11 Jun 2021
Global Neighbor Sampling for Mixed CPU-GPU Training on Giant Graphs
Jialin Dong
Da Zheng
Lin F. Yang
George Karypis
GNN
54
37
0
11 Jun 2021
Monotonic Neural Network: combining Deep Learning with Domain Knowledge for Chiller Plants Energy Optimization
Fanhe Ma
Faen Zhang
Shenglan Ben
Shuxin Qin
Pengcheng Zhou
Changsheng Zhou
Fengyi Xu
62
0
0
11 Jun 2021
DORO: Distributional and Outlier Robust Optimization
Runtian Zhai
Chen Dan
J. Zico Kolter
Pradeep Ravikumar
55
62
0
11 Jun 2021
A comprehensive solution to retrieval-based chatbot construction
Kristen Moore
Shenjun Zhong
Zhen He
Torsten Rudolf
Nils Fisher
Brandon Victor
Neha Jindal
26
13
0
11 Jun 2021
ChemRL-GEM: Geometry Enhanced Molecular Representation Learning for Property Prediction
Xiaomin Fang
Lihang Liu
Jieqiong Lei
Donglong He
Shanzhuo Zhang
Jingbo Zhou
Fan Wang
Hua Wu
Haifeng Wang
AI4CE
91
462
0
11 Jun 2021
Bridging Subword Gaps in Pretrain-Finetune Paradigm for Natural Language Generation
Xin Liu
Baosong Yang
Dayiheng Liu
Haibo Zhang
Weihua Luo
Min Zhang
Haiying Zhang
Jinsong Su
63
18
0
11 Jun 2021
Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models
Matthew Finlayson
Aaron Mueller
Sebastian Gehrmann
Stuart M. Shieber
Tal Linzen
Yonatan Belinkov
141
110
0
10 Jun 2021
One Sense per Translation
B. Hauer
Grzegorz Kondrak
73
1
0
10 Jun 2021
Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning
Liangqiong Qu
Yuyin Zhou
Paul Pu Liang
Yingda Xia
Feifei Wang
Ehsan Adeli
L. Fei-Fei
D. Rubin
FedML, AI4CE
114
186
0
10 Jun 2021
Cross-lingual Emotion Detection
Sabit Hassan
Shaden Shaar
Kareem Darwish
53
12
0
10 Jun 2021
CodemixedNLP: An Extensible and Open NLP Toolkit for Code-Mixing
Sai Muralidhar Jayanthi
Kavya Nerella
Khyathi Chandu
A. Black
MoE
73
8
0
10 Jun 2021
Space-time Mixing Attention for Video Transformer
Adrian Bulat
Juan-Manuel Perez-Rua
Swathikiran Sudhakaran
Brais Martínez
Georgios Tzimiropoulos
ViT
95
127
0
10 Jun 2021
Scaling Vision with Sparse Mixture of Experts
C. Riquelme
J. Puigcerver
Basil Mustafa
Maxim Neumann
Rodolphe Jenatton
André Susano Pinto
Daniel Keysers
N. Houlsby
MoE
154
613
0
10 Jun 2021
Synthesizing Adversarial Negative Responses for Robust Response Ranking and Evaluation
Prakhar Gupta
Yulia Tsvetkov
Jeffrey P. Bigham
86
23
0
10 Jun 2021
A Semi-supervised Multi-task Learning Approach to Classify Customer Contact Intents
Li Dong
Matthew C. Spencer
Amir Biagi
54
3
0
10 Jun 2021
A Template-guided Hybrid Pointer Network for Knowledge-based Task-oriented Dialogue Systems
Dingmin Wang
Ziyao Chen
Wanwei He
Li Zhong
Yunzhe Tao
Min Yang
92
11
0
10 Jun 2021
Neural Text Classification and Stacked Heterogeneous Embeddings for Named Entity Recognition in SMM4H 2021
Usama Yaseen
Stefan Langer
41
13
0
10 Jun 2021
GroupBERT: Enhanced Transformer Architecture with Efficient Grouped Structures
Ivan Chelombiev
Daniel Justus
Douglas Orr
A. Dietrich
Frithjof Gressmann
A. Koliousis
Carlo Luschi
60
5
0
10 Jun 2021
CAT: Cross Attention in Vision Transformer
Hezheng Lin
Xingyi Cheng
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Qing Song
Wei Yuan
ViT
67
158
0
10 Jun 2021
Programming Puzzles
Tal Schuster
Ashwin Kalyan
Oleksandr Polozov
Adam Tauman Kalai
ELM
110
34
0
10 Jun 2021
Linguistically Informed Masking for Representation Learning in the Patent Domain
Sophia Althammer
Mark Buckley
Sebastian Hofstatter
Allan Hanbury
59
11
0
10 Jun 2021
SemEval-2021 Task 11: NLPContributionGraph -- Structuring Scholarly NLP Contributions for a Research Knowledge Graph
Jennifer D'Souza
Sören Auer
Ted Pedersen
94
32
0
10 Jun 2021
FEVEROUS: Fact Extraction and VERification Over Unstructured and Structured information
Rami Aly
Zhijiang Guo
Michael Schlichtkrull
James Thorne
Andreas Vlachos
Christos Christodoulopoulos
O. Cocarascu
Arpit Mittal
HILM
135
188
0
10 Jun 2021
AI-enabled Automation for Completeness Checking of Privacy Policies
Orlando Amaral
Sallam Abualhaija
Damiano Torre
M. Sabetzadeh
Lionel C. Briand
57
40
0
10 Jun 2021
Ruddit: Norms of Offensiveness for English Reddit Comments
Rishav Hada
S. Sudhir
Pushkar Mishra
H. Yannakoudakis
Saif M. Mohammad
Ekaterina Shutova
137
37
0
10 Jun 2021
MST: Masked Self-Supervised Transformer for Visual Representation
Zhaowen Li
Zhiyang Chen
Fan Yang
Wei Li
Yousong Zhu
...
Rui Deng
Liwei Wu
Rui Zhao
Ming Tang
Jinqiao Wang
ViT
102
168
0
10 Jun 2021
MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training
Mingliang Zeng
Xu Tan
Rui Wang
Zeqian Ju
Tao Qin
Tie-Yan Liu
70
136
0
10 Jun 2021
VT-SSum: A Benchmark Dataset for Video Transcript Segmentation and Summarization
Tengchao Lv
Lei Cui
M. Vasilijevic
Furu Wei
63
7
0
10 Jun 2021
Supervising the Transfer of Reasoning Patterns in VQA
Corentin Kervadec
Christian Wolf
G. Antipov
M. Baccouche
Madiha Nadri Wolf
79
11
0
10 Jun 2021
AUGNLG: Few-shot Natural Language Generation using Self-trained Data Augmentation
Xinnuo Xu
Guoyin Wang
Young-Bum Kim
Sungjin Lee
70
32
0
10 Jun 2021
Shades of BLEU, Flavours of Success: The Case of MultiWOZ
Tomás Nekvinda
Ondrej Dusek
79
59
0
10 Jun 2021
How Robust are Model Rankings: A Leaderboard Customization Approach for Equitable Evaluation
Swaroop Mishra
Anjana Arunkumar
89
26
0
10 Jun 2021
Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models
Tyler A. Chang
Yifan Xu
Weijian Xu
Zhuowen Tu
ViT
59
15
0
10 Jun 2021
Semantic-aware Binary Code Representation with BERT
Hyungjoon Koo
Soyeon Park
Daejin Choi
Taesoo Kim
64
24
0
10 Jun 2021
Variational Information Bottleneck for Effective Low-Resource Fine-Tuning
Rabeeh Karimi Mahabadi
Yonatan Belinkov
James Henderson
DRL
76
76
0
10 Jun 2021
Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses
Richard Antonello
Javier S. Turek
Vy A. Vo
Alexander G. Huth
83
43
0
09 Jun 2021
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
Mandela Patrick
Dylan Campbell
Yuki M. Asano
Ishan Misra
Florian Metze
Christoph Feichtenhofer
Andrea Vedaldi
João F. Henriques
114
282
0
09 Jun 2021