ResearchTrend.AI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM
    SSL
    SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 19,767 papers shown
Align, Mask and Select: A Simple Method for Incorporating Commonsense Knowledge into Language Representation Models
Zhiquan Ye
Qian Chen
Wen Wang
Zhenhua Ling
27
68
0
19 Aug 2019
PrivFT: Private and Fast Text Classification with Homomorphic Encryption
Ahmad Al Badawi
Louie Hoang
Chan Fook Mun
Kim Laine
Khin Mi Mi Aung
32
80
0
19 Aug 2019
RefNet: A Reference-aware Network for Background Based Conversation
Chuan Meng
Pengjie Ren
Zhumin Chen
Christof Monz
Jun Ma
Maarten de Rijke
26
61
0
18 Aug 2019
TDAM: a Topic-Dependent Attention Model for Sentiment Analysis
Gabriele Pergola
Lin Gui
Yulan He
38
57
0
18 Aug 2019
A Fast and Accurate One-Stage Approach to Visual Grounding
Zhengyuan Yang
Boqing Gong
Liwei Wang
Wenbing Huang
Dong Yu
Jiebo Luo
ObjD
24
361
0
18 Aug 2019
Build it Break it Fix it for Dialogue Safety: Robustness from Adversarial Human Attack
Emily Dinan
Samuel Humeau
Bharath Chintagunta
Jason Weston
29
244
0
17 Aug 2019
EmotionX-IDEA: Emotion BERT -- an Affectional Model for Conversation
Yen-Hao Huang
Ssu-Rui Lee
Mau-Yun Ma
Yi-Hsin Chen
Ya-Wen Yu
Yi-Shin Chen
11
56
0
17 Aug 2019
A Symbolic Neural Network Representation and its Application to Understanding, Verifying, and Patching Networks
Matthew Sotoudeh
Aditya V. Thakur
22
4
0
17 Aug 2019
CLUTRR: A Diagnostic Benchmark for Inductive Reasoning from Text
Koustuv Sinha
Shagun Sodhani
Jin Dong
Joelle Pineau
William L. Hamilton
35
203
0
16 Aug 2019
Shallow Domain Adaptive Embeddings for Sentiment Analysis
P. Sarma
Yingyu Liang
W. Sethares
AI4CE
19
8
0
16 Aug 2019
Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training
Gen Li
Nan Duan
Yuejian Fang
Ming Gong
Daxin Jiang
Ming Zhou
SSL
VLM
MLLM
96
895
0
16 Aug 2019
Simplify the Usage of Lexicon in Chinese NER
Ruotian Ma
Minlong Peng
Qi Zhang
Xuanjing Huang
13
260
0
16 Aug 2019
Attending to Future Tokens For Bidirectional Sequence Generation
Carolin (Haas) Lawrence
Bhushan Kotnis
Mathias Niepert
30
35
0
16 Aug 2019
BERT-Based Multi-Head Selection for Joint Entity-Relation Extraction
Weipéng Huáng
Xingyi Cheng
Taifeng Wang
Wei Chu
27
30
0
16 Aug 2019
Few-Shot Dialogue Generation Without Annotated Data: A Transfer Learning Approach
Igor Shalyminov
Sungjin Lee
Arash Eshghi
Oliver Lemon
OffRL
13
23
0
16 Aug 2019
Reasoning Over Paragraph Effects in Situations
Kevin Lin
Oyvind Tafjord
Peter Clark
Matt Gardner
36
115
0
16 Aug 2019
A deep-learning-based surrogate model for data assimilation in dynamic subsurface flow problems
Meng Tang
Yimin Liu
L. Durlofsky
AI4CE
37
257
0
16 Aug 2019
Integrating Multimodal Information in Large Pretrained Transformers
Wasifur Rahman
M. Hasan
Sangwu Lee
Amir Zadeh
Chengfeng Mao
Louis-Philippe Morency
Ehsan Hoque
19
29
0
15 Aug 2019
Abductive Commonsense Reasoning
Chandra Bhagavatula
Ronan Le Bras
Chaitanya Malaviya
Keisuke Sakaguchi
Ari Holtzman
Hannah Rashkin
Doug Downey
Scott Yih
Yejin Choi
ReLM
LRM
25
454
0
15 Aug 2019
SenseBERT: Driving Some Sense into BERT
Yoav Levine
Barak Lenz
Or Dagan
Ori Ram
Dan Padnos
Or Sharir
Shai Shalev-Shwartz
Amnon Shashua
Y. Shoham
SSL
27
186
0
15 Aug 2019
Visualizing and Understanding the Effectiveness of BERT
Y. Hao
Li Dong
Furu Wei
Ke Xu
31
183
0
15 Aug 2019
A Multi-Turn Emotionally Engaging Dialog Model
Yubo Xie
Ekaterina Svikhnushina
P. Pu
26
15
0
15 Aug 2019
A Multi-Type Multi-Span Network for Reading Comprehension that Requires Discrete Reasoning
Minghao Hu
Yuxing Peng
Zhen Huang
Dongsheng Li
AIMat
LRM
32
91
0
15 Aug 2019
Multi-class Hierarchical Question Classification for Multiple Choice Science Exams
Dongfang Xu
Peter Alexander Jansen
Jaycie Martin
Zhengnan Xie
Vikas Yadav
Harish Tayyar Madabushi
Oyvind Tafjord
Peter Clark
24
23
0
15 Aug 2019
Temporal Collaborative Ranking Via Personalized Transformer
Liwei Wu
Shuqing Li
Cho-Jui Hsieh
James Sharpnack
AI4TS
29
4
0
15 Aug 2019
Sex Trafficking Detection with Ordinal Regression Neural Networks
Longshaokan Wang
E. Laber
Yeng Saanchi
Sherrie Caltagirone
22
14
0
15 Aug 2019
Feature-Less End-to-End Nested Term Extraction
Yuze Gao
Yu Yuan
31
15
0
15 Aug 2019
Towards Making the Most of BERT in Neural Machine Translation
Jiacheng Yang
Mingxuan Wang
Hao Zhou
Chengqi Zhao
Yong Yu
Weinan Zhang
Lei Li
CLL
28
157
0
15 Aug 2019
Multi-Task Self-Supervised Learning for Disfluency Detection
Shaolei Wang
Wanxiang Che
Qi Liu
Pengda Qin
Ting Liu
William Yang Wang
SSL
27
56
0
15 Aug 2019
Towards Debiasing Fact Verification Models
Tal Schuster
Darsh J. Shah
Yun Jie Serene Yeo
Daniel Filizzola
Enrico Santus
Regina Barzilay
62
209
0
14 Aug 2019
Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding
Oren Barkan
Noam Razin
Itzik Malkiel
Ori Katz
Avi Caciularu
Noam Koenigstein
FedML
49
37
0
14 Aug 2019
SG-Net: Syntax-Guided Machine Reading Comprehension
Zhuosheng Zhang
Yuwei Wu
Junru Zhou
Sufeng Duan
Hai Zhao
Rui Wang
47
187
0
14 Aug 2019
FlowDelta: Modeling Flow Information Gain in Reasoning for Conversational Machine Comprehension
Yi-Ting Yeh
Yun-Nung Chen
34
40
0
14 Aug 2019
Fusion of Detected Objects in Text for Visual Question Answering
Chris Alberti
Jeffrey Ling
Michael Collins
David Reitter
19
173
0
14 Aug 2019
Unsupervised Out-of-Distribution Detection by Maximum Classifier Discrepancy
Qing Yu
Kiyoharu Aizawa
OODD
24
166
0
14 Aug 2019
Entity-aware ELMo: Learning Contextual Entity Representation for Entity Disambiguation
Hamed Shahbazi
Xiaoli Z. Fern
Reza Ghaeini
Rasha Obeidat
Prasad Tadepalli
46
21
0
14 Aug 2019
Reinforcement Learning Based Graph-to-Sequence Model for Natural Question Generation
Yu Chen
Lingfei Wu
Mohammed J Zaki
GNN
24
155
0
14 Aug 2019
HorNet: A Hierarchical Offshoot Recurrent Network for Improving Person Re-ID via Image Captioning
Shiyang Yan
Jun Xu
Yuai Liu
Lin Xu
32
7
0
14 Aug 2019
Fine-grained Information Status Classification Using Discourse Context-Aware Self-Attention
Yufang Hou
27
0
0
13 Aug 2019
StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
Wei Wang
Bin Bi
Ming Yan
Chen Henry Wu
Zuyi Bao
Jiangnan Xia
Liwei Peng
Luo Si
31
260
0
13 Aug 2019
Generative Question Refinement with Deep Reinforcement Learning in Retrieval-based QA System
Ye Liu
Chenwei Zhang
Xiaohui Yan
Yi-Ju Chang
Philip S. Yu
35
19
0
13 Aug 2019
On Identifiability in Transformers
Gino Brunner
Yang Liu
Damian Pascual
Oliver Richter
Massimiliano Ciaramita
Roger Wattenhofer
ViT
30
188
0
12 Aug 2019
Taming Unbalanced Training Workloads in Deep Learning with Partial Collective Operations
Shigang Li
Tal Ben-Nun
Salvatore Di Girolamo
Dan Alistarh
Torsten Hoefler
24
58
0
12 Aug 2019
Multimodal Unified Attention Networks for Vision-and-Language Interactions
Zhou Yu
Yuhao Cui
Jun Yu
Dacheng Tao
Q. Tian
32
38
0
12 Aug 2019
TAPER: Time-Aware Patient EHR Representation
Sajad Darabi
Mohammad Kachuee
Shayan Fazeli
Majid Sarrafzadeh
27
56
0
11 Aug 2019
Exploiting Temporal Relationships in Video Moment Localization with Natural Language
Songyang Zhang
Jinsong Su
Jiebo Luo
19
74
0
11 Aug 2019
Multi-modality Latent Interaction Network for Visual Question Answering
Peng Gao
Haoxuan You
Zhanpeng Zhang
Xiaogang Wang
Hongsheng Li
38
82
0
10 Aug 2019
VisualBERT: A Simple and Performant Baseline for Vision and Language
Liunian Harold Li
Mark Yatskar
Da Yin
Cho-Jui Hsieh
Kai-Wei Chang
VLM
84
1,929
0
09 Aug 2019
BERT-based Ranking for Biomedical Entity Normalization
Zongcheng Ji
Qiang Wei
Hua Xu
OOD
MedIm
24
122
0
09 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
149
3,650
0
06 Aug 2019