BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM SSL SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,639 papers shown
The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations
Peter Hase
Harry Xie
Joey Tianyi Zhou
OODD LRM FAtt
134
91
0
01 Jun 2021
VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups
Zejiang Shen
Kyle Lo
Lucy Lu Wang
Bailey Kuehl
Daniel S. Weld
Doug Downey
VLM
120
36
0
01 Jun 2021
What Can I Do Here? Learning New Skills by Imagining Visual Affordances
Alexander Khazatsky
Ashvin Nair
Dan Jing
Sergey Levine
LM&Ro
84
33
0
01 Jun 2021
You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection
Yuxin Fang
Bencheng Liao
Xinggang Wang
Jiemin Fang
Jiyang Qi
Rui Wu
Jianwei Niu
Wenyu Liu
ViT
80
326
0
01 Jun 2021
SpanNER: Named Entity Re-/Recognition as Span Prediction
Jinlan Fu
Xuanjing Huang
Pengfei Liu
75
101
0
01 Jun 2021
Using Integrated Gradients and Constituency Parse Trees to explain Linguistic Acceptability learnt by BERT
Anmol Nayak
Hariprasad Timmapathini
58
5
0
01 Jun 2021
CIDER: Commonsense Inference for Dialogue Explanation and Reasoning
Deepanway Ghosal
Pengfei Hong
Siqi Shen
Navonil Majumder
Rada Mihalcea
Soujanya Poria
88
23
0
01 Jun 2021
Towards Quantifiable Dialogue Coherence Evaluation
Zheng Ye
Liucun Lu
Lishan Huang
Liang Lin
Xiaodan Liang
76
31
0
01 Jun 2021
THG: Transformer with Hyperbolic Geometry
Zhe Liu
Yibin Xu
ViT
43
1
0
01 Jun 2021
DoT: An efficient Double Transformer for NLP tasks with tables
Syrine Krichene
Thomas Müller
Julian Martin Eisenschlos
73
14
0
01 Jun 2021
KGPool: Dynamic Knowledge Graph Context Selection for Relation Extraction
Abhishek Nadgeri
Anson Bastos
Kuldeep Singh
I. Mulang'
Johannes Hoffart
Saeedeh Shekarpour
V. Saraswat
SLR
53
34
0
01 Jun 2021
Dialogue-oriented Pre-training
Yi Xu
Hai Zhao
80
14
0
01 Jun 2021
Nora: The Well-Being Coach
Genta Indra Winata
Holy Lovenia
Etsuko Ishii
Farhad Bin Siddique
Yongsheng Yang
Pascale Fung
33
3
0
01 Jun 2021
Towards Efficient Cross-Modal Visual Textual Retrieval using Transformer-Encoder Deep Features
Nicola Messina
Giuseppe Amato
Fabrizio Falchi
Claudio Gennaro
Stéphane Marchand-Maillet
39
7
0
01 Jun 2021
Distribution Matching for Rationalization
Yongfeng Huang
Yujun Chen
Yulun Du
Zhilin Yang
OOD
67
18
0
01 Jun 2021
Is it a click bait? Let's predict using Machine Learning
Sohom Ghosh
41
1
0
01 Jun 2021
Preview, Attend and Review: Schema-Aware Curriculum Learning for Multi-Domain Dialog State Tracking
Yinpei Dai
Hangyu Li
Yongbin Li
Jian Sun
Fei Huang
Luo Si
Xiao-Dan Zhu
92
53
0
01 Jun 2021
Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning
Haibin Wu
Xu Li
Andy T. Liu
Zhiyong Wu
Helen Meng
Hung-yi Lee
AAML SSL
118
30
0
01 Jun 2021
Volta at SemEval-2021 Task 9: Statement Verification and Evidence Finding with Tables using TAPAS and Transfer Learning
Devansh Gautam
Kshitij Gupta
Manish Shrivastava
LMTD
56
6
0
01 Jun 2021
Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models
Linjie Li
Jie Lei
Zhe Gan
Jingjing Liu
AAML VLM
116
75
0
01 Jun 2021
Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition
Shining Liang
Ming Gong
J. Pei
Linjun Shou
Wanli Zuo
Xianglin Zuo
Daxin Jiang
98
34
0
01 Jun 2021
Concurrent Adversarial Learning for Large-Batch Training
Yong Liu
Xiangning Chen
Minhao Cheng
Cho-Jui Hsieh
Yang You
ODL
90
13
0
01 Jun 2021
Improving Formality Style Transfer with Context-Aware Rule Injection
Zonghai Yao
Hong-ye Yu
89
17
0
01 Jun 2021
Iterative Hierarchical Attention for Answering Complex Questions over Long Documents
Haitian Sun
William W. Cohen
Ruslan Salakhutdinov
151
13
0
01 Jun 2021
PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World
Rowan Zellers
Ari Holtzman
Matthew E. Peters
Roozbeh Mottaghi
Aniruddha Kembhavi
Ali Farhadi
Yejin Choi
110
69
0
01 Jun 2021
HERALD: An Annotation Efficient Method to Detect User Disengagement in Social Conversations
Weixin Liang
Kai-Hui Liang
Zhou Yu
76
15
0
01 Jun 2021
HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalization
Jiaao Chen
Dinghan Shen
Weizhu Chen
Diyi Yang
BDL
79
48
0
31 May 2021
Corpus-Based Paraphrase Detection Experiments and Review
T. Vrbanec
A. Meštrović
129
31
0
31 May 2021
An Exploratory Analysis of Multilingual Word-Level Quality Estimation with Cross-Lingual Transformers
Tharindu Ranasinghe
Constantin Orasan
R. Mitkov
75
28
0
31 May 2021
Training ELECTRA Augmented with Multi-word Selection
Jiaming Shen
Jialu Liu
Tianqi Liu
Cong Yu
Jiawei Han
90
9
0
31 May 2021
AppBuddy: Learning to Accomplish Tasks in Mobile Apps via Reinforcement Learning
Maayan Shvo
Zhiming Hu
Rodrigo Toro Icarte
Iqbal Mohomed
A. Jepson
Sheila A. McIlraith
99
14
0
31 May 2021
Language Model Evaluation Beyond Perplexity
Clara Meister
Ryan Cotterell
155
77
0
31 May 2021
How transfer learning impacts linguistic knowledge in deep NLP models?
Nadir Durrani
Hassan Sajjad
Fahim Dalvi
45
51
0
31 May 2021
MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens
Jiemin Fang
Lingxi Xie
Xinggang Wang
Xiaopeng Zhang
Wenyu Liu
Qi Tian
ViT
75
78
0
31 May 2021
Toward Understanding the Feature Learning Process of Self-supervised Contrastive Learning
Zixin Wen
Yuanzhi Li
SSL MLT
98
136
0
31 May 2021
Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model
Jiangning Zhang
Chao Xu
Jian Li
Wenzhou Chen
Yabiao Wang
Ying Tai
Shuo Chen
Chengjie Wang
Feiyue Huang
Yong Liu
108
22
0
31 May 2021
M6-T: Exploring Sparse Expert Models and Beyond
An Yang
Junyang Lin
Rui Men
Chang Zhou
Le Jiang
...
Dingyang Zhang
Wei Lin
Lin Qu
Jingren Zhou
Hongxia Yang
MoE
128
24
0
31 May 2021
Can Attention Enable MLPs To Catch Up With CNNs?
Meng-Hao Guo
Zheng-Ning Liu
Tai-Jiang Mu
Dun Liang
Ralph Robert Martin
Shimin Hu
AAML
77
17
0
31 May 2021
Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition
Yulin Wang
Rui Huang
S. Song
Zeyi Huang
Gao Huang
ViT
119
194
0
31 May 2021
Telling Stories through Multi-User Dialogue by Modeling Character Relations
Waiman Si
Prithviraj Ammanabrolu
Mark O. Riedl
58
15
0
31 May 2021
Counterfactual Invariance to Spurious Correlations: Why and How to Pass Stress Tests
Victor Veitch
Alexander D'Amour
Steve Yadlowsky
Jacob Eisenstein
OOD
91
94
0
31 May 2021
Choose a Transformer: Fourier or Galerkin
Shuhao Cao
94
256
0
31 May 2021
Retweet communities reveal the main sources of hate speech
Bojan Evkoski
Andraz Pelicon
I. Mozetič
Nikola Ljubesic
Petra Kralj Novak
38
20
0
31 May 2021
Connecting Language and Vision for Natural Language-Based Vehicle Retrieval
Shuai Bai
Zhedong Zheng
Xiaohan Wang
Junyang Lin
Zhu Zhang
Chang Zhou
Yi Yang
Hongxia Yang
103
27
0
31 May 2021
SemEval-2021 Task 4: Reading Comprehension of Abstract Meaning
Boyuan Zheng
Xiaoyu Yang
Yu-Ping Ruan
Zhen-Hua Ling
Quan Liu
Si Wei
Xiao-Dan Zhu
ELM
48
13
0
31 May 2021
Q-attention: Enabling Efficient Learning for Vision-based Robotic Manipulation
Stephen James
Andrew J. Davison
96
129
0
31 May 2021
Effective Batching for Recurrent Neural Network Grammars
Hiroshi Noji
Yohei Oseki
GNN
81
17
0
31 May 2021
Supporting Cognitive and Emotional Empathic Writing of Students
Thiemo Wambsganss
C. Niklaus
Matthias Söllner
Siegfried Handschuh
J. Leimeister
75
27
0
31 May 2021
Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models
Chong Li
Ce Zhang
Xiaoqing Zheng
Xuanjing Huang
69
29
0
31 May 2021
Transfer Learning for Sequence Generation: from Single-source to Multi-source
Xuancheng Huang
Jingfang Xu
Maosong Sun
Yang Liu
61
5
0
31 May 2021