ResearchTrend.AI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,641 papers shown
cs60075_team2 at SemEval-2021 Task 1 : Lexical Complexity Prediction using Transformer-based Language Models pre-trained on various text corpora
Abhilash Nandy
Sayantan Adak
Tanurima Halder
Sai Mahesh Pokala
33
6
0
04 Jun 2021
Bi-Granularity Contrastive Learning for Post-Training in Few-Shot Scene
Ruikun Luo
Guanhuan Huang
Xiaojun Quan
CLL
88
10
0
04 Jun 2021
Few-Shot Segmentation via Cycle-Consistent Transformer
Gengwei Zhang
Guoliang Kang
Yi Yang
Yunchao Wei
ViT
107
187
0
04 Jun 2021
AdaTag: Multi-Attribute Value Extraction from Product Profiles with Adaptive Decoding
Jun Yan
Nasser Zalmout
Yan Liang
Christan Earl Grant
Xiang Ren
Xin Luna Dong
65
55
0
04 Jun 2021
Retrieve & Memorize: Dialog Policy Learning with Multi-Action Memory
Yunhao Li
Yunyi Yang
Xiaojun Quan
Jianxing Yu
RALM
79
8
0
04 Jun 2021
AdvPicker: Effectively Leveraging Unlabeled Data via Adversarial Discriminator for Cross-Lingual NER
Weile Chen
Huiqiang Jiang
Qianhui Wu
Börje F. Karlsson
Yingjun Guan
82
35
0
04 Jun 2021
Human-Adversarial Visual Question Answering
Sasha Sheng
Amanpreet Singh
Vedanuj Goswami
Jose Alberto Lopez Magana
Wojciech Galuba
Devi Parikh
Douwe Kiela
OOD, EgoV, AAML
58
63
0
04 Jun 2021
SAND-mask: An Enhanced Gradient Masking Strategy for the Discovery of Invariances in Domain Generalization
Soroosh Shahtalebi
Jean-Christophe Gagnon-Audet
Touraj Laleh
Mojtaba Faramarzi
Kartik Ahuja
Irina Rish
89
61
0
04 Jun 2021
Visual Question Rewriting for Increasing Response Rate
Jiayi Wei
Xilian Li
Yi Zhang
Xin Eric Wang
56
3
0
04 Jun 2021
FedBABU: Towards Enhanced Representation for Federated Image Classification
Jaehoon Oh
Sangmook Kim
Se-Young Yun
FedML
128
204
0
04 Jun 2021
Scalable Transformers for Neural Machine Translation
Peng Gao
Shijie Geng
Ping Luo
Xiaogang Wang
Jifeng Dai
Hongsheng Li
118
13
0
04 Jun 2021
ERNIE-Tiny : A Progressive Distillation Framework for Pretrained Transformer Compression
Weiyue Su
Xuyi Chen
Shi Feng
Jiaxiang Liu
Weixin Liu
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
83
13
0
04 Jun 2021
Language Scaling for Universal Suggested Replies Model
Qianlan Ying
Payal Bajaj
Budhaditya Deb
Yu Yang
Wei Wang
Bojia Lin
Milad Shokouhi
Xia Song
Yang Yang
Daxin Jiang
LRM
58
2
0
04 Jun 2021
Addressing Inquiries about History: An Efficient and Practical Framework for Evaluating Open-domain Chatbot Consistency
Zekang Li
Jinchao Zhang
Zhengcong Fei
Yang Feng
Jie Zhou
58
14
0
04 Jun 2021
Conversations Are Not Flat: Modeling the Dynamic Information Flow across Dialogue Utterances
Zekang Li
Jinchao Zhang
Zhengcong Fei
Yang Feng
Jie Zhou
66
57
0
04 Jun 2021
BERTTune: Fine-Tuning Neural Machine Translation with BERTScore
Inigo Jauregi Unanue
Jacob Parnell
Massimo Piccardi
58
34
0
04 Jun 2021
Self-supervised Dialogue Learning for Spoken Conversational Question Answering
Nuo Chen
Chenyu You
Yuexian Zou
SSL
93
34
0
04 Jun 2021
nmT5 -- Is parallel data still relevant for pre-training massively multilingual language models?
Mihir Kale
Aditya Siddhant
Noah Constant
Melvin Johnson
Rami Al-Rfou
Linting Xue
LRM
70
25
0
03 Jun 2021
Syntax-augmented Multilingual BERT for Cross-lingual Transfer
Wasi Uddin Ahmad
Haoran Li
Kai-Wei Chang
Yashar Mehdad
71
34
0
03 Jun 2021
Language Embeddings for Typology and Cross-lingual Transfer Learning
Dian Yu
Taiqi He
Kenji Sagae
74
12
0
03 Jun 2021
Anticipative Video Transformer
Rohit Girdhar
Kristen Grauman
ViT
96
212
0
03 Jun 2021
A Dataset and Baselines for Multilingual Reply Suggestion
Mozhi Zhang
Wei Wang
Budhaditya Deb
Guoqing Zheng
Milad Shokouhi
Ahmed Hassan Awadallah
LRM
56
8
0
03 Jun 2021
CCPM: A Chinese Classical Poetry Matching Dataset
Wenhao Li
Fanchao Qi
Maosong Sun
Xiaoyuan Yi
Jiarui Zhang
57
11
0
03 Jun 2021
Defending Democracy: Using Deep Learning to Identify and Prevent Misinformation
Anusua Trivedi
Alyssa Suhm
Prathamesh Mahankal
Subhiksha Mukuntharaj
Meghana D. Parab
Malvika Mohan
Meredith Berger
Arathi Sethumadhavan
A. Jaiman
Rahul Dodhia
117
0
0
03 Jun 2021
SOCCER: An Information-Sparse Discourse State Tracking Collection in the Sports Commentary Domain
Ruochen Zhang
Carsten Eickhoff
47
7
0
03 Jun 2021
The Case for Translation-Invariant Self-Attention in Transformer-Based Language Models
Ulme Wennberg
G. Henter
MILM
95
22
0
03 Jun 2021
Representing Syntax and Composition with Geometric Transformations
Lorenzo Bertolini
Julie Weeds
David J. Weir
Qiwei Peng
57
2
0
03 Jun 2021
Defending Against Backdoor Attacks in Natural Language Generation
Xiaofei Sun
Xiaoya Li
Yuxian Meng
Xiang Ao
Leilei Gan
Jiwei Li
Tianwei Zhang
AAML, SILM
103
52
0
03 Jun 2021
E2E-VLP: End-to-End Vision-Language Pre-training Enhanced by Visual Learning
Haiyang Xu
Ming Yan
Chenliang Li
Bin Bi
Songfang Huang
Wenming Xiao
Fei Huang
VLM
126
119
0
03 Jun 2021
TVDIM: Enhancing Image Self-Supervised Pretraining via Noisy Text Data
Pengda Qin
Yuhong Li
Kefeng Deng
Qiang Wu
30
1
0
03 Jun 2021
Template-Based Named Entity Recognition Using BART
Leyang Cui
Yu Wu
Jian Liu
Sen Yang
Yue Zhang
97
356
0
03 Jun 2021
Reordering Examples Helps during Priming-based Few-Shot Learning
Sawan Kumar
Partha P. Talukdar
85
58
0
03 Jun 2021
Auto-tagging of Short Conversational Sentences using Transformer Methods
D. E. Tasar
Şükrü Ozan
Umut Özdil
M. Akca
Oguzhan Ölmez
Semih Gülüm
Seçilay Kutal
Ceren Belhan
29
4
0
03 Jun 2021
Attention mechanisms and deep learning for machine vision: A survey of the state of the art
A. M. Hafiz
S. A. Parah
R. A. Bhat
101
45
0
03 Jun 2021
SIRE: Separate Intra- and Inter-sentential Reasoning for Document-level Relation Extraction
Shuang Zeng
Yuting Wu
Baobao Chang
124
77
0
03 Jun 2021
Improving Event Causality Identification via Self-Supervised Representation Learning on External Causal Statement
Xinyu Zuo
Pengfei Cao
Yubo Chen
Kang Liu
Jun Zhao
Weihua Peng
Yuguang Chen
65
52
0
03 Jun 2021
LearnDA: Learnable Knowledge-Guided Data Augmentation for Event Causality Identification
Xinyu Zuo
Pengfei Cao
Yubo Chen
Kang Liu
Jun Zhao
Weihua Peng
Yuguang Chen
94
53
0
03 Jun 2021
Generate, Prune, Select: A Pipeline for Counterspeech Generation against Online Hate Speech
Wanzheng Zhu
S. Bhat
68
58
0
03 Jun 2021
Few-shot Knowledge Graph-to-Text Generation with Pretrained Language Models
Junyi Li
Tianyi Tang
Wayne Xin Zhao
Zhicheng Wei
N. Yuan
Ji-Rong Wen
79
49
0
03 Jun 2021
Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction
Piji Li
Shuming Shi
AI4TS
93
35
0
03 Jun 2021
Self-Guided Contrastive Learning for BERT Sentence Representations
Taeuk Kim
Kang Min Yoo
Sang-goo Lee
SSL
111
205
0
03 Jun 2021
Automatically Detecting Cyberbullying Comments on Online Game Forums
Hanh Hong-Phuc Vo
H. Tran
Son T. Luu
34
10
0
03 Jun 2021
Data-Driven Design-by-Analogy: State of the Art and Future Directions
Shuo Jiang
Jie Hu
Kristin L. Wood
Jianxi Luo
78
54
0
03 Jun 2021
The Limitations of Limited Context for Constituency Parsing
Yuchen Li
Andrej Risteski
61
7
0
03 Jun 2021
Discriminative Reasoning for Document-level Relation Extraction
Wang Xu
Kehai Chen
Tiejun Zhao
145
62
0
03 Jun 2021
Can Generative Pre-trained Language Models Serve as Knowledge Bases for Closed-book QA?
Cunxiang Wang
Pai Liu
Yue Zhang
RALM
106
84
0
03 Jun 2021
Adjacency List Oriented Relational Fact Extraction via Adaptive Multi-task Learning
Fubang Zhao
Zhuoren Jiang
Yangyang Kang
Changlong Sun
Xiaozhong Liu
33
9
0
03 Jun 2021
Comparing Acoustic-based Approaches for Alzheimer's Disease Detection
Aparna Balagopalan
Jekaterina Novikova
55
43
0
03 Jun 2021
When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations
Xiangning Chen
Cho-Jui Hsieh
Boqing Gong
ViT
117
330
0
03 Jun 2021
MPC-BERT: A Pre-Trained Language Model for Multi-Party Conversation Understanding
Jia-Chen Gu
Chongyang Tao
Zhenhua Ling
Can Xu
Xiubo Geng
Daxin Jiang
77
56
0
03 Jun 2021