ResearchTrend.AI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM SSL SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,885 papers shown
How Does Adversarial Fine-Tuning Benefit BERT?
J. Ebrahimi
Hao Yang
Wei Zhang
AAML
63
4
0
31 Aug 2021
T3-Vis: a visual analytic framework for Training and fine-Tuning Transformers in NLP
Raymond Li
Wen Xiao
Lanjun Wang
Hyeju Jang
Giuseppe Carenini
ViT
90
23
0
31 Aug 2021
AraT5: Text-to-Text Transformers for Arabic Language Generation
El Moatez Billah Nagoudi
AbdelRahim Elmadany
Muhammad Abdul-Mageed
144
126
0
31 Aug 2021
Towards Consistent Document-level Entity Linking: Joint Models for Entity Linking and Coreference Resolution
Klim Zaporojets
Johannes Deleu
Yiwei Jiang
Thomas Demeester
Chris Develder
148
10
0
30 Aug 2021
Want To Reduce Labeling Cost? GPT-3 Can Help
Shuohang Wang
Yang Liu
Yichong Xu
Chenguang Zhu
Michael Zeng
95
257
0
30 Aug 2021
Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback
HongChien Yu
Chenyan Xiong
Jamie Callan
RALM
71
71
0
30 Aug 2021
DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion
Wei Niu
Jiexiong Guan
Yanzhi Wang
G. Agrawal
Bin Ren
AI4CE
85
153
0
30 Aug 2021
N24News: A New Dataset for Multimodal News Classification
Zhen Wang
Xu Shan
Xiangxie Zhang
Jie Yang
VLM
108
38
0
30 Aug 2021
FedKD: Communication Efficient Federated Learning via Knowledge Distillation
Chuhan Wu
Fangzhao Wu
Lingjuan Lyu
Yongfeng Huang
Xing Xie
FedML
113
394
0
30 Aug 2021
AEDA: An Easier Data Augmentation Technique for Text Classification
Akbar Karimi
L. Rossi
Andrea Prati
98
158
0
30 Aug 2021
Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners
Ningyu Zhang
Luoqiu Li
Xiang Chen
Shumin Deng
Zhen Bi
Chuanqi Tan
Fei Huang
Huajun Chen
VLM
158
180
0
30 Aug 2021
CSDS: A Fine-Grained Chinese Dataset for Customer Service Dialogue Summarization
Haitao Lin
Liqun Ma
Junnan Zhu
Lu Xiang
Yu Zhou
Jiajun Zhang
Chengqing Zong
118
47
0
30 Aug 2021
Neuron-level Interpretation of Deep NLP Models: A Survey
Hassan Sajjad
Nadir Durrani
Fahim Dalvi
MILM AI4CE
133
85
0
30 Aug 2021
Factual Consistency Evaluation for Text Summarization via Counterfactual Estimation
Yuexiang Xie
Fei Sun
Yang Deng
Yaliang Li
Bolin Ding
HILM
115
54
0
30 Aug 2021
GeoVectors: A Linked Open Corpus of OpenStreetMap Embeddings on World Scale
Nicolas Tempelmeier
Simon Gottschalk
Elena Demidova
71
15
0
30 Aug 2021
ASR-GLUE: A New Multi-task Benchmark for ASR-Robust Natural Language Understanding
Lingyun Feng
Jianwei Yu
Deng Cai
Songxiang Liu
Haitao Zheng
Yan Wang
ELM
186
14
0
30 Aug 2021
A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP
Yucheng Zhao
Guangting Wang
Chuanxin Tang
Chong Luo
Wenjun Zeng
Zhengjun Zha
93
69
0
30 Aug 2021
LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text Understanding and Generation
Jian Guan
Zhuoer Feng
Yamei Chen
Ru He
Xiaoxi Mao
Changjie Fan
Minlie Huang
120
33
0
30 Aug 2021
Selective Differential Privacy for Language Modeling
Weiyan Shi
Aiqi Cui
Evan Li
R. Jia
Zhou Yu
94
73
0
30 Aug 2021
RetroGAN: A Cyclic Post-Specialization System for Improving Out-of-Knowledge and Rare Word Representations
Pedro Colon-Hernandez
Yida Xin
H. Lieberman
Catherine Havasi
C. Breazeal
Peter Chin
GAN KELM
54
3
0
30 Aug 2021
Fine-Grained Chemical Entity Typing with Multimodal Knowledge Representation
Chenkai Sun
Weijian Li
Jinfeng Xiao
Nikolaus Nova Parulian
ChengXiang Zhai
Heng Ji
88
4
0
29 Aug 2021
A Multimodal Framework for Video Ads Understanding
Zejia Weng
Lingjiang Meng
Rui Wang
Zuxuan Wu
Yu-Gang Jiang
52
1
0
29 Aug 2021
Span Fine-tuning for Pre-trained Language Models
Rongzhou Bao
Zhuosheng Zhang
Hai Zhao
68
2
0
29 Aug 2021
Analyzing and Mitigating Interference in Neural Architecture Search
Jin Xu
Xu Tan
Kaitao Song
Renqian Luo
Yichong Leng
Tao Qin
Tie-Yan Liu
Jian Li
MoMe
99
29
0
29 Aug 2021
Interpretable Propaganda Detection in News Articles
Seunghak Yu
Giovanni Da San Martino
Mitra Mohtarami
James R. Glass
Preslav Nakov
72
31
0
29 Aug 2021
Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution
Zongyi Li
Jianhan Xu
Jiehang Zeng
Linyang Li
Xiaoqing Zheng
Qi Zhang
Kai-Wei Chang
Cho-Jui Hsieh
AAML
57
74
0
29 Aug 2021
Sentence Structure and Word Relationship Modeling for Emphasis Selection
Haoran Yang
Wai Lam
40
0
0
29 Aug 2021
Oh My Mistake!: Toward Realistic Dialogue State Tracking including Turnback Utterances
Takyoung Kim
Yukyung Lee
Hoonsang Yoon
Pilsung Kang
Junseong Bang
Misuk Kim
94
3
0
28 Aug 2021
GroupFormer: Group Activity Recognition with Clustered Spatial-Temporal Transformer
Shuaicheng Li
Qianggang Cao
Lingbo Liu
Kunlin Yang
Shinan Liu
Jun Hou
Shuai Yi
ViT
99
106
0
28 Aug 2021
WALNUT: A Benchmark on Semi-weakly Supervised Learning for Natural Language Understanding
Guoqing Zheng
Giannis Karamanolakis
Kai Shu
Ahmed Hassan Awadallah
SSL
66
1
0
28 Aug 2021
Mitigation of Diachronic Bias in Fake News Detection Dataset
Taichi Murayama
Shoko Wakamiya
Eiji Aramaki
AI4CE
104
13
0
28 Aug 2021
Smoothing Dialogue States for Open Conversational Machine Reading
Zhuosheng Zhang
Siru Ouyang
Hai Zhao
Masao Utiyama
Eiichiro Sumita
86
6
0
28 Aug 2021
Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems
Fei Mi
Wanhao Zhou
Feng Cai
Lingjing Kong
Minlie Huang
Boi Faltings
105
32
0
28 Aug 2021
On the Significance of Question Encoder Sequence Model in the Out-of-Distribution Performance in Visual Question Answering
K. Gouthaman
Anurag Mittal
CML
89
0
0
28 Aug 2021
AMMASurv: Asymmetrical Multi-Modal Attention for Accurate Survival Analysis with Whole Slide Images and Gene Expression Data
Ruoqi Wang
Ziwang Huang
Haitao Wang
Hejun Wu
115
7
0
28 Aug 2021
TweetBLM: A Hate Speech Dataset and Analysis of Black Lives Matter-related Microblogs on Twitter
Sumit Kumar
Raj Ratn Pranesh
65
18
0
27 Aug 2021
Learning Inner-Group Relations on Point Clouds
Haoxi Ran
Wei Zhuo
Jing Liu
Li Lu
3DPC
111
61
0
27 Aug 2021
Code-switched inspired losses for generic spoken dialog representations
E. Chapuis
Pierre Colombo
Matthieu Labeau
Chloé Clavel
179
12
0
27 Aug 2021
Automatic Text Evaluation through the Lens of Wasserstein Barycenters
Pierre Colombo
Guillaume Staerman
Chloé Clavel
Pablo Piantanida
212
41
0
27 Aug 2021
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
355
779
0
27 Aug 2021
DomiKnowS: A Library for Integration of Symbolic Domain Knowledge in Deep Learning
Hossein Rajaby Faghihi
Quan Guo
Andrzej Uszok
Aliakbar Nafar
Elaheh Raisi
Parisa Kordjamshidi
AI4CE
66
18
0
27 Aug 2021
Contrastive Mixup: Self- and Semi-Supervised learning for Tabular Domain
Sajad Darabi
Shayan Fazeli
Ali Pazoki
S. Sankararaman
Majid Sarrafzadeh
SSL
72
29
0
27 Aug 2021
Evaluating the Robustness of Neural Language Models to Input Perturbations
M. Moradi
Matthias Samwald
AAML
103
102
0
27 Aug 2021
Exploring the Capacity of a Large-scale Masked Language Model to Recognize Grammatical Errors
Ryo Nagata
Manabu Kimura
Kazuaki Hanawa
37
5
0
27 Aug 2021
A Partition Filter Network for Joint Entity and Relation Extraction
Zhiheng Yan
Chong Zhang
Jinlan Fu
Qi Zhang
Zhongyu Wei
120
140
0
27 Aug 2021
Translation Error Detection as Rationale Extraction
M. Fomicheva
Lucia Specia
Nikolaos Aletras
107
25
0
27 Aug 2021
Query-Focused Extractive Summarisation for Finding Ideal Answers to Biomedical and COVID-19 Questions
Diego Mollá Aliod
Urvashi Khanna
Dima Galat
Vincent Nguyen
Maciej Rybiński
RALM
86
2
0
27 Aug 2021
Offensive Language Identification in Low-resourced Code-mixed Dravidian languages using Pseudo-labeling
Adeep Hande
Karthik Puranik
Konthala Yasaswini
R. Priyadharshini
Sajeetha Thavareesan
Anbukkarasi Sampath
Kogilavani Shanmugavadivel
D. Thenmozhi
Bharathi Raja Chakravarthi
93
29
0
27 Aug 2021
Lyra: A Benchmark for Turducken-Style Code Generation
Qingyuan Liang
Zeyu Sun
Qihao Zhu
Wenjie Zhang
Lian Yu
Yingfei Xiong
Lu Zhang
64
13
0
27 Aug 2021
Reinforcement Learning-powered Semantic Communication via Semantic Similarity
Kun Lu
Rongpeng Li
Xianfu Chen
Zhifeng Zhao
Honggang Zhang
55
54
0
27 Aug 2021