ResearchTrend.AI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM, SSL, SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,708 papers shown
Title
Centrality Meets Centroid: A Graph-based Approach for Unsupervised Document Summarization
Haopeng Zhang
Jiawei Zhang
48
0
0
29 Mar 2021
TFPose: Direct Human Pose Estimation with Transformers
Wei Mao
Yongtao Ge
Chunhua Shen
Zhi Tian
Xinlong Wang
Zhibin Wang
ViT
102
89
0
29 Mar 2021
Whitening Sentence Representations for Better Semantics and Faster Retrieval
Jianlin Su
Jiarun Cao
Weijie Liu
Yangyiwen Ou
100
305
0
29 Mar 2021
One Network Fits All? Modular versus Monolithic Task Formulations in Neural Networks
Atish Agarwala
Abhimanyu Das
Brendan Juba
Rina Panigrahy
Vatsal Sharan
Xin Wang
Qiuyi Zhang
MoMe
54
11
0
29 Mar 2021
A More Fine-Grained Aspect-Sentiment-Opinion Triplet Extraction Task
Yuncong Li
Fang Wang
Wei Zhang
Shengtao Zhong
Cunxiang Yin
Yancheng He
76
21
0
29 Mar 2021
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS
Ye Jia
Heiga Zen
Jonathan Shen
Yu Zhang
Yonghui Wu
SSL
108
84
0
28 Mar 2021
A Benchmark and Comprehensive Survey on Knowledge Graph Entity Alignment via Representation Learning
Rui Zhang
Bayu Distiawan Trisedy
Miao Li
Yong Jiang
Jianzhong Qi
AI4TS
65
72
0
28 Mar 2021
TransICD: Transformer Based Code-wise Attention Model for Explainable ICD Coding
Biplob Biswas
Thai-Hoang Pham
Ping Zhang
89
29
0
28 Mar 2021
HiT: Hierarchical Transformer with Momentum Contrast for Video-Text Retrieval
Song Liu
Haoqi Fan
Shengsheng Qian
Yiru Chen
Wenkui Ding
Zhongyuan Wang
119
147
0
28 Mar 2021
Accurate and Reliable Forecasting using Stochastic Differential Equations
Peng Cui
Zhijie Deng
Wenbo Hu
Jun Zhu
UQCV
77
1
0
28 Mar 2021
Self-supervised Graph Neural Networks without explicit negative sampling
Zekarias T. Kefato
Sarunas Girdzijauskas
SSL
113
44
0
27 Mar 2021
Automated Backend-Aware Post-Training Quantization
Ziheng Jiang
Animesh Jain
An Liu
Josh Fromm
Chengqian Ma
Tianqi Chen
Luis Ceze
MQ
84
2
0
27 Mar 2021
Machine Learning Meets Natural Language Processing -- The story so far
N. Galanis
P. Vafiadis
K.-G. Mirzaev
G. Papakostas
90
7
0
27 Mar 2021
Abuse is Contextual, What about NLP? The Role of Context in Abusive Language Annotation and Detection
Stefano Menini
Alessio Palmero Aprosio
Sara Tonelli
77
40
0
27 Mar 2021
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Chun-Fu Chen
Quanfu Fan
Yikang Shen
ViT
77
1,503
0
27 Mar 2021
Unsupervised Self-Training for Sentiment Analysis of Code-Switched Data
Akshat Gupta
Sargam Menghani
Sai Krishna Rallabandi
A. Black
SSL
73
14
0
27 Mar 2021
Synthesis of Compositional Animations from Textual Descriptions
Anindita Ghosh
N. Cheema
Cennet Oguz
Christian Theobalt
P. Slusallek
129
180
0
26 Mar 2021
A Practical Survey on Faster and Lighter Transformers
Quentin Fournier
G. Caron
Daniel Aloise
142
105
0
26 Mar 2021
Dodrio: Exploring Transformer Models with Interactive Visualization
Zijie J. Wang
Robert Turko
Duen Horng Chau
88
36
0
26 Mar 2021
Understanding Robustness of Transformers for Image Classification
Srinadh Bhojanapalli
Ayan Chakrabarti
Daniel Glasner
Daliang Li
Thomas Unterthiner
Andreas Veit
ViT
139
392
0
26 Mar 2021
Unsupervised Document Embedding via Contrastive Augmentation
Dongsheng Luo
Wei Cheng
Jingchao Ni
Wenchao Yu
Xuchao Zhang
...
Yanchi Liu
Zhengzhang Chen
Dongjin Song
Haifeng Chen
Xiang Zhang
SSL
67
12
0
26 Mar 2021
On the hidden treasure of dialog in video question answering
Deniz Engin
François Schnitzler
Ngoc Q. K. Duong
Yannis Avrithis
76
12
0
26 Mar 2021
Incorporating Connections Beyond Knowledge Embeddings: A Plug-and-Play Module to Enhance Commonsense Reasoning in Machine Reading Comprehension
Damai Dai
Hua Zheng
Zhifang Sui
Baobao Chang
KELM, LRM
31
2
0
26 Mar 2021
Gated Transformer Networks for Multivariate Time Series Classification
Minghao Liu
Shengqi Ren
Siyuan Ma
Jiahui Jiao
Yizhou Chen
Zhiguang Wang
Wei Song
AI4TS
80
137
0
26 Mar 2021
BART based semantic correction for Mandarin automatic speech recognition system
Yun Zhao
Xuerui Yang
Jinchao Wang
Yongyu Gao
Chao Yan
Yuanfu Zhou
VLM
75
29
0
26 Mar 2021
ACRE: Abstract Causal REasoning Beyond Covariation
Fangqiu Yi
Baoxiong Jia
Mark Edmonds
Song-Chun Zhu
Yixin Zhu
CML
132
48
0
26 Mar 2021
Describing and Localizing Multiple Changes with Transformers
Yue Qiu
Shintaro Yamamoto
Kodai Nakashima
Ryota Suzuki
K. Iwata
Hirokatsu Kataoka
Y. Satoh
95
59
0
25 Mar 2021
Persistence Homology of TEDtalk: Do Sentence Embeddings Have a Topological Shape?
Shouman Das
S. A. Haque
Md. Iftekhar Tanveer
34
5
0
25 Mar 2021
High-Fidelity Pluralistic Image Completion with Transformers
Bo Liu
Jingbo Zhang
Dongdong Chen
Jing Liao
ViT
86
238
0
25 Mar 2021
AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting
Ye Yuan
Xinshuo Weng
Yanglan Ou
Kris Kitani
AI4TS
126
461
0
25 Mar 2021
Visual Grounding Strategies for Text-Only Natural Language Processing
Damien Sileo
50
8
0
25 Mar 2021
An Image is Worth 16x16 Words, What is a Video Worth?
Gilad Sharir
Asaf Noy
Lihi Zelnik-Manor
ViT
104
125
0
25 Mar 2021
Equality before the Law: Legal Judgment Consistency Analysis for Fairness
Yuzhong Wang
Chaojun Xiao
Shirong Ma
Haoxiang Zhong
Cunchao Tu
Tianyang Zhang
Zhiyuan Liu
Maosong Sun
AILaw
71
20
0
25 Mar 2021
Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting
A. Bhunia
Pinaki Nath Chowdhury
Yongxin Yang
Timothy M. Hospedales
Tao Xiang
Yi-Zhe Song
SSL
114
62
0
25 Mar 2021
HufuNet: Embedding the Left Piece as Watermark and Keeping the Right Piece for Ownership Verification in Deep Neural Networks
Peizhuo Lv
Pan Li
Shengzhi Zhang
Kai Chen
Ruigang Liang
Yue Zhao
Yingjiu Li
AAML
48
5
0
25 Mar 2021
An Approach to Improve Robustness of NLP Systems against ASR Errors
Tong Cui
Jinghui Xiao
Liangyou Li
Xin Jiang
Qun Liu
61
11
0
25 Mar 2021
Engineering an Intelligent Essay Scoring and Feedback System: An Experience Report
A. Chadda
Kelly Song
Raman Chandrasekar
I. Gorton
19
2
0
25 Mar 2021
Improving Online Forums Summarization via Hierarchical Unified Deep Neural Network
Sansiri Tarnpradab
Fereshteh Jafariakinabad
K. Hua
57
5
0
25 Mar 2021
BERT4SO: Neural Sentence Ordering by Fine-tuning BERT
Yutao Zhu
J. Nie
Kun Zhou
Shengchao Liu
Yabo Ling
Pan Du
173
4
0
25 Mar 2021
Approximating Instance-Dependent Noise via Instance-Confidence Embedding
Yivan Zhang
Masashi Sugiyama
67
8
0
25 Mar 2021
Efficient Feature Transformations for Discriminative and Generative Continual Learning
Vinay Kumar Verma
Kevin J. Liang
Nikhil Mehta
Piyush Rai
Lawrence Carin
CLL
115
78
0
25 Mar 2021
Benchmarking Modern Named Entity Recognition Techniques for Free-text Health Record De-identification
A. Ahmed
A. Abbasi
Carsten Eickhoff
BDL
29
9
0
25 Mar 2021
DRANet: Disentangling Representation and Adaptation Networks for Unsupervised Cross-Domain Adaptation
Seunghun Lee
Sunghyun Cho
Sunghoon Im
DRL
106
59
0
24 Mar 2021
Vision Transformers for Dense Prediction
René Ranftl
Alexey Bochkovskiy
V. Koltun
ViT, MDE
191
1,756
0
24 Mar 2021
Are Multilingual Models Effective in Code-Switching?
Genta Indra Winata
Samuel Cahyawijaya
Zihan Liu
Zhaojiang Lin
Andrea Madotto
Pascale Fung
68
72
0
24 Mar 2021
When Word Embeddings Become Endangered
Khalid Alnajjar
39
11
0
24 Mar 2021
FastMoE: A Fast Mixture-of-Expert Training System
Jiaao He
J. Qiu
Aohan Zeng
Zhilin Yang
Jidong Zhai
Jie Tang
ALM, MoE
122
104
0
24 Mar 2021
Representing Numbers in NLP: a Survey and a Vision
Avijit Thawani
Jay Pujara
Pedro A. Szekely
Filip Ilievski
113
119
0
24 Mar 2021
Finetuning Pretrained Transformers into RNNs
Jungo Kasai
Hao Peng
Yizhe Zhang
Dani Yogatama
Gabriel Ilharco
Nikolaos Pappas
Yi Mao
Weizhu Chen
Noah A. Smith
126
67
0
24 Mar 2021
Thinking Aloud: Dynamic Context Generation Improves Zero-Shot Reasoning Performance of GPT-2
Gregor Betz
Kyle Richardson
Christian Voigt
ReLM, LRM
101
31
0
24 Mar 2021