ResearchTrend.AI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM
    SSL
    SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 18,335 papers shown
Self-Supervised Graph Transformer on Large-Scale Molecular Data
Yu Rong
Yatao Bian
Tingyang Xu
Wei-yang Xie
Ying Wei
Wenbing Huang
Junzhou Huang
AI4CE
24
25
0
18 Jun 2020
STEAM: Self-Supervised Taxonomy Expansion with Mini-Paths
Yue Yu
Yinghao Li
Jiaming Shen
Haoyang Feng
Jimeng Sun
Chao Zhang
26
59
0
18 Jun 2020
Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness
Jeremiah Zhe Liu
Zi Lin
Shreyas Padhy
Dustin Tran
Tania Bedrax-Weiss
Balaji Lakshminarayanan
UQCV
BDL
41
437
0
17 Jun 2020
GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training
J. Qiu
Qibin Chen
Yuxiao Dong
Jing Zhang
Hongxia Yang
Ming Ding
Kuansan Wang
Jie Tang
SSL
64
935
0
17 Jun 2020
Contrastive Learning for Weakly Supervised Phrase Grounding
Tanmay Gupta
Arash Vahdat
Gal Chechik
Xiaodong Yang
Jan Kautz
Derek Hoiem
ObjD
SSL
44
141
0
17 Jun 2020
Learning Visual Commonsense for Robust Scene Graph Generation
Alireza Zareian
Zhecan Wang
Haoxuan You
Shih-Fu Chang
29
312
0
17 Jun 2020
Dynamic Tensor Rematerialization
Marisa Kirisame
Steven Lyubomirsky
Altan Haan
Jennifer Brennan
Mike He
Jared Roesch
Tianqi Chen
Zachary Tatlock
29
93
0
17 Jun 2020
Cross-lingual Retrieval for Iterative Self-Supervised Training
C. Tran
Y. Tang
Xian Li
Jiatao Gu
RALM
33
73
0
16 Jun 2020
Memory-Efficient Pipeline-Parallel DNN Training
Deepak Narayanan
Amar Phanishayee
Kaiyu Shi
Xie Chen
Matei A. Zaharia
MoE
45
212
0
16 Jun 2020
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair
Abhishek Gupta
Murtaza Dalal
Sergey Levine
OffRL
OnRL
46
592
0
16 Jun 2020
70 years of machine learning in geoscience in review
Jesper Sören Dramsch
VLM
AI4CE
34
160
0
16 Jun 2020
On the Computational Power of Transformers and its Implications in Sequence Modeling
S. Bhattamishra
Arkil Patel
Navin Goyal
33
66
0
16 Jun 2020
Communicative need modulates competition in language change
Andres Karjus
Richard A. Blythe
S. Kirby
Kenny Smith
29
11
0
16 Jun 2020
Results of the seventh edition of the BioASQ Challenge
A. Nentidis
K. Bougiatiotis
Anastasia Krithara
George Giannakopoulos
29
62
0
16 Jun 2020
PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models
Eyal Ben-David
Carmel Rabinovitz
Roi Reichart
SSL
68
61
0
16 Jun 2020
Scalable Cross Lingual Pivots to Model Pronoun Gender for Translation
Kellie Webster
Emily Pitler
22
5
0
16 Jun 2020
Minimum Width for Universal Approximation
Sejun Park
Chulhee Yun
Jaeho Lee
Jinwoo Shin
35
122
0
16 Jun 2020
Preserving Dynamic Attention for Long-Term Spatial-Temporal Prediction
Haoxing Lin
Rufan Bai
Weijia Jia
Xinyu Yang
Yongjian You
HAI
AI4TS
30
64
0
16 Jun 2020
COMPOSE: Cross-Modal Pseudo-Siamese Network for Patient Trial Matching
Junyi Gao
Cao Xiao
Lucas Glass
Jimeng Sun
25
64
0
15 Jun 2020
Document Classification for COVID-19 Literature
Bernal Jiménez Gutiérrez
Juncheng Zeng
Dongdong Zhang
Ping Zhang
Yu-Chuan Su
OOD
19
32
0
15 Jun 2020
To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks
Sinong Wang
Madian Khabsa
Hao Ma
18
26
0
15 Jun 2020
Self-supervised Learning: Generative or Contrastive
Xiao Liu
Fanjin Zhang
Zhenyu Hou
Zhaoyu Wang
Li Mian
Jing Zhang
Jie Tang
SSL
54
1,588
0
15 Jun 2020
FinBERT: A Pretrained Language Model for Financial Communications
Yi Yang
Mark Christopher Siy Uy
Allen H Huang
AIFin
AI4CE
28
231
0
15 Jun 2020
Neural Execution Engines: Learning to Execute Subroutines
Yujun Yan
Kevin Swersky
Danai Koutra
Parthasarathy Ranganathan
Milad Hashemi
NAI
24
40
0
15 Jun 2020
Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming
Itay Hubara
Yury Nahshan
Y. Hanani
Ron Banner
Daniel Soudry
MQ
35
123
0
14 Jun 2020
FinEst BERT and CroSloEngual BERT: less is more in multilingual models
Matej Ulčar
Marko Robnik-Šikonja
19
48
0
14 Jun 2020
Continual General Chunking Problem and SyncMap
Danilo Vasconcellos Vargas
Toshitake Asabuki
29
7
0
14 Jun 2020
Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya
Abrhalei Tela
Abraham Woubie
Ville Hautamaki
39
12
0
13 Jun 2020
Guided Transformer: Leveraging Multiple External Sources for Representation Learning in Conversational Search
Helia Hashemi
Hamed Zamani
W. Bruce Croft
26
62
0
13 Jun 2020
How to Avoid Being Eaten by a Grue: Structured Exploration Strategies for Textual Worlds
Prithviraj Ammanabrolu
Ethan Tien
Matthew J. Hausknecht
Mark O. Riedl
LLMAG
29
50
0
12 Jun 2020
Comparing Natural Language Processing Techniques for Alzheimer's Dementia Prediction in Spontaneous Speech
Thomas Searle
Zina M. Ibrahim
Richard J. B. Dobson
14
46
0
12 Jun 2020
SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)
Marcos Zampieri
Preslav Nakov
Sara Rosenthal
Pepa Atanasova
Georgi Karadzhov
Hamdy Mubarak
Leon Derczynski
Zeses Pitenis
Çağrı Çöltekin
30
483
0
12 Jun 2020
NAS-Bench-NLP: Neural Architecture Search Benchmark for Natural Language Processing
Nikita Klyuchnikov
I. Trofimov
Ekaterina Artemova
Mikhail Salnikov
M. Fedorov
Evgeny Burnaev
VLM
21
101
0
12 Jun 2020
Towards Robust Pattern Recognition: A Review
Xu-Yao Zhang
Cheng-Lin Liu
C. Suen
OOD
HAI
26
103
0
12 Jun 2020
Does Unsupervised Architecture Representation Learning Help Neural Architecture Search?
Shen Yan
Yu Zheng
Wei Ao
Xiao Zeng
Mi Zhang
SSL
AI4CE
37
100
0
12 Jun 2020
Rethinking Pre-training and Self-training
Barret Zoph
Golnaz Ghiasi
Nayeon Lee
Huayu Chen
Hanxiao Liu
E. D. Cubuk
Quoc V. Le
SSeg
48
646
0
11 Jun 2020
FastPitch: Parallel Text-to-speech with Pitch Prediction
Adrian Łańcucki
44
333
0
11 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL
VLM
30
433
0
11 Jun 2020
Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge
Alon Talmor
Oyvind Tafjord
Peter Clark
Yoav Goldberg
Jonathan Berant
ReLM
LRM
36
39
0
11 Jun 2020
ScoreGAN: A Fraud Review Detector based on Multi Task Learning of Regulated GAN with Data Augmentation
Saeedreza Shehnepoor
R. Togneri
Wei Liu
Bennamoun
30
4
0
11 Jun 2020
CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP
Libo Qin
Minheng Ni
Yue Zhang
Wanxiang Che
45
149
0
11 Jun 2020
Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics
Nitika Mathur
Tim Baldwin
Trevor Cohn
6
245
0
11 Jun 2020
Augmenting Data for Sarcasm Detection with Unlabeled Conversation Context
Hankyol Lee
Youngjae Yu
Gunhee Kim
27
23
0
11 Jun 2020
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages
Pedro Ortiz Suarez
Laurent Romary
Benoît Sagot
28
227
0
11 Jun 2020
Large-Scale Adversarial Training for Vision-and-Language Representation Learning
Zhe Gan
Yen-Chun Chen
Linjie Li
Chen Zhu
Yu Cheng
Jingjing Liu
ObjD
VLM
35
489
0
11 Jun 2020
Report from the NSF Future Directions Workshop, Toward User-Oriented Agents: Research Directions and Challenges
M. Eskénazi
Tiancheng Zhao
LLMAG
AI4TS
AI4CE
36
9
0
10 Jun 2020
Revisiting Few-sample BERT Fine-tuning
Tianyi Zhang
Felix Wu
Arzoo Katiyar
Kilian Q. Weinberger
Yoav Artzi
41
442
0
10 Jun 2020
AMEIR: Automatic Behavior Modeling, Interaction Exploration and MLP Investigation in the Recommender System
Pengyu Zhao
Kecheng Xiao
Yuanxing Zhang
Kaigui Bian
Wei Yan
21
16
0
10 Jun 2020
Dataset Condensation with Gradient Matching
Bo Zhao
Konda Reddy Mopuri
Hakan Bilen
DD
41
479
0
10 Jun 2020
Embed2Detect: Temporally Clustered Embedded Words for Event Detection in Social Media
Hansi Hettiarachchi
Mariam Adedoyin-Olowe
Jagdev Bhogal
M. Gaber
21
33
0
10 Jun 2020