BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM
    SSL
    SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 18,335 papers shown
Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights
Shail Dave
Riyadh Baghdadi
Tony Nowatzki
Sasikanth Avancha
Aviral Shrivastava
Baoxin Li
64
82
0
02 Jul 2020
Facts as Experts: Adaptable and Interpretable Neural Memory over Symbolic Knowledge
Pat Verga
Haitian Sun
Livio Baldini Soares
William W. Cohen
KELM
35
50
0
02 Jul 2020
Computing Conceptual Distances between Breast Cancer Screening Guidelines: An Implementation of a Near-Peer Epistemic Model of Medical Disagreement
Hossein Hematialam
Luciana D. Garbayo
Seethalakshmi Gopalakrishnan
Wlodek Zadrozny
16
1
0
01 Jul 2020
Measuring Robustness to Natural Distribution Shifts in Image Classification
Rohan Taori
Achal Dave
Vaishaal Shankar
Nicholas Carlini
Benjamin Recht
Ludwig Schmidt
OOD
50
537
0
01 Jul 2020
Unbiased Loss Functions for Extreme Classification With Missing Labels
Erik Schultheis
Mohammadreza Qaraei
Priyanshu Gupta
Rohit Babbar
21
6
0
01 Jul 2020
SemEval-2020 Task 4: Commonsense Validation and Explanation
Cunxiang Wang
Shuailong Liang
Yili Jin
Yilong Wang
Xiao-Dan Zhu
Yue Zhang
LRM
25
98
0
01 Jul 2020
Transferability of Natural Language Inference to Biomedical Question Answering
Minbyul Jeong
Mujeen Sung
Gangwoo Kim
Donghyeon Kim
Wonjin Yoon
J. Yoo
Jaewoo Kang
19
38
0
01 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov
Nikoli Dryden
Tal Ben-Nun
Shigang Li
Torsten Hoefler
36
131
0
30 Jun 2020
ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph
Fei Yu
Jiji Tang
Weichong Yin
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
31
376
0
30 Jun 2020
PLATO-2: Towards Building an Open-Domain Chatbot via Curriculum Learning
Siqi Bao
H. He
Fan Wang
Hua Wu
Haifeng Wang
Wenquan Wu
Zhen Guo
Zhibin Liu
Xinchao Xu
30
137
0
30 Jun 2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin
HyoukJoong Lee
Yuanzhong Xu
Dehao Chen
Orhan Firat
Yanping Huang
M. Krikun
Noam M. Shazeer
Zhehuai Chen
MoE
43
1,118
0
30 Jun 2020
Classification of cancer pathology reports: a large-scale comparative study
S. Martina
L. Ventura
P. Frasconi
27
11
0
29 Jun 2020
Multi-Head Attention: Collaborate Instead of Concatenate
Jean-Baptiste Cordonnier
Andreas Loukas
Martin Jaggi
6
108
0
29 Jun 2020
Learning Sparse Prototypes for Text Generation
Junxian He
Taylor Berg-Kirkpatrick
Graham Neubig
27
23
0
29 Jun 2020
Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization
Sang Michael Xie
Tengyu Ma
Percy Liang
40
13
0
29 Jun 2020
Natural Backdoor Attack on Text Data
Lichao Sun
SILM
19
39
0
29 Jun 2020
Improving Sequence Tagging for Vietnamese Text Using Transformer-based Neural Models
Viet The Bui
Oanh T. K. Tran
Hong Phuong Le
25
38
0
29 Jun 2020
Answering Questions on COVID-19 in Real-Time
Jinhyuk Lee
Sean S. Yi
Minbyul Jeong
Mujeen Sung
Wonjin Yoon
Yonghwa Choi
Miyoung Ko
Jaewoo Kang
21
43
0
29 Jun 2020
Open Domain Suggestion Mining Leveraging Fine-Grained Analysis
Shreya Singal
Tanishq Goel
Shivang Chopra
S. Dahiya
14
3
0
27 Jun 2020
GPT-GNN: Generative Pre-Training of Graph Neural Networks
Ziniu Hu
Yuxiao Dong
Kuansan Wang
Kai-Wei Chang
Yizhou Sun
SSL
AI4CE
18
549
0
27 Jun 2020
Video-Grounded Dialogues with Pretrained Generation Language Models
Hung Le
Guosheng Lin
34
28
0
27 Jun 2020
BERTology Meets Biology: Interpreting Attention in Protein Language Models
Jesse Vig
Ali Madani
Lav Varshney
Caiming Xiong
R. Socher
Nazneen Rajani
34
289
0
26 Jun 2020
Pre-training via Paraphrasing
M. Lewis
Marjan Ghazvininejad
Gargi Ghosh
Armen Aghajanyan
Sida I. Wang
Luke Zettlemoyer
AIMat
30
159
0
26 Jun 2020
Evaluation of Text Generation: A Survey
Asli Celikyilmaz
Elizabeth Clark
Jianfeng Gao
ELM
LM&MA
44
378
0
26 Jun 2020
Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation
Rasool Fakoor
Jonas W. Mueller
Nick Erickson
Pratik Chaudhari
Alex Smola
26
54
0
25 Jun 2020
Explainable CNN-attention Networks (C-Attention Network) for Automated Detection of Alzheimer's Disease
Ning Wang
Mingxuan Chen
K. P. Subbalakshmi
25
22
0
25 Jun 2020
Subpopulation Data Poisoning Attacks
Matthew Jagielski
Giorgio Severi
Niklas Pousette Harger
Alina Oprea
AAML
SILM
24
114
0
24 Jun 2020
Unsupervised Cross-lingual Representation Learning for Speech Recognition
Alexis Conneau
Alexei Baevski
R. Collobert
Abdel-rahman Mohamed
Michael Auli
SSL
70
755
0
24 Jun 2020
Efficient Constituency Parsing by Pointing
Thanh-Tung Nguyen
Xuan-Phi Nguyen
Shafiq Joty
Xiaoli Li
27
11
0
24 Jun 2020
Principal Component Networks: Parameter Reduction Early in Training
R. Waleffe
Theodoros Rekatsinas
3DPC
19
9
0
23 Jun 2020
A Deep Learning Pipeline for Patient Diagnosis Prediction Using Electronic Health Records
Leopold Franz
Yash Raj Shrestha
B. Paudel
OOD
31
25
0
23 Jun 2020
Can you tell? SSNet -- a Sagittal Stratum-inspired Neural Network Framework for Sentiment Analysis
Apostol T. Vassilev
Munawar Hasan
Honglan Jin
35
1
0
23 Jun 2020
Direct Feedback Alignment Scales to Modern Deep Learning Tasks and Architectures
Julien Launay
Iacopo Poli
François Boniface
Florent Krzakala
41
63
0
23 Jun 2020
Hermes Attack: Steal DNN Models with Lossless Inference Accuracy
Yuankun Zhu
Yueqiang Cheng
Husheng Zhou
Yantao Lu
MIACV
AAML
39
99
0
23 Jun 2020
LAMP: Large Deep Nets with Automated Model Parallelism for Image Segmentation
Wentao Zhu
Can Zhao
Wenqi Li
H. Roth
Ziyue Xu
Daguang Xu
3DV
32
18
0
22 Jun 2020
The Depth-to-Width Interplay in Self-Attention
Yoav Levine
Noam Wies
Or Sharir
Hofit Bata
Amnon Shashua
30
45
0
22 Jun 2020
Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions
Stephen Roller
Y-Lan Boureau
Jason Weston
Antoine Bordes
Emily Dinan
...
Kurt Shuster
Eric Michael Smith
Arthur Szlam
Jack Urbanek
Mary Williamson
LLMAG
AI4CE
33
51
0
22 Jun 2020
What shapes feature representations? Exploring datasets, architectures, and training
Katherine L. Hermann
Andrew Kyle Lampinen
OOD
23
154
0
22 Jun 2020
Revisiting Loss Modelling for Unstructured Pruning
César Laurent
Camille Ballas
Thomas George
Nicolas Ballas
Pascal Vincent
32
14
0
22 Jun 2020
Self-Supervised Representations Improve End-to-End Speech Translation
Anne Wu
Changhan Wang
J. Pino
Jiatao Gu
SSL
27
40
0
22 Jun 2020
Students Need More Attention: BERT-based Attention Model for Small Data with Application to Automatic Patient Message Triage
Shijing Si
Rui Wang
Jedrek Wosik
Hao Zhang
D. Dov
Guoyin Wang
Ricardo Henao
Lawrence Carin
33
26
0
22 Jun 2020
Hippo: Taming Hyper-parameter Optimization of Deep Learning with Stage Trees
Ahnjae Shin
Do Yoon Kim
Joo Seong Jeong
Byung-Gon Chun
28
4
0
22 Jun 2020
MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of Gradients
Chenfei Zhu
Yu Cheng
Zhe Gan
Furong Huang
Jingjing Liu
Tom Goldstein
ODL
35
2
0
21 Jun 2020
A Survey on Machine Reading Comprehension: Tasks, Evaluation Metrics and Benchmark Datasets
Chengchang Zeng
Shaobo Li
Qin Li
Jie Hu
Jianjun Hu
34
101
0
21 Jun 2020
Defense against Adversarial Attacks in NLP via Dirichlet Neighborhood Ensemble
Yi Zhou
Xiaoqing Zheng
Cho-Jui Hsieh
Kai-Wei Chang
Xuanjing Huang
SILM
39
48
0
20 Jun 2020
Sarcasm Detection in Tweets with BERT and GloVe Embeddings
A. Khatri
P. Pranav
M. Anandkumar
9
42
0
20 Jun 2020
Learning to Prove from Synthetic Theorems
Eser Aygun
Zafarali Ahmed
Ankit Anand
Vlad Firoiu
Xavier Glorot
Laurent Orseau
Doina Precup
Shibl Mourad
NAI
20
20
0
19 Jun 2020
Differentiable Language Model Adversarial Attacks on Categorical Sequence Classifiers
I. Fursov
A. Zaytsev
Nikita Klyuchnikov
A. Kravchenko
Evgeny Burnaev
AAML
SILM
31
5
0
19 Jun 2020
Dataset for Automatic Summarization of Russian News
I. Gusev
19
24
0
19 Jun 2020
SenWave: Monitoring the Global Sentiments under the COVID-19 Pandemic
Qiang Yang
Hind Alamro
Somayah Albaradei
Adil Salhi
Xiaoting Lv
...
Wei Wang
T. Gojobori
C. Duarte
Xin Gao
Xiangliang Zhang
6
35
0
18 Jun 2020