BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM, SSL, SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,545 papers shown
Cross-Lingual Syntactic Transfer through Unsupervised Adaptation of Invertible Projections
Junxian He
Zhisong Zhang
Taylor Berg-Kirkpatrick
Graham Neubig
75
21
0
06 Jun 2019
Generating Question-Answer Hierarchies
Kalpesh Krishna
Mohit Iyyer
83
39
0
06 Jun 2019
Unsupervised Pivot Translation for Distant Languages
Yichong Leng
Xu Tan
Tao Qin
Xiang-Yang Li
Tie-Yan Liu
51
30
0
06 Jun 2019
GCDT: A Global Context Enhanced Deep Transition Architecture for Sequence Labeling
Yanjun Liu
Fandong Meng
Jinchao Zhang
Jinan Xu
Yufeng Chen
Jie Zhou
69
90
0
06 Jun 2019
Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations
Debraj Basu
Deepesh Data
C. Karakuş
Suhas Diggavi
MQ
76
408
0
06 Jun 2019
Explain Yourself! Leveraging Language Models for Commonsense Reasoning
Nazneen Rajani
Bryan McCann
Caiming Xiong
R. Socher
ReLM, LRM
133
567
0
06 Jun 2019
Energy and Policy Considerations for Deep Learning in NLP
Emma Strubell
Ananya Ganesh
Andrew McCallum
88
2,668
0
05 Jun 2019
Variational Pretraining for Semi-supervised Text Classification
Suchin Gururangan
T. Dang
Dallas Card
Noah A. Smith
VLM
66
112
0
05 Jun 2019
Extracting Symptoms and their Status from Clinical Conversations
Nan Du
Kai Chen
Anjuli Kannan
Linh Tran
Yuhui Chen
Izhak Shafran
67
68
0
05 Jun 2019
Learning to Rank for Plausible Plausibility
Zhongyang Li
Tongfei Chen
Benjamin Van Durme
59
22
0
05 Jun 2019
Neural Legal Judgment Prediction in English
Ilias Chalkidis
Ion Androutsopoulos
Nikolaos Aletras
AILaw, ELM
190
342
0
05 Jun 2019
Large-Scale Multi-Label Text Classification on EU Legislation
Ilias Chalkidis
Manos Fergadiotis
Prodromos Malakasiotis
Ion Androutsopoulos
AILaw
64
217
0
05 Jun 2019
From Balustrades to Pierre Vinken: Looking for Syntax in Transformer Self-Attentions
David Marecek
Rudolf Rosa
59
52
0
05 Jun 2019
Baby steps towards few-shot learning with multiple semantics
Eli Schwartz
Leonid Karlinsky
Rogerio Feris
Raja Giryes
A. Bronstein
VLM
117
107
0
05 Jun 2019
Towards Multimodal Sarcasm Detection (An _Obviously_ Perfect Paper)
Santiago Castro
Devamanyu Hazarika
Verónica Pérez-Rosas
Roger Zimmermann
Rada Mihalcea
Soujanya Poria
76
259
0
05 Jun 2019
Learning Deep Transformer Models for Machine Translation
Qiang Wang
Bei Li
Tong Xiao
Jingbo Zhu
Changliang Li
Derek F. Wong
Lidia S. Chao
108
673
0
05 Jun 2019
Entity-Centric Contextual Affective Analysis
Anjalie Field
Yulia Tsvetkov
91
30
0
05 Jun 2019
The Unreasonable Effectiveness of Transformer Language Models in Grammatical Error Correction
Dimitrios Alikaniotis
Vipul Raheja
72
23
0
04 Jun 2019
Open Sesame: Getting Inside BERT's Linguistic Knowledge
Yongjie Lin
Y. Tan
Robert Frank
89
287
0
04 Jun 2019
Towards Lossless Encoding of Sentences
Gabriele Prato
Mathieu Duchesneau
A. Chandar
Alain Tapp
59
2
0
04 Jun 2019
The Secrets of Machine Learning: Ten Things You Wish You Had Known Earlier to be More Effective at Data Analysis
Cynthia Rudin
David Carlson
HAI
128
34
0
04 Jun 2019
KERMIT: Generative Insertion-Based Modeling for Sequences
William Chan
Nikita Kitaev
Kelvin Guu
Mitchell Stern
Jakob Uszkoreit
VLM
96
65
0
04 Jun 2019
Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation
Benjamin Heinzerling
Michael Strube
53
36
0
04 Jun 2019
Training Neural Response Selection for Task-Oriented Dialogue Systems
Matthew Henderson
Ivan Vulić
D. Gerz
I. Casanueva
Paweł Budzianowski
Sam Coope
Georgios P. Spithourakis
Tsung-Hsien Wen
N. Mrksic
Pei-hao Su
54
111
0
04 Jun 2019
Blackbox meets blackbox: Representational Similarity and Stability Analysis of Neural Language Models and Brains
Samira Abnar
Lisa Beinborn
Rochelle Choenni
Willem H. Zuidema
89
76
0
04 Jun 2019
How multilingual is Multilingual BERT?
Telmo Pires
Eva Schlinger
Dan Garrette
LRM, VLM
266
1,418
0
04 Jun 2019
A Cross-Sentence Latent Variable Model for Semi-Supervised Text Sequence Matching
Jihun Choi
Taeuk Kim
Sang-goo Lee
BDL
77
6
0
04 Jun 2019
Converse Attention Knowledge Transfer for Low-Resource Named Entity Recognition
Shengfei Lyu
Linghao Sun
Huixiong Yi
Yong Liu
Huanhuan Chen
Steven C. H. Hoi
80
0
0
04 Jun 2019
A Review of Automated Speech and Language Features for Assessment of Cognitive and Thought Disorders
Rohit Voleti
J. Liss
Visar Berisha
77
72
0
04 Jun 2019
Detecting Local Insights from Global Labels: Supervised & Zero-Shot Sequence Labeling via a Convolutional Decomposition
A. Schmaltz
60
8
0
04 Jun 2019
Episodic Memory in Lifelong Language Learning
Cyprien de Masson d'Autume
Sebastian Ruder
Lingpeng Kong
Dani Yogatama
CLL, KELM
150
293
0
03 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL, OnRL
156
1,070
0
03 Jun 2019
Learning Representations by Maximizing Mutual Information Across Views
Philip Bachman
R. Devon Hjelm
William Buchwalter
SSL
230
1,482
0
03 Jun 2019
Gendered Ambiguous Pronouns Shared Task: Boosting Model Confidence by Evidence Pooling
Sandeep Attree
36
14
0
03 Jun 2019
Gender-preserving Debiasing for Pre-trained Word Embeddings
Masahiro Kaneko
Danushka Bollegala
FaML
72
131
0
03 Jun 2019
Masked Non-Autoregressive Image Captioning
Junlong Gao
Xi Meng
Shiqi Wang
Xia Li
Shanshe Wang
Siwei Ma
Wen Gao
80
39
0
03 Jun 2019
Resolving Gendered Ambiguous Pronouns with BERT
Kellie Webster
Marta Recasens
Ken Krige
Vera Axelrod
Denis Logvinenko
Jason Baldridge
107
54
0
03 Jun 2019
BAYHENN: Combining Bayesian Deep Learning and Homomorphic Encryption for Secure DNN Inference
Peichen Xie
Bingzhe Wu
Guangyu Sun
BDL, FedML
56
33
0
03 Jun 2019
Assessing the Ability of Self-Attention Networks to Learn Word Order
Baosong Yang
Longyue Wang
Derek F. Wong
Lidia S. Chao
Zhaopeng Tu
75
32
0
03 Jun 2019
Know More about Each Other: Evolving Dialogue Strategy via Compound Assessment
Siqi Bao
H. He
Fan Wang
Rongzhong Lian
Hua Wu
LLMAG, OffRL
74
18
0
03 Jun 2019
Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model
Aishwarya Bhandare
Vamsi Sripathi
Deepthi Karkada
Vivek V. Menon
Sun Choi
Kushal Datta
V. Saletore
MQ
95
132
0
03 Jun 2019
A Survey of Natural Language Generation Techniques with a Focus on Dialogue Systems - Past, Present and Future Directions
Sashank Santhanam
Samira Shaikh
3DV
79
52
0
02 Jun 2019
Pretraining Methods for Dialog Context Representation Learning
Shikib Mehri
E. Razumovskaia
Tiancheng Zhao
M. Eskénazi
117
84
0
02 Jun 2019
Does It Make Sense? And Why? A Pilot Study for Sense Making and Explanation
Cunxiang Wang
Shuailong Liang
Yue Zhang
Xiaonan Li
Tian Gao
LRM
90
111
0
02 Jun 2019
Pre-training of Graph Augmented Transformers for Medication Recommendation
Junyuan Shang
Tengfei Ma
Cao Xiao
Jimeng Sun
94
291
0
02 Jun 2019
Latent Retrieval for Weakly Supervised Open Domain Question Answering
Kenton Lee
Ming-Wei Chang
Kristina Toutanova
RALM
123
1,020
0
01 Jun 2019
Multimodal Transformer for Unaligned Multimodal Language Sequences
Yao-Hung Hubert Tsai
Shaojie Bai
Paul Pu Liang
J. Zico Kolter
Louis-Philippe Morency
Ruslan Salakhutdinov
105
1,319
0
01 Jun 2019
Efficient Adaptation of Pretrained Transformers for Abstractive Summarization
Andrew Hoang
Antoine Bosselut
Asli Celikyilmaz
Yejin Choi
76
41
0
01 Jun 2019
Scoring Sentence Singletons and Pairs for Abstractive Summarization
Logan Lebanoff
Kaiqiang Song
Franck Dernoncourt
Doo Soon Kim
Seokhwan Kim
W. Chang
Fei Liu
CVBM
126
106
0
31 May 2019
Pre-Training Graph Neural Networks for Generic Structural Feature Extraction
Ziniu Hu
Changjun Fan
Ting-Li Chen
Kai-Wei Chang
Yizhou Sun
68
44
0
31 May 2019