ResearchTrend.AI

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM
    SSL
    SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 18,001 papers shown
Analyzing the Structure of Attention in a Transformer Language Model
Jesse Vig
Yonatan Belinkov
30
357
0
07 Jun 2019
From Caesar Cipher to Unsupervised Learning: A New Method for Classifier Parameter Estimation
Yu Liu
Li Deng
Jianshu Chen
C. Chen
SSL
26
0
0
06 Jun 2019
Conversing by Reading: Contentful Neural Conversation with On-demand Machine Reading
Lianhui Qin
Michel Galley
Chris Brockett
Xiaodong Liu
Xiang Gao
W. Dolan
Yejin Choi
Jianfeng Gao
26
109
0
06 Jun 2019
Visualizing and Measuring the Geometry of BERT
Andy Coenen
Emily Reif
Ann Yuan
Been Kim
Adam Pearce
F. Viégas
Martin Wattenberg
MILM
43
415
0
06 Jun 2019
Cross-Lingual Syntactic Transfer through Unsupervised Adaptation of Invertible Projections
Junxian He
Zhisong Zhang
Taylor Berg-Kirkpatrick
Graham Neubig
33
21
0
06 Jun 2019
Unsupervised Pivot Translation for Distant Languages
Yichong Leng
Xu Tan
Tao Qin
Xiang-Yang Li
Tie-Yan Liu
33
30
0
06 Jun 2019
Extracting Symptoms and their Status from Clinical Conversations
Nan Du
Kai Chen
Anjuli Kannan
Linh Tran
Yuhui Chen
Izhak Shafran
20
68
0
05 Jun 2019
Large-Scale Multi-Label Text Classification on EU Legislation
Ilias Chalkidis
Manos Fergadiotis
Prodromos Malakasiotis
Ion Androutsopoulos
AILaw
19
213
0
05 Jun 2019
From Balustrades to Pierre Vinken: Looking for Syntax in Transformer Self-Attentions
David Marecek
Rudolf Rosa
28
52
0
05 Jun 2019
The Secrets of Machine Learning: Ten Things You Wish You Had Known Earlier to be More Effective at Data Analysis
Cynthia Rudin
David Carlson
HAI
30
34
0
04 Jun 2019
KERMIT: Generative Insertion-Based Modeling for Sequences
William Chan
Nikita Kitaev
Kelvin Guu
Mitchell Stern
Jakob Uszkoreit
VLM
23
65
0
04 Jun 2019
Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation
Benjamin Heinzerling
Michael Strube
13
35
0
04 Jun 2019
How multilingual is Multilingual BERT?
Telmo Pires
Eva Schlinger
Dan Garrette
LRM
VLM
95
1,373
0
04 Jun 2019
Converse Attention Knowledge Transfer for Low-Resource Named Entity Recognition
Shengfei Lyu
Linghao Sun
Huixiong Yi
Yong-jin Liu
Huanhuan Chen
Steven C. H. Hoi
21
0
0
04 Jun 2019
Detecting Local Insights from Global Labels: Supervised & Zero-Shot Sequence Labeling via a Convolutional Decomposition
A. Schmaltz
27
8
0
04 Jun 2019
Episodic Memory in Lifelong Language Learning
Cyprien de Masson d'Autume
Sebastian Ruder
Lingpeng Kong
Dani Yogatama
CLL
KELM
34
281
0
03 Jun 2019
Learning Representations by Maximizing Mutual Information Across Views
Philip Bachman
R. Devon Hjelm
William Buchwalter
SSL
96
1,457
0
03 Jun 2019
Masked Non-Autoregressive Image Captioning
Junlong Gao
Xi Meng
Shiqi Wang
Xia Li
Shanshe Wang
Siwei Ma
Wen Gao
19
36
0
03 Jun 2019
BAYHENN: Combining Bayesian Deep Learning and Homomorphic Encryption for Secure DNN Inference
Peichen Xie
Bingzhe Wu
Guangyu Sun
BDL
FedML
13
33
0
03 Jun 2019
Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model
Aishwarya Bhandare
Vamsi Sripathi
Deepthi Karkada
Vivek V. Menon
Sun Choi
Kushal Datta
V. Saletore
MQ
27
130
0
03 Jun 2019
A Survey of Natural Language Generation Techniques with a Focus on Dialogue Systems - Past, Present and Future Directions
Sashank Santhanam
Samira Shaikh
3DV
31
52
0
02 Jun 2019
Pretraining Methods for Dialog Context Representation Learning
Shikib Mehri
E. Razumovskaia
Tiancheng Zhao
M. Eskénazi
22
84
0
02 Jun 2019
Adversarial Generation and Encoding of Nested Texts
A. Rozental
GAN
19
0
0
01 Jun 2019
Scoring Sentence Singletons and Pairs for Abstractive Summarization
Logan Lebanoff
Kaiqiang Song
Franck Dernoncourt
Doo Soon Kim
Seokhwan Kim
W. Chang
Fei Liu
CVBM
30
103
0
31 May 2019
Do Human Rationales Improve Machine Explanations?
Julia Strout
Ye Zhang
Raymond J. Mooney
19
57
0
31 May 2019
Investigating an Effective Character-level Embedding in Korean Sentence Classification
Won Ik Cho
Seokhwan Kim
N. Kim
28
8
0
31 May 2019
MultiQA: An Empirical Investigation of Generalization and Transfer in Reading Comprehension
Alon Talmor
Jonathan Berant
20
172
0
31 May 2019
Fine-Grained Spoiler Detection from Large-Scale Review Corpora
Mengting Wan
Rishabh Misra
Ndapandula Nakashole
Julian McAuley
9
130
0
31 May 2019
Rewarding Smatch: Transition-Based AMR Parsing with Reinforcement Learning
Tahira Naseem
Abhishek Shah
Hui Wan
Radu Florian
Salim Roukos
Miguel Ballesteros
25
59
0
31 May 2019
A Lightweight Recurrent Network for Sequence Modeling
Biao Zhang
Rico Sennrich
27
7
0
30 May 2019
Unbabel's Submission to the WMT2019 APE Shared Task: BERT-based Encoder-Decoder for Automatic Post-Editing
António Vilarinho Lopes
M. Amin Farajian
Gonçalo M. Correia
Jonay Trénous
André F. T. Martins
33
35
0
30 May 2019
Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention
Wenhu Chen
Jianshu Chen
Pengda Qin
Xifeng Yan
William Yang Wang
28
129
0
30 May 2019
A Simple but Effective Method to Incorporate Multi-turn Context with BERT for Conversational Machine Comprehension
Yasuhito Ohsugi
Itsumi Saito
Kyosuke Nishida
Hisako Asano
J. Tomita
33
43
0
30 May 2019
A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models
Elman Mansimov
Alex Jinpeng Wang
Sean Welleck
Kyunghyun Cho
AIMat
28
46
0
29 May 2019
Unsupervised Paraphrasing without Translation
Aurko Roy
David Grangier
BDL
LRM
11
61
0
29 May 2019
Adapting Text Embeddings for Causal Inference
Victor Veitch
Dhanya Sridhar
David M. Blei
CML
17
21
0
29 May 2019
Defending Against Neural Fake News
Rowan Zellers
Ari Holtzman
Hannah Rashkin
Yonatan Bisk
Ali Farhadi
Franziska Roesner
Yejin Choi
AAML
55
1,000
0
29 May 2019
Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain)
Mariya Toneva
Leila Wehbe
MILM
AI4CE
42
220
0
28 May 2019
Combating Adversarial Misspellings with Robust Word Recognition
Danish Pruthi
Bhuwan Dhingra
Zachary Chase Lipton
25
300
0
27 May 2019
STAR-GCN: Stacked and Reconstructed Graph Convolutional Networks for Recommender Systems
Jiani Zhang
Xingjian Shi
Shenglin Zhao
Irwin King
29
225
0
27 May 2019
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
Guillaume Jaume
H. K. Ekenel
Jean-Philippe Thiran
143
357
0
27 May 2019
Levenshtein Transformer
Jiatao Gu
Changhan Wang
Jake Zhao
49
359
0
27 May 2019
AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence
Jeff Clune
17
116
0
27 May 2019
Where's My Head? Definition, Dataset and Models for Numeric Fused-Heads Identification and Resolution
Yanai Elazar
Yoav Goldberg
19
23
0
26 May 2019
TIGS: An Inference Algorithm for Text Infilling with Gradient Search
Dayiheng Liu
Jie Fu
Pengfei Liu
Jiancheng Lv
DiffM
21
27
0
26 May 2019
Hashing based Answer Selection
Dong Xu
Wu-Jun Li
19
6
0
26 May 2019
Stochastic Shared Embeddings: Data-driven Regularization of Embedding Layers
Liwei Wu
Shuqing Li
Cho-Jui Hsieh
James Sharpnack
21
31
0
25 May 2019
Human vs. Muppet: A Conservative Estimate of Human Performance on the GLUE Benchmark
Nikita Nangia
Samuel R. Bowman
ELM
ALM
34
75
0
24 May 2019
Discrete Flows: Invertible Generative Models of Discrete Data
Dustin Tran
Keyon Vafa
Kumar Krishna Agrawal
Laurent Dinh
Ben Poole
DRL
24
114
0
24 May 2019
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
96
1,413
0
24 May 2019