
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM
    SSL
    SSeg
arXiv: 1810.04805

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

36 / 19,786 papers shown
Title
A Hitchhiker's Guide On Distributed Training of Deep Neural Networks
K. Chahal
Manraj Singh Grover
Kuntal Dey
3DH
OOD
26
53
0
28 Oct 2018
Variational Semi-supervised Aspect-term Sentiment Analysis via Transformer
Xingyi Cheng
Weidi Xu
Taifeng Wang
Wei Chu
18
23
0
24 Oct 2018
Testing the Generalization Power of Neural Network Models Across NLI Benchmarks
Aarne Talman
S. Chatzikyriakidis
ELM
30
48
0
23 Oct 2018
Compositional Coding Capsule Network with K-Means Routing for Text Classification
Hao Ren
Hong-wei Lu
63
53
0
22 Oct 2018
pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference
Mandar Joshi
Eunsol Choi
Omer Levy
Daniel S. Weld
Luke Zettlemoyer
CoGe
27
47
0
20 Oct 2018
Large-scale Hierarchical Alignment for Data-driven Text Rewriting
Nikola I. Nikolov
Richard H. R. Hahnloser
46
7
0
18 Oct 2018
O2A: One-shot Observational learning with Action vectors
Leo Pauly
Wisdom C. Agboh
David C. Hogg
R. Fuentes
57
9
0
17 Oct 2018
A Span-Extraction Dataset for Chinese Machine Reading Comprehension
Yiming Cui
Ting Liu
Wanxiang Che
Li Xiao
Zhipeng Chen
Wentao Ma
Shijin Wang
Guoping Hu
46
183
0
17 Oct 2018
Multi-Source Cross-Lingual Model Transfer: Learning What to Share
Xilun Chen
Ahmed Hassan Awadallah
Hany Hassan
Wei Wang
Claire Cardie
36
20
0
08 Oct 2018
Neural Approaches to Conversational AI
Jianfeng Gao
Michel Galley
Lihong Li
61
672
0
21 Sep 2018
Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension
Yichong Xu
Xiaodong Liu
Yelong Shen
Jingjing Liu
Jianfeng Gao
41
51
0
18 Sep 2018
RumourEval 2019: Determining Rumour Veracity and Support for Rumours
G. Gorrell
Kalina Bontcheva
Leon Derczynski
E. Kochkina
Maria Liakata
A. Zubiaga
37
216
0
18 Sep 2018
Explicit Contextual Semantics for Text Comprehension
Zhuosheng Zhang
Yuwei Wu
Z. Li
Hai Zhao
31
29
0
08 Sep 2018
Exploiting Invertible Decoders for Unsupervised Sentence Representation Learning
Shuai Tang
V. D. Sa
SSL
19
1
0
08 Sep 2018
Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
Zhiting Hu
Haoran Shi
Bowen Tan
Wentao Wang
Zichao Yang
...
Zhengzhong Liu
Xiaodan Liang
Wangrong Zhu
Devendra Singh Sachan
Eric Xing
VLM
44
56
0
04 Sep 2018
WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations
Mohammad Taher Pilehvar
Jose Camacho-Collados
21
470
0
28 Aug 2018
The Influence of Down-Sampling Strategies on SVD Word Embedding Stability
Johannes Hellrich
B. Kampe
U. Hahn
27
10
0
21 Aug 2018
Like a Baby: Visually Situated Neural Language Acquisition
Alexander Ororbia
A. Mali
Mary Alexandria Kelly
David Reitter
31
4
0
29 May 2018
Explainable Recommendation: A Survey and New Perspectives
Yongfeng Zhang
Xu Chen
XAI
LRM
52
868
0
30 Apr 2018
Stochastic Answer Networks for Natural Language Inference
Xiaodong Liu
Kevin Duh
Jianfeng Gao
BDL
21
45
0
21 Apr 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
321
7,044
0
20 Apr 2018
Interact and Decide: Medley of Sub-Attention Networks for Effective Group Recommendation
Lucas Vinh Tran
T. Pham
Yi Tay
Yiding Liu
Gao Cong
Xiaoli Li
27
93
0
12 Apr 2018
Clinical Concept Embeddings Learned from Massive Sources of Multimodal Medical Data
Andrew L. Beam
Benjamin Kompa
A. Schmaltz
Inbar Fried
G. Weber
N. Palmer
Xu Shi
Tianxi Cai
I. Kohane
24
177
0
04 Apr 2018
The Geometry of Culture: Analyzing Meaning through Word Embeddings
Austin C. Kozlowski
Matt Taddy
James A. Evans
35
381
0
25 Mar 2018
SparCML: High-Performance Sparse Communication for Machine Learning
Cédric Renggli
Saleh Ashkboos
Mehdi Aghagolzadeh
Dan Alistarh
Torsten Hoefler
34
126
0
22 Feb 2018
Natural Language Processing: State of The Art, Current Trends and Challenges
Diksha Khurana
Aditya Koli
Kiran Khatter
Sukhdev Singh
25
1,033
0
17 Aug 2017
Simple and Effective Dimensionality Reduction for Word Embeddings
Vikas Raunak
27
101
0
11 Aug 2017
Recent Trends in Deep Learning Based Natural Language Processing
Tom Young
Devamanyu Hazarika
Soujanya Poria
Erik Cambria
47
2,830
0
09 Aug 2017
A Survey Of Cross-lingual Word Embedding Models
Sebastian Ruder
Ivan Vulić
Anders Søgaard
50
528
0
15 Jun 2017
Jointly Learning Sentence Embeddings and Syntax with Unsupervised Tree-LSTMs
Jean Maillard
S. Clark
Dani Yogatama
34
87
0
25 May 2017
Evolving Deep Neural Networks
Risto Miikkulainen
J. Liang
Elliot Meyerson
Aditya Rawal
Daniel Fink
...
B. Raju
Hormoz Shahrzad
Arshak Navruzyan
Nigel P. Duffy
Babak Hodjat
42
885
0
01 Mar 2017
Symbolic, Distributed and Distributional Representations for Natural Language Processing in the Era of Deep Learning: a Survey
L. Ferrone
Fabio Massimo Zanzotto
39
37
0
02 Feb 2017
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Zhifeng Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
727
6,756
0
26 Sep 2016
A Decomposable Attention Model for Natural Language Inference
Ankur P. Parikh
Oscar Täckström
Dipanjan Das
Jakob Uszkoreit
223
1,369
0
06 Jun 2016
Quantifying the probable approximation error of probabilistic inference programs
Marco F. Cusumano-Towner
Vikash K. Mansinghka
46
7
0
31 May 2016
Impact of Power System Partitioning on the Efficiency of Distributed Multi-Step Optimization
Dongliang Chen
A. Bucchiarone
Zhihan Lv
28
12
0
31 May 2016