ResearchTrend.AI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM
    SSL
    SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

35 / 18,335 papers shown
Title
Elastic CRFs for Open-ontology Slot Filling
Yinpei Dai
Yichi Zhang
Hong Liu
Zhijian Ou
Yanmeng Wang
Junlan Feng
32
2
0
04 Nov 2018
Learning to Rank Query Graphs for Complex Question Answering over Knowledge Graphs
Gaurav Maheshwari
Priyansh Trivedi
Denis Lukovnikov
Nilesh Chakraborty
Asja Fischer
Jens Lehmann
GNN
18
72
0
02 Nov 2018
Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks
Jason Phang
Thibault Févry
Samuel R. Bowman
33
467
0
02 Nov 2018
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
Alon Talmor
Jonathan Herzig
Nicholas Lourie
Jonathan Berant
RALM
43
1,623
0
02 Nov 2018
On the Generation of Medical Question-Answer Pairs
Sheng Shen
Yaliang Li
Nan Du
X. Wu
Yusheng Xie
Shen Ge
Tao Yang
Kai Wang
Xin-Fang Liang
Wei Fan
MedIm
18
21
0
01 Nov 2018
Improving Machine Reading Comprehension with General Reading Strategies
Kai Sun
Dian Yu
Dong Yu
Claire Cardie
AI4CE
24
116
0
31 Oct 2018
A Hitchhiker's Guide On Distributed Training of Deep Neural Networks
K. Chahal
Manraj Singh Grover
Kuntal Dey
3DH
OOD
6
53
0
28 Oct 2018
Testing the Generalization Power of Neural Network Models Across NLI Benchmarks
Aarne Talman
S. Chatzikyriakidis
ELM
19
48
0
23 Oct 2018
Compositional Coding Capsule Network with K-Means Routing for Text Classification
Hao Ren
Hong-wei Lu
24
53
0
22 Oct 2018
pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference
Mandar Joshi
Eunsol Choi
Omer Levy
Daniel S. Weld
Luke Zettlemoyer
CoGe
22
47
0
20 Oct 2018
Large-scale Hierarchical Alignment for Data-driven Text Rewriting
Nikola I. Nikolov
Richard H. R. Hahnloser
46
7
0
18 Oct 2018
O2A: One-shot Observational learning with Action vectors
Leo Pauly
Wisdom C. Agboh
David C. Hogg
R. Fuentes
57
9
0
17 Oct 2018
A Span-Extraction Dataset for Chinese Machine Reading Comprehension
Yiming Cui
Ting Liu
Wanxiang Che
Li Xiao
Zhipeng Chen
Wentao Ma
Shijin Wang
Guoping Hu
41
182
0
17 Oct 2018
Multi-Source Cross-Lingual Model Transfer: Learning What to Share
Xilun Chen
Ahmed Hassan Awadallah
Hany Hassan
Wei Wang
Claire Cardie
36
20
0
08 Oct 2018
Neural Approaches to Conversational AI
Jianfeng Gao
Michel Galley
Lihong Li
49
670
0
21 Sep 2018
Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension
Yichong Xu
Xiaodong Liu
Yelong Shen
Jingjing Liu
Jianfeng Gao
23
51
0
18 Sep 2018
RumourEval 2019: Determining Rumour Veracity and Support for Rumours
G. Gorrell
Kalina Bontcheva
Leon Derczynski
E. Kochkina
Maria Liakata
A. Zubiaga
15
213
0
18 Sep 2018
Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
Zhiting Hu
Haoran Shi
Bowen Tan
Wentao Wang
Zichao Yang
...
Zhengzhong Liu
Xiaodan Liang
Wangrong Zhu
Devendra Singh Sachan
Eric Xing
VLM
25
56
0
04 Sep 2018
WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations
Mohammad Taher Pilehvar
Jose Camacho-Collados
16
468
0
28 Aug 2018
The Influence of Down-Sampling Strategies on SVD Word Embedding Stability
Johannes Hellrich
B. Kampe
U. Hahn
22
10
0
21 Aug 2018
Like a Baby: Visually Situated Neural Language Acquisition
Alexander Ororbia
A. Mali
Mary Alexandria Kelly
David Reitter
31
4
0
29 May 2018
Explainable Recommendation: A Survey and New Perspectives
Yongfeng Zhang
Xu Chen
XAI
LRM
52
866
0
30 Apr 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
304
7,005
0
20 Apr 2018
Interact and Decide: Medley of Sub-Attention Networks for Effective Group Recommendation
Lucas Vinh Tran
T. Pham
Yi Tay
Yiding Liu
Gao Cong
Xiaoli Li
27
93
0
12 Apr 2018
Clinical Concept Embeddings Learned from Massive Sources of Multimodal Medical Data
Andrew L. Beam
Benjamin Kompa
A. Schmaltz
Inbar Fried
G. Weber
N. Palmer
Xu Shi
Tianxi Cai
I. Kohane
24
176
0
04 Apr 2018
The Geometry of Culture: Analyzing Meaning through Word Embeddings
Austin C. Kozlowski
Matt Taddy
James A. Evans
35
379
0
25 Mar 2018
SparCML: High-Performance Sparse Communication for Machine Learning
Cédric Renggli
Saleh Ashkboos
Mehdi Aghagolzadeh
Dan Alistarh
Torsten Hoefler
29
126
0
22 Feb 2018
Simple and Effective Dimensionality Reduction for Word Embeddings
Vikas Raunak
22
101
0
11 Aug 2017
Recent Trends in Deep Learning Based Natural Language Processing
Tom Young
Devamanyu Hazarika
Soujanya Poria
Erik Cambria
35
2,824
0
09 Aug 2017
Evolving Deep Neural Networks
Risto Miikkulainen
J. Liang
Elliot Meyerson
Aditya Rawal
Daniel Fink
...
B. Raju
H. Shahrzad
Arshak Navruzyan
Nigel P. Duffy
B. Hodjat
42
884
0
01 Mar 2017
Symbolic, Distributed and Distributional Representations for Natural Language Processing in the Era of Deep Learning: a Survey
L. Ferrone
Fabio Massimo Zanzotto
39
37
0
02 Feb 2017
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Zhifeng Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
718
6,750
0
26 Sep 2016
A Decomposable Attention Model for Natural Language Inference
Ankur P. Parikh
Oscar Täckström
Dipanjan Das
Jakob Uszkoreit
213
1,367
0
06 Jun 2016
Quantifying the probable approximation error of probabilistic inference programs
Marco F. Cusumano-Towner
Vikash K. Mansinghka
33
7
0
31 May 2016
Impact of Power System Partitioning on the Efficiency of Distributed Multi-Step Optimization
Dongliang Chen
A. Bucchiarone
Zhihan Lv
23
12
0
31 May 2016