ResearchTrend.AI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM
    SSL
    SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 19,767 papers shown
Title
Learning Multi-Sense Word Distributions using Approximate Kullback-Leibler Divergence
P. Jayashree
Ballijepalli Shreya
P. K. Srijith
26
2
0
12 Nov 2019
A Syntax-aware Multi-task Learning Framework for Chinese Semantic Role Labeling
Qingrong Xia
Zhenghua Li
Min Zhang
33
17
0
12 Nov 2019
Understanding BERT performance in propaganda analysis
Yiqing Hua
30
16
0
11 Nov 2019
Attending to Entities for Better Text Understanding
Pengxiang Cheng
K. Erk
LRM
24
37
0
11 Nov 2019
Deep Contextualized Self-training for Low Resource Dependency Parsing
Guy Rotman
Roi Reichart
43
50
0
11 Nov 2019
A hybrid text normalization system using multi-head self-attention for mandarin
Junhui Zhang
Junjie Pan
Xiang Yin
Chen Li
Shichao Liu
Yang Zhang
Yuxuan Wang
Zejun Ma
AI4CE
21
15
0
11 Nov 2019
TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection
Siddhant Garg
Thuy Vu
Alessandro Moschitti
41
214
0
11 Nov 2019
Improving BERT Fine-tuning with Embedding Normalization
Wenxuan Zhou
Junyi Du
Xiang Ren
21
6
0
10 Nov 2019
Can Monolingual Pretrained Models Help Cross-Lingual Classification?
Zewen Chi
Li Dong
Furu Wei
Xian-Ling Mao
Heyan Huang
LRM
VLM
30
13
0
10 Nov 2019
Effectiveness of self-supervised pre-training for speech recognition
Alexei Baevski
Michael Auli
Abdel-rahman Mohamed
SSL
32
147
0
10 Nov 2019
Efficient Dialogue State Tracking by Selectively Overwriting Memory
Sungdong Kim
Sohee Yang
Gyuwan Kim
Sang-Woo Lee
39
195
0
10 Nov 2019
Two-Headed Monster And Crossed Co-Attention Networks
Yaoyiran Li
Jing Jiang
32
0
0
10 Nov 2019
A Bilingual Generative Transformer for Semantic Sentence Embedding
John Wieting
Graham Neubig
Taylor Berg-Kirkpatrick
43
28
0
10 Nov 2019
CamemBERT: a Tasty French Language Model
Louis Martin
Benjamin Muller
Pedro Ortiz Suarez
Yoann Dupont
Laurent Romary
Eric Villemonte de la Clergerie
Djamé Seddah
Benoît Sagot
56
960
0
10 Nov 2019
Pre-train and Plug-in: Flexible Conditional Text Generation with Variational Auto-Encoders
Yu Duan
Canwen Xu
Jiaxin Pei
Jialong Han
Chenliang Li
35
42
0
10 Nov 2019
Dynamic Neuro-Symbolic Knowledge Graph Construction for Zero-shot Commonsense Question Answering
Antoine Bosselut
Ronan Le Bras
Yejin Choi
NAI
35
41
0
10 Nov 2019
Rethinking Self-Attention: Towards Interpretability in Neural Parsing
Khalil Mrini
Franck Dernoncourt
Quan Tran
Trung Bui
W. Chang
Ndapandula Nakashole
MILM
LRM
23
29
0
10 Nov 2019
Knowledge Guided Named Entity Recognition for BioMedical Text
Pratyay Banerjee
Kuntal Kumar Pal
M. Devarakonda
Chitta Baral
29
0
0
10 Nov 2019
Improving Transformer Models by Reordering their Sublayers
Ofir Press
Noah A. Smith
Omer Levy
27
87
0
10 Nov 2019
Learning to Few-Shot Learn Across Diverse Natural Language Classification Tasks
Trapit Bansal
Rishikesh Jha
Andrew McCallum
SSL
21
118
0
10 Nov 2019
r/Fakeddit: A New Multimodal Benchmark Dataset for Fine-grained Fake News Detection
Kai Nakamura
Sharon Levy
Wenjie Wang
24
122
0
10 Nov 2019
Syntax-Infused Transformer and BERT models for Machine Translation and Natural Language Understanding
Dhanasekar Sundararaman
Vivek Subramanian
Guoyin Wang
Shijing Si
Dinghan Shen
Dong Wang
Lawrence Carin
19
40
0
10 Nov 2019
Not All Claims are Created Equal: Choosing the Right Statistical Approach to Assess Hypotheses
Erfan Sadeqi Azer
Daniel Khashabi
Ashish Sabharwal
Dan Roth
35
17
0
10 Nov 2019
CCAligned: A Massive Collection of Cross-Lingual Web-Document Pairs
Ahmed El-Kishky
Vishrav Chaudhary
Francisco Guzman
Philipp Koehn
34
199
0
10 Nov 2019
Generalizing Natural Language Analysis through Span-relation Representations
Zhengbao Jiang
Wenyuan Xu
Jun Araki
Graham Neubig
38
60
0
10 Nov 2019
Scalable Zero-shot Entity Linking with Dense Entity Retrieval
Ledell Yu Wu
Fabio Petroni
Martin Josifoski
Sebastian Riedel
Luke Zettlemoyer
57
182
0
10 Nov 2019
Meta Label Correction for Noisy Label Learning
Guoqing Zheng
Ahmed Hassan Awadallah
S. Dumais
NoLa
OffRL
27
179
0
10 Nov 2019
The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents
Kurt Shuster
Da Ju
Stephen Roller
Emily Dinan
Y-Lan Boureau
Jason Weston
37
81
0
09 Nov 2019
Multi-Sentence Argument Linking
Seth Ebner
Patrick Xia
Ryan Culkin
Kyle Rawlins
Benjamin Van Durme
HAI
34
159
0
09 Nov 2019
Sentence Meta-Embeddings for Unsupervised Semantic Textual Similarity
Nina Poerner
Ulli Waltinger
Hinrich Schütze
AI4TS
32
20
0
09 Nov 2019
ConveRT: Efficient and Accurate Conversational Representations from Transformers
Matthew Henderson
I. Casanueva
Nikola Mrkšić
Pei-hao Su
Tsung-Hsien
Ivan Vulić
47
196
0
09 Nov 2019
E-BERT: Efficient-Yet-Effective Entity Embeddings for BERT
Nina Poerner
Ulli Waltinger
Hinrich Schütze
47
157
0
09 Nov 2019
Improving Machine Reading Comprehension via Adversarial Training
Ziqing Yang
Yiming Cui
Wanxiang Che
Ting Liu
Shijin Wang
Guoping Hu
32
17
0
09 Nov 2019
How Decoding Strategies Affect the Verifiability of Generated Text
Luca Massarelli
Fabio Petroni
Aleksandra Piktus
Myle Ott
Tim Rocktäschel
Vassilis Plachouras
Fabrizio Silvestri
Sebastian Riedel
33
50
0
09 Nov 2019
On the Relationship between Self-Attention and Convolutional Layers
Jean-Baptiste Cordonnier
Andreas Loukas
Martin Jaggi
45
529
0
08 Nov 2019
DZip: improved general-purpose lossless compression based on novel neural network modeling
Mohit Goyal
Kedar Tatwawadi
Shubham Chandak
Idoia Ochoa
AI4CE
21
23
0
08 Nov 2019
Negated and Misprimed Probes for Pretrained Language Models: Birds Can Talk, But Cannot Fly
Nora Kassner
Hinrich Schütze
28
318
0
08 Nov 2019
How Language-Neutral is Multilingual BERT?
Jindrich Libovický
Rudolf Rosa
Alexander Fraser
24
115
0
08 Nov 2019
Not Enough Data? Deep Learning to the Rescue!
Ateret Anaby-Tavor
Boaz Carmeli
Esther Goldbraich
Amir Kantor
George Kour
Segev Shlomov
N. Tepper
Naama Zwerdling
32
368
0
08 Nov 2019
Pretrained Language Models for Document-Level Neural Machine Translation
Liangyou Li
Xin Jiang
Qun Liu
36
19
0
08 Nov 2019
Relation Adversarial Network for Low Resource Knowledge Graph Completion
Ningyu Zhang
Shumin Deng
Zhanlin Sun
Jiaoyan Chen
Wei Zhang
Huajun Chen
39
71
0
08 Nov 2019
What Would Elsa Do? Freezing Layers During Transformer Fine-Tuning
Jaejun Lee
Raphael Tang
Jimmy J. Lin
34
121
0
08 Nov 2019
Reducing Sentiment Bias in Language Models via Counterfactual Evaluation
Po-Sen Huang
Huan Zhang
Ray Jiang
Robert Stanforth
Johannes Welbl
Jack W. Rae
Vishal Maini
Dani Yogatama
Pushmeet Kohli
36
209
0
08 Nov 2019
Contrastive Multi-document Question Generation
W. Cho
Yizhe Zhang
Sudha Rao
Asli Celikyilmaz
Chenyan Xiong
Jianfeng Gao
Mengdi Wang
Bill Dolan
SyDa
32
28
0
08 Nov 2019
Certified Data Removal from Machine Learning Models
Chuan Guo
Tom Goldstein
Awni Y. Hannun
Laurens van der Maaten
MU
57
424
0
08 Nov 2019
Why Do Masked Neural Language Models Still Need Common Sense Knowledge?
Sunjae Kwon
Cheongwoong Kang
Jiyeon Han
Jaesik Choi
24
16
0
08 Nov 2019
The TechQA Dataset
Vittorio Castelli
Rishav Chakravarti
Saswati Dana
Anthony Ferritto
Radu Florian
...
Andrzej Sakrajda
Avirup Sil
Rosario A. Uceda-Sosa
T. Ward
Rong Zhang
34
45
0
08 Nov 2019
Blockwise Self-Attention for Long Document Understanding
J. Qiu
Hao Ma
Omer Levy
Scott Yih
Sinong Wang
Jie Tang
24
252
0
07 Nov 2019
Probing Contextualized Sentence Representations with Visual Awareness
Zhuosheng Zhang
Rui Wang
Kehai Chen
Masao Utiyama
Eiichiro Sumita
Hai Zhao
29
2
0
07 Nov 2019
Contextualized Sparse Representations for Real-Time Open-Domain Question Answering
Jinhyuk Lee
Minjoon Seo
Hannaneh Hajishirzi
Jaewoo Kang
RALM
LRM
29
31
0
07 Nov 2019