ResearchTrend.AI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,639 papers shown
Title
Question Answering Over Temporal Knowledge Graphs
Apoorv Saxena
Soumen Chakrabarti
Partha P. Talukdar
AI4MH
118
139
0
03 Jun 2021
Transformers are Deep Infinite-Dimensional Non-Mercer Binary Kernel Machines
Matthew A. Wright
Joseph E. Gonzalez
86
23
0
02 Jun 2021
Ember: No-Code Context Enrichment via Similarity-Based Keyless Joins
S. Suri
Ihab F. Ilyas
Christopher Ré
Theodoros Rekatsinas
51
22
0
02 Jun 2021
Evaluating the Efficacy of Summarization Evaluation across Languages
Fajri Koto
Jey Han Lau
Timothy Baldwin
114
19
0
02 Jun 2021
SMURF: SeMantic and linguistic UndeRstanding Fusion for Caption Evaluation via Typicality Analysis
Joshua Forster Feinglass
Yezhou Yang
60
22
0
02 Jun 2021
Container: Context Aggregation Network
Peng Gao
Jiasen Lu
Hongsheng Li
Roozbeh Mottaghi
Aniruddha Kembhavi
ViT
108
72
0
02 Jun 2021
Decision Transformer: Reinforcement Learning via Sequence Modeling
Lili Chen
Kevin Lu
Aravind Rajeswaran
Kimin Lee
Aditya Grover
Michael Laskin
Pieter Abbeel
A. Srinivas
Igor Mordatch
OffRL
213
1,671
0
02 Jun 2021
SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training
Gowthami Somepalli
Micah Goldblum
Avi Schwarzschild
C. Bayan Bruss
Tom Goldstein
LMTD
119
336
0
02 Jun 2021
On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers
Tianchu Ji
Shraddhan Jain
M. Ferdman
Peter Milder
H. Andrew Schwartz
Niranjan Balasubramanian
MQ
113
16
0
02 Jun 2021
More Identifiable yet Equally Performant Transformers for Text Classification
Rishabh Bhardwaj
Navonil Majumder
Soujanya Poria
Eduard H. Hovy
32
6
0
02 Jun 2021
Metaphor Generation with Conceptual Mappings
Kevin Stowe
Tuhin Chakrabarty
Nanyun Peng
Smaranda Muresan
Iryna Gurevych
61
51
0
02 Jun 2021
A Unified Generative Framework for Various NER Subtasks
Hang Yan
Tao Gui
Junqi Dai
Qipeng Guo
Zheng Zhang
Xipeng Qiu
96
298
0
02 Jun 2021
Differential Privacy for Text Analytics via Natural Text Sanitization
Xiang Yue
Minxin Du
Tianhao Wang
Yaliang Li
Huan Sun
Sherman S. M. Chow
112
86
0
02 Jun 2021
Uncovering Constraint-Based Behavior in Neural Models via Targeted Fine-Tuning
Forrest Davis
Marten van Schijndel
AI4CE
60
7
0
02 Jun 2021
Topic-Aware Evidence Reasoning and Stance-Aware Aggregation for Fact Verification
Jiasheng Si
Deyu Zhou
Tong Li
Xingyu Shi
Yulan He
71
39
0
02 Jun 2021
Self-Supervised Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference
Dvir Ginzburg
Itzik Malkiel
Oren Barkan
Avi Caciularu
Noam Koenigstein
RALM
75
27
0
02 Jun 2021
End-to-End NLP Knowledge Graph Construction
Ishani Mondal
Yufang Hou
Charles Jochim
54
33
0
02 Jun 2021
Towards Deeper Deep Reinforcement Learning with Spectral Normalization
Johan Bjorck
Carla P. Gomes
Kilian Q. Weinberger
108
23
0
02 Jun 2021
DynaEval: Unifying Turn and Dialogue Level Evaluation
Chen Zhang
Yiming Chen
L. F. D’Haro
Yan Zhang
Thomas Friedrichs
Grandee Lee
Haizhou Li
78
74
0
02 Jun 2021
T-BERT -- Model for Sentiment Analysis of Micro-blogs Integrating Topic Model and BERT
Sarojadevi Palani
P. Rajagopal
Sidharth Pancholi
38
12
0
02 Jun 2021
Learning to Rehearse in Long Sequence Memorization
Zhu Zhang
Chang Zhou
Jianxin Ma
Zhijie Lin
Jingren Zhou
Hongxia Yang
Zhou Zhao
RALM
33
9
0
02 Jun 2021
LGESQL: Line Graph Enhanced Text-to-SQL Model with Mixed Local and Non-Local Relations
Ruisheng Cao
Lu Chen
Zhi Chen
Yanbin Zhao
Su Zhu
Kai Yu
77
168
0
02 Jun 2021
belabBERT: a Dutch RoBERTa-based language model applied to psychiatric classification
J. Wouts
J. D. Boer
A. Voppel
S. Brederoo
S. V. Splunter
I. Sommer
38
4
0
02 Jun 2021
Generating Informative Conclusions for Argumentative Texts
S. Syed
Khalid Al Khatib
Milad Alshomary
Henning Wachsmuth
Martin Potthast
47
24
0
02 Jun 2021
Rethinking Cross-modal Interaction from a Top-down Perspective for Referring Video Object Segmentation
Chen Liang
Yu Wu
Tianfei Zhou
Wenguan Wang
Zongxin Yang
Yunchao Wei
Yi Yang
VOS
107
50
0
02 Jun 2021
Hi-Transformer: Hierarchical Interactive Transformer for Efficient and Effective Long Document Modeling
Chuhan Wu
Fangzhao Wu
Tao Qi
Yongfeng Huang
125
68
0
02 Jun 2021
Why Machine Reading Comprehension Models Learn Shortcuts?
Yuxuan Lai
Chen Zhang
Yansong Feng
Quzhe Huang
Dongyan Zhao
SyDa
86
53
0
02 Jun 2021
One Teacher is Enough? Pre-trained Language Model Distillation from Multiple Teachers
Chuhan Wu
Fangzhao Wu
Yongfeng Huang
72
65
0
02 Jun 2021
SocAoG: Incremental Graph Parsing for Social Relation Inference in Dialogues
Liang Qiu
Yuan Liang
Yizhou Zhao
Pan Lu
Baolin Peng
Zhou Yu
Ying Nian Wu
Song-Chun Zhu
75
18
0
02 Jun 2021
End-to-End Hierarchical Relation Extraction for Generic Form Understanding
Tuan-Anh Dang Nguyen
Duc Thanh Hoang
Q. Tran
Chih-Wei Pan
T. Nguyen
86
10
0
02 Jun 2021
A Span Extraction Approach for Information Extraction on Visually-Rich Documents
Tuan-Anh Dang Nguyen
Hieu M. Vu
Nguyen Hong Son
Minh-Tien Nguyen
52
6
0
02 Jun 2021
Exploring Discourse Structures for Argument Impact Classification
Xin Liu
Jiefu Ou
Yangqiu Song
Xin Jiang
56
12
0
02 Jun 2021
COM2SENSE: A Commonsense Reasoning Benchmark with Complementary Sentences
Shikhar Singh
Nuan Wen
Yu Hou
Pegah Alipoormolabashi
Te-Lin Wu
Xuezhe Ma
Nanyun Peng
LRM
104
59
0
02 Jun 2021
RevCore: Review-augmented Conversational Recommendation
Yu Lu
Junwei Bao
Yan Song
Zichen Ma
Shuguang Cui
Youzheng Wu
Xiaodong He
131
77
0
02 Jun 2021
Unsupervised Out-of-Domain Detection via Pre-trained Transformers
Keyang Xu
Zhaolin Ren
Shikun Zhang
Yihao Feng
Caiming Xiong
ViT
81
41
0
02 Jun 2021
Self-Training Sampling with Monolingual Data Uncertainty for Neural Machine Translation
Wenxiang Jiao
Xing Wang
Zhaopeng Tu
Shuming Shi
Michael R. Lyu
Irwin King
UQLM
65
35
0
02 Jun 2021
Discrete Cosine Transform as Universal Sentence Encoder
Nada AlMarwani
Mona T. Diab
43
2
0
02 Jun 2021
OntoGUM: Evaluating Contextualized SOTA Coreference Resolution on 12 More Genres
Yilun Zhu
Sameer Pradhan
Amir Zeldes
73
22
0
02 Jun 2021
DialoGraph: Incorporating Interpretable Strategy-Graph Networks into Negotiation Dialogues
Rishabh Joshi
Vidhisha Balachandran
Shikhar Vashishth
A. Black
Yulia Tsvetkov
86
36
0
02 Jun 2021
Rejuvenating Low-Frequency Words: Making the Most of Parallel Data in Non-Autoregressive Translation
Liang Ding
Longyue Wang
Xuebo Liu
Derek F. Wong
Dacheng Tao
Zhaopeng Tu
89
48
0
02 Jun 2021
Efficient Passage Retrieval with Hashing for Open-domain Question Answering
Ikuya Yamada
Akari Asai
Hannaneh Hajishirzi
MQ
94
82
0
02 Jun 2021
Conversational Question Answering: A Survey
Munazza Zaib
Wei Emma Zhang
Quan Z. Sheng
A. Mahmood
Yang Zhang
89
91
0
02 Jun 2021
On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study
Divyansh Kaushik
Douwe Kiela
Zachary Chase Lipton
Wen-tau Yih
AAML
72
35
0
02 Jun 2021
Claim Matching Beyond English to Scale Global Fact-Checking
Ashkan Kazemi
Kiran Garimella
Devin Gaffney
Scott A. Hale
77
60
0
01 Jun 2021
Comparing Test Sets with Item Response Theory
Clara Vania
Phu Mon Htut
William Huang
Dhara Mungra
Richard Yuanzhe Pang
Jason Phang
Haokun Liu
Kyunghyun Cho
Sam Bowman
77
43
0
01 Jun 2021
ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining
Alexander R. Fabbri
Faiaz Rahman
Imad Rizvi
Borui Wang
Haoran Li
Yashar Mehdad
Dragomir R. Radev
96
65
0
01 Jun 2021
Weighting vectors for machine learning: numerical harmonic analysis applied to boundary detection
Eric Bunch
Jeffery Kline
Dan Dickinson
Suhaas Bhat
G. Fung
51
8
0
01 Jun 2021
On using distributed representations of source code for the detection of C security vulnerabilities
D. Coimbra
Sofia Reis
Rui Abreu
Corina Păsăreanu
H. Erdogmus
70
19
0
01 Jun 2021
What Ingredients Make for an Effective Crowdsourcing Protocol for Difficult NLU Data Collection Tasks?
Nikita Nangia
Saku Sugawara
H. Trivedi
Alex Warstadt
Clara Vania
Sam Bowman
136
36
0
01 Jun 2021
DYPLOC: Dynamic Planning of Content Using Mixed Language Models for Text Generation
Xinyu Hua
Ashwin Sreevatsa
Lu Wang
43
23
0
01 Jun 2021