arXiv:1810.04805
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 23,885 papers shown
Title
How Does Adversarial Fine-Tuning Benefit BERT?
J. Ebrahimi
Hao Yang
Wei Zhang
AAML
63
4
0
31 Aug 2021
T3-Vis: a visual analytic framework for Training and fine-Tuning Transformers in NLP
Raymond Li
Wen Xiao
Lanjun Wang
Hyeju Jang
Giuseppe Carenini
ViT
90
23
0
31 Aug 2021
AraT5: Text-to-Text Transformers for Arabic Language Generation
El Moatez Billah Nagoudi
AbdelRahim Elmadany
Muhammad Abdul-Mageed
144
126
0
31 Aug 2021
Towards Consistent Document-level Entity Linking: Joint Models for Entity Linking and Coreference Resolution
Klim Zaporojets
Johannes Deleu
Yiwei Jiang
Thomas Demeester
Chris Develder
148
10
0
30 Aug 2021
Want To Reduce Labeling Cost? GPT-3 Can Help
Shuohang Wang
Yang Liu
Yichong Xu
Chenguang Zhu
Michael Zeng
95
257
0
30 Aug 2021
Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback
HongChien Yu
Chenyan Xiong
Jamie Callan
RALM
71
71
0
30 Aug 2021
DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion
Wei Niu
Jiexiong Guan
Yanzhi Wang
G. Agrawal
Bin Ren
AI4CE
85
153
0
30 Aug 2021
N24News: A New Dataset for Multimodal News Classification
Zhen Wang
Xu Shan
Xiangxie Zhang
Jie Yang
VLM
108
38
0
30 Aug 2021
FedKD: Communication Efficient Federated Learning via Knowledge Distillation
Chuhan Wu
Fangzhao Wu
Lingjuan Lyu
Yongfeng Huang
Xing Xie
FedML
113
394
0
30 Aug 2021
AEDA: An Easier Data Augmentation Technique for Text Classification
Akbar Karimi
L. Rossi
Andrea Prati
98
158
0
30 Aug 2021
Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners
Ningyu Zhang
Luoqiu Li
Xiang Chen
Shumin Deng
Zhen Bi
Chuanqi Tan
Fei Huang
Huajun Chen
VLM
158
180
0
30 Aug 2021
CSDS: A Fine-Grained Chinese Dataset for Customer Service Dialogue Summarization
Haitao Lin
Liqun Ma
Junnan Zhu
Lu Xiang
Yu Zhou
Jiajun Zhang
Chengqing Zong
118
47
0
30 Aug 2021
Neuron-level Interpretation of Deep NLP Models: A Survey
Hassan Sajjad
Nadir Durrani
Fahim Dalvi
MILM
AI4CE
133
85
0
30 Aug 2021
Factual Consistency Evaluation for Text Summarization via Counterfactual Estimation
Yuexiang Xie
Fei Sun
Yang Deng
Yaliang Li
Bolin Ding
HILM
115
54
0
30 Aug 2021
GeoVectors: A Linked Open Corpus of OpenStreetMap Embeddings on World Scale
Nicolas Tempelmeier
Simon Gottschalk
Elena Demidova
71
15
0
30 Aug 2021
ASR-GLUE: A New Multi-task Benchmark for ASR-Robust Natural Language Understanding
Lingyun Feng
Jianwei Yu
Deng Cai
Songxiang Liu
Haitao Zheng
Yan Wang
ELM
186
14
0
30 Aug 2021
A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP
Yucheng Zhao
Guangting Wang
Chuanxin Tang
Chong Luo
Wenjun Zeng
Zhengjun Zha
93
69
0
30 Aug 2021
LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text Understanding and Generation
Jian Guan
Zhuoer Feng
Yamei Chen
Ru He
Xiaoxi Mao
Changjie Fan
Minlie Huang
120
33
0
30 Aug 2021
Selective Differential Privacy for Language Modeling
Weiyan Shi
Aiqi Cui
Evan Li
R. Jia
Zhou Yu
94
73
0
30 Aug 2021
RetroGAN: A Cyclic Post-Specialization System for Improving Out-of-Knowledge and Rare Word Representations
Pedro Colon-Hernandez
Yida Xin
H. Lieberman
Catherine Havasi
C. Breazeal
Peter Chin
GAN
KELM
54
3
0
30 Aug 2021
Fine-Grained Chemical Entity Typing with Multimodal Knowledge Representation
Chenkai Sun
Weijian Li
Jinfeng Xiao
Nikolaus Nova Parulian
ChengXiang Zhai
Heng Ji
88
4
0
29 Aug 2021
A Multimodal Framework for Video Ads Understanding
Zejia Weng
Lingjiang Meng
Rui Wang
Zuxuan Wu
Yu-Gang Jiang
52
1
0
29 Aug 2021
Span Fine-tuning for Pre-trained Language Models
Rongzhou Bao
Zhuosheng Zhang
Hai Zhao
68
2
0
29 Aug 2021
Analyzing and Mitigating Interference in Neural Architecture Search
Jin Xu
Xu Tan
Kaitao Song
Renqian Luo
Yichong Leng
Tao Qin
Tie-Yan Liu
Jian Li
MoMe
99
29
0
29 Aug 2021
Interpretable Propaganda Detection in News Articles
Seunghak Yu
Giovanni Da San Martino
Mitra Mohtarami
James R. Glass
Preslav Nakov
72
31
0
29 Aug 2021
Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution
Zongyi Li
Jianhan Xu
Jiehang Zeng
Linyang Li
Xiaoqing Zheng
Qi Zhang
Kai-Wei Chang
Cho-Jui Hsieh
AAML
57
74
0
29 Aug 2021
Sentence Structure and Word Relationship Modeling for Emphasis Selection
Haoran Yang
Wai Lam
40
0
0
29 Aug 2021
Oh My Mistake!: Toward Realistic Dialogue State Tracking including Turnback Utterances
Takyoung Kim
Yukyung Lee
Hoonsang Yoon
Pilsung Kang
Junseong Bang
Misuk Kim
94
3
0
28 Aug 2021
GroupFormer: Group Activity Recognition with Clustered Spatial-Temporal Transformer
Shuaicheng Li
Qianggang Cao
Lingbo Liu
Kunlin Yang
Shinan Liu
Jun Hou
Shuai Yi
ViT
99
106
0
28 Aug 2021
WALNUT: A Benchmark on Semi-weakly Supervised Learning for Natural Language Understanding
Guoqing Zheng
Giannis Karamanolakis
Kai Shu
Ahmed Hassan Awadallah
SSL
66
1
0
28 Aug 2021
Mitigation of Diachronic Bias in Fake News Detection Dataset
Taichi Murayama
Shoko Wakamiya
Eiji Aramaki
AI4CE
104
13
0
28 Aug 2021
Smoothing Dialogue States for Open Conversational Machine Reading
Zhuosheng Zhang
Siru Ouyang
Hai Zhao
Masao Utiyama
Eiichiro Sumita
86
6
0
28 Aug 2021
Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems
Fei Mi
Wanhao Zhou
Feng Cai
Lingjing Kong
Minlie Huang
Boi Faltings
105
32
0
28 Aug 2021
On the Significance of Question Encoder Sequence Model in the Out-of-Distribution Performance in Visual Question Answering
K. Gouthaman
Anurag Mittal
CML
89
0
0
28 Aug 2021
AMMASurv: Asymmetrical Multi-Modal Attention for Accurate Survival Analysis with Whole Slide Images and Gene Expression Data
Ruoqi Wang
Ziwang Huang
Haitao Wang
Hejun Wu
115
7
0
28 Aug 2021
TweetBLM: A Hate Speech Dataset and Analysis of Black Lives Matter-related Microblogs on Twitter
Sumit Kumar
Raj Ratn Pranesh
65
18
0
27 Aug 2021
Learning Inner-Group Relations on Point Clouds
Haoxi Ran
Wei Zhuo
Jing Liu
Li Lu
3DPC
111
61
0
27 Aug 2021
Code-switched inspired losses for generic spoken dialog representations
E. Chapuis
Pierre Colombo
Matthieu Labeau
Chloé Clavel
179
12
0
27 Aug 2021
Automatic Text Evaluation through the Lens of Wasserstein Barycenters
Pierre Colombo
Guillaume Staerman
Chloé Clavel
Pablo Piantanida
212
41
0
27 Aug 2021
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
355
779
0
27 Aug 2021
DomiKnowS: A Library for Integration of Symbolic Domain Knowledge in Deep Learning
Hossein Rajaby Faghihi
Quan Guo
Andrzej Uszok
Aliakbar Nafar
Elaheh Raisi
Parisa Kordjamshidi
AI4CE
66
18
0
27 Aug 2021
Contrastive Mixup: Self- and Semi-Supervised learning for Tabular Domain
Sajad Darabi
Shayan Fazeli
Ali Pazoki
S. Sankararaman
Majid Sarrafzadeh
SSL
72
29
0
27 Aug 2021
Evaluating the Robustness of Neural Language Models to Input Perturbations
M. Moradi
Matthias Samwald
AAML
103
102
0
27 Aug 2021
Exploring the Capacity of a Large-scale Masked Language Model to Recognize Grammatical Errors
Ryo Nagata
Manabu Kimura
Kazuaki Hanawa
37
5
0
27 Aug 2021
A Partition Filter Network for Joint Entity and Relation Extraction
Zhiheng Yan
Chong Zhang
Jinlan Fu
Qi Zhang
Zhongyu Wei
120
140
0
27 Aug 2021
Translation Error Detection as Rationale Extraction
M. Fomicheva
Lucia Specia
Nikolaos Aletras
107
25
0
27 Aug 2021
Query-Focused Extractive Summarisation for Finding Ideal Answers to Biomedical and COVID-19 Questions
Diego Mollá Aliod
Urvashi Khanna
Dima Galat
Vincent Nguyen
Maciej Rybiński
RALM
86
2
0
27 Aug 2021
Offensive Language Identification in Low-resourced Code-mixed Dravidian languages using Pseudo-labeling
Adeep Hande
Karthik Puranik
Konthala Yasaswini
R. Priyadharshini
Sajeetha Thavareesan
Anbukkarasi Sampath
Kogilavani Shanmugavadivel
D. Thenmozhi
Bharathi Raja Chakravarthi
93
29
0
27 Aug 2021
Lyra: A Benchmark for Turducken-Style Code Generation
Qingyuan Liang
Zeyu Sun
Qihao Zhu
Wenjie Zhang
Lian Yu
Yingfei Xiong
Lu Zhang
64
13
0
27 Aug 2021
Reinforcement Learning-powered Semantic Communication via Semantic Similarity
Kun Lu
Rongpeng Li
Xianfu Chen
Zhifeng Zhao
Honggang Zhang
55
54
0
27 Aug 2021