1810.04805
Cited By
v1
v2 (latest)
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 23,708 papers shown
Title
Contradiction Detection in Persian Text
Zeinab Rahimi
M. Shamsfard
51
6
0
05 Jul 2021
DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling
Lanqing Xue
Kaitao Song
Duocai Wu
Xu Tan
N. Zhang
Tao Qin
Weiqiang Zhang
Tie-Yan Liu
88
38
0
05 Jul 2021
Doing Good or Doing Right? Exploring the Weakness of Commonsense Causal Reasoning Models
Mingyue Han
Yinglin Wang
LRM
75
11
0
05 Jul 2021
KAISA: An Adaptive Second-Order Optimizer Framework for Deep Neural Networks
J. G. Pauloski
Qi Huang
Lei Huang
Shivaram Venkataraman
Kyle Chard
Ian Foster
Zhao-jie Zhang
86
29
0
04 Jul 2021
End-to-end Neural Coreference Resolution Revisited: A Simple yet Effective Baseline
T. Lai
Trung Bui
Doo Soon Kim
75
12
0
04 Jul 2021
Coarse-to-Careful: Seeking Semantic-related Knowledge for Open-domain Commonsense Question Answering
Luxi Xing
Yue Hu
Jing Yu
Yuqiang Xie
Wei Peng
31
0
0
04 Jul 2021
CasEE: A Joint Learning Framework with Cascade Decoding for Overlapping Event Extraction
Shuaiyi Nie
Shu Guo
Yu Bowen
Qian Li
Yiming Hei
Lihong Wang
Tingwen Liu
Hongbo Xu
38
61
0
04 Jul 2021
Audio-Oriented Multimodal Machine Comprehension: Task, Dataset and Model
Zhiqi Huang
Fenglin Liu
Xian Wu
Shen Ge
Helin Wang
Wei Fan
Yuexian Zou
AuLLM
59
2
0
04 Jul 2021
BAGUA: Scaling up Distributed Learning with System Relaxations
Shaoduo Gan
Xiangru Lian
Rui Wang
Jianbin Chang
Chengjun Liu
...
Jiawei Jiang
Binhang Yuan
Sen Yang
Ji Liu
Ce Zhang
98
30
0
03 Jul 2021
Neural-Symbolic Solver for Math Word Problems with Auxiliary Tasks
Jinghui Qin
Xiaodan Liang
Yining Hong
Jianheng Tang
Liang Lin
AIMat
AAML
105
57
0
03 Jul 2021
TagRec: Automated Tagging of Questions with Hierarchical Learning Taxonomy
Venktesh V
Mukesh Mohania
Vikram Goyal
45
7
0
03 Jul 2021
Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation
Zhiwei Hao
Jianyuan Guo
Ding Jia
Kai Han
Yehui Tang
Chao Zhang
Dacheng Tao
Yunhe Wang
ViT
149
73
0
03 Jul 2021
Solving Machine Learning Problems
Sunny Tran
P. Krishna
Ishan Pakuwal
Prabhakar Kafle
Nikhil Singh
J. Lynch
Iddo Drori
VLM
120
11
0
02 Jul 2021
Language Identification of Hindi-English tweets using code-mixed BERT
M. Z. Ansari
M. Beg
Tanvir Ahmad
Mohd Jazib Khan
Ghazali Wasim
67
14
0
02 Jul 2021
Multimodal Representation for Neural Code Search
Jian Gu
Zimin Chen
Martin Monperrus
78
43
0
02 Jul 2021
Ultrasound Video Transformers for Cardiac Ejection Fraction Estimation
Hadrien Reynaud
Athanasios Vlontzos
Benjamin Hou
A. Beqiri
Paul Leeson
Bernhard Kainz
MedIm
ViT
81
59
0
02 Jul 2021
R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling
Xiang Hu
Haitao Mi
Zujie Wen
Yafang Wang
Yi Su
Jing Zheng
Gerard de Melo
77
23
0
02 Jul 2021
ResIST: Layer-Wise Decomposition of ResNets for Distributed Training
Chen Dun
Cameron R. Wolfe
C. Jermaine
Anastasios Kyrillidis
95
21
0
02 Jul 2021
Online Metro Origin-Destination Prediction via Heterogeneous Information Aggregation
Lingbo Liu
Yuying Zhu
Guanbin Li
Ziyi Wu
Lei Bai
Liang Lin
AI4TS
111
37
0
02 Jul 2021
Learned Token Pruning for Transformers
Sehoon Kim
Sheng Shen
D. Thorsley
A. Gholami
Woosuk Kwon
Joseph Hassoun
Kurt Keutzer
91
157
0
02 Jul 2021
The Spotlight: A General Method for Discovering Systematic Errors in Deep Learning Models
G. d'Eon
Jason d'Eon
J. R. Wright
Kevin Leyton-Brown
86
76
0
01 Jul 2021
An Investigation of the (In)effectiveness of Counterfactually Augmented Data
Nitish Joshi
He He
OODD
105
47
0
01 Jul 2021
Neural Task Success Classifiers for Robotic Manipulation from Few Real Demonstrations
A. Mohtasib
Amir Ghalamzan
Nicola Bellotto
Heriberto Cuayáhuitl
56
1
0
01 Jul 2021
A Primer on Pretrained Multilingual Language Models
Sumanth Doddapaneni
Gowtham Ramesh
Mitesh M. Khapra
Anoop Kunchukuttan
Pratyush Kumar
LRM
123
76
0
01 Jul 2021
AutoFormer: Searching Transformers for Visual Recognition
Minghao Chen
Houwen Peng
Jianlong Fu
Haibin Ling
ViT
117
268
0
01 Jul 2021
Focal Self-attention for Local-Global Interactions in Vision Transformers
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
129
437
0
01 Jul 2021
ESPnet-ST IWSLT 2021 Offline Speech Translation System
Hirofumi Inaguma
Shun Kiyono
Nelson Enrique Yalta Soplin
Pengcheng Guo
Jun Suzuki
Kevin Duh
Shinji Watanabe
3DV
74
2
0
01 Jul 2021
Action Transformer: A Self-Attention Model for Short-Time Pose-Based Human Action Recognition
Vittorio Mazzia
Simone Angarano
Francesco Salvetti
Federico Angelini
Marcello Chiaberge
ViT
129
143
0
01 Jul 2021
Productivity, Portability, Performance: Data-Centric Python
Yiheng Wang
Yao Zhang
Yanzhang Wang
Yan Wan
Jiao Wang
Zhongyuan Wu
Yuhao Yang
Bowen She
175
101
0
01 Jul 2021
Improving Human Motion Prediction Through Continual Learning
M. S. Yasar
Tariq Iqbal
3DH
45
15
0
01 Jul 2021
Automatic Metadata Extraction Incorporating Visual Features from Scanned Electronic Theses and Dissertations
Muntabir Hasan Choudhury
Himarsha R. Jayanetti
Jian Wu
William A. Ingram
Edward A. Fox
38
10
0
01 Jul 2021
VideoLightFormer: Lightweight Action Recognition using Transformers
Raivo Koot
Haiping Lu
ViT
135
6
0
01 Jul 2021
CLINE: Contrastive Learning with Semantic Negative Examples for Natural Language Understanding
Dong Wang
Ning Ding
Pijian Li
Haitao Zheng
AAML
72
118
0
01 Jul 2021
MultiCite: Modeling realistic citations requires moving beyond the single-sentence single-label setting
Anne Lauscher
B. Ko
Bailey Kuehl
Sophie Johnson
David Jurgens
Arman Cohan
Kyle Lo
HAI
64
43
0
01 Jul 2021
Knowledge Distillation for Quality Estimation
Amit Gajbhiye
M. Fomicheva
Fernando Alva-Manchego
Frédéric Blain
A. Obamuyide
Nikolaos Aletras
Lucia Specia
86
11
0
01 Jul 2021
Ensemble Learning-Based Approach for Improving Generalization Capability of Machine Reading Comprehension Systems
Razieh Baradaran
Hossein Amirkhani
70
16
0
01 Jul 2021
Combining Feature and Instance Attribution to Detect Artifacts
Pouya Pezeshkpour
Sarthak Jain
Sameer Singh
Byron C. Wallace
TDI
135
42
0
01 Jul 2021
Leveraging Domain Agnostic and Specific Knowledge for Acronym Disambiguation
Qiwei Zhong
Guanxiong Zeng
Danqing Zhu
Yang Zhang
Wangli Lin
Ben Chen
Jiayu Tang
102
11
0
01 Jul 2021
Scientia Potentia Est -- On the Role of Knowledge in Computational Argumentation
Anne Lauscher
Henning Wachsmuth
Iryna Gurevych
Goran Glavaš
109
34
0
01 Jul 2021
AdaXpert: Adapting Neural Architecture for Growing Data
Shuaicheng Niu
Jiaxiang Wu
Guanghui Xu
Yifan Zhang
Yong Guo
P. Zhao
Peng Wang
Mingkui Tan
106
14
0
01 Jul 2021
OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation
Jing Liu
Xinxin Zhu
Fei Liu
Longteng Guo
Zijia Zhao
...
Weining Wang
Hanqing Lu
Shiyu Zhou
Jiajun Zhang
Jinqiao Wang
98
38
0
01 Jul 2021
Multi-modal Graph Learning for Disease Prediction
Shuai Zheng
Zhenfeng Zhu
Zhizhe Liu
Zhenyu Guo
Yang Liu
Yao Zhao
92
105
0
01 Jul 2021
Capturing Event Argument Interaction via A Bi-Directional Entity-Level Recurrent Decoder
Xiangyu Xi
Wei Ye
Shikun Zhang
Quanxiu Wang
Huixing Jiang
Wei Wu
73
24
0
01 Jul 2021
Elbert: Fast Albert with Confidence-Window Based Early Exit
Keli Xie
Siyuan Lu
Meiqi Wang
Zhongfeng Wang
60
20
0
01 Jul 2021
Cross-Lingual Transfer Learning for Statistical Type Inference
Zhiming Li
Xiaofei Xie
Haoliang Li
Yulong Shen
Yi Li
Yang Liu
108
2
0
01 Jul 2021
Attention Bottlenecks for Multimodal Fusion
Arsha Nagrani
Shan Yang
Anurag Arnab
A. Jansen
Cordelia Schmid
Chen Sun
173
576
0
30 Jun 2021
Saturated Transformers are Constant-Depth Threshold Circuits
William Merrill
Ashish Sabharwal
Noah A. Smith
127
107
0
30 Jun 2021
Early Risk Detection of Pathological Gambling, Self-Harm and Depression Using BERT
Ana-Maria Bucur
Adrian Cosma
Liviu P. Dinu
46
37
0
30 Jun 2021
Revisiting the Primacy of English in Zero-shot Cross-lingual Transfer
Iulia Turc
Kenton Lee
Jacob Eisenstein
Ming-Wei Chang
Kristina Toutanova
51
58
0
30 Jun 2021
The MultiBERTs: BERT Reproductions for Robustness Analysis
Thibault Sellam
Steve Yadlowsky
Jason W. Wei
Naomi Saphra
Alexander D'Amour
...
Iulia Turc
Jacob Eisenstein
Dipanjan Das
Ian Tenney
Ellie Pavlick
132
95
0
30 Jun 2021