1909.11942
Cited By
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
Github (3271★)
Papers citing
"ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"
50 / 2,935 papers shown
Title
When to Fold'em: How to answer Unanswerable questions
Marshall Ho
Zhipeng Zhou
J. He
55
2
0
01 May 2021
Adversarial Example Detection for DNN Models: A Review and Experimental Comparison
Ahmed Aldahdooh
W. Hamidouche
Sid Ahmed Fezza
Olivier Déforges
AAML
239
128
0
01 May 2021
Using Transformers to Provide Teachers with Personalized Feedback on their Classroom Discourse: The TalkMoves Application
Abhijit Suresh
Jennifer Jacobs
Vivian Lai
Chenhao Tan
Wayne H. Ward
James H. Martin
T. Sumner
46
30
0
29 Apr 2021
MOROCCO: Model Resource Comparison Framework
Valentin Malykh
Alexander Kukushkin
Ekaterina Artemova
Vladislav Mikhailov
Maria Tikhonova
Tatiana Shavrina
55
0
0
29 Apr 2021
Teaching a Massive Open Online Course on Natural Language Processing
Ekaterina Artemova
M. Apishev
V. Sarkisyan
Sergey Aksenov
D. Kirjanov
O. Serikov
VLM
19
4
0
26 Apr 2021
Extract then Distill: Efficient and Effective Task-Agnostic BERT Distillation
Cheng Chen
Yichun Yin
Lifeng Shang
Zhi Wang
Xin Jiang
Xiao Chen
Qun Liu
FedML
78
7
0
24 Apr 2021
Learning to Learn to be Right for the Right Reasons
Pride Kavumba
Benjamin Heinzerling
Ana Brassard
Kentaro Inui
OOD
ReLM
LRM
49
4
0
23 Apr 2021
Transfer training from smaller language model
Han Zhang
54
0
0
23 Apr 2021
Improving BERT Pretraining with Syntactic Supervision
Georgios Tziafas
Konstantinos Kogkalidis
G. Wijnholds
M. Moortgat
76
4
0
21 Apr 2021
On the Impact of Word Error Rate on Acoustic-Linguistic Speech Emotion Recognition: An Update for the Deep Learning Era
Shahin Amiriparian
Artem Sokolov
Ilhan Aslan
Lukas Christ
Maurice Gerczuk
...
M. Milling
Sandra Ottl
Ilya Poduremennykh
E. Shuranov
Björn W. Schuller
66
17
0
20 Apr 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
368
2,550
0
20 Apr 2021
WASSA@IITK at WASSA 2021: Multi-task Learning and Transformer Finetuning for Emotion Classification and Empathy Prediction
Jay Mundra
Rohan Gupta
Sagnik Mukherjee
42
14
0
20 Apr 2021
Efficient pre-training objectives for Transformers
Luca Di Liello
Matteo Gabburo
Alessandro Moschitti
32
15
0
20 Apr 2021
NewsEdits: A Dataset of Revision Histories for News Articles (Technical Report: Data Processing)
Alexander Spangher
Jonathan May
KELM
22
4
0
19 Apr 2021
Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training
Chenyi Lei
Shixian Luo
Yong Liu
Wanggui He
Jiamang Wang
Guoxin Wang
Haihong Tang
Chunyan Miao
Houqiang Li
60
42
0
19 Apr 2021
Consistent Accelerated Inference via Confident Adaptive Transformers
Tal Schuster
Adam Fisch
Tommi Jaakkola
Regina Barzilay
AI4TS
255
73
0
18 Apr 2021
SalKG: Learning From Knowledge Graph Explanations for Commonsense Reasoning
Aaron Chan
Lyne Tchapmi
Bo Long
Soumya Sanyal
Tanishq Gupta
Xiang Ren
ReLM
LRM
110
11
0
18 Apr 2021
A Simple and Effective Positional Encoding for Transformers
Pu-Chin Chen
Henry Tsai
Srinadh Bhojanapalli
Hyung Won Chung
Yin-Wen Chang
Chun-Sung Ferng
120
66
0
18 Apr 2021
Self-Supervised Pillar Motion Learning for Autonomous Driving
Chenxu Luo
Xiaodong Yang
Alan Yuille
SSL
3DPC
60
66
0
18 Apr 2021
Worst of Both Worlds: Biases Compound in Pre-trained Vision-and-Language Models
Tejas Srinivasan
Yonatan Bisk
VLM
83
56
0
18 Apr 2021
Competency Problems: On Finding and Removing Artifacts in Language Data
Matt Gardner
William Merrill
Jesse Dodge
Matthew E. Peters
Alexis Ross
Sameer Singh
Noah A. Smith
251
111
0
17 Apr 2021
Vision Transformer Pruning
Mingjian Zhu
Yehui Tang
Kai Han
ViT
95
92
0
17 Apr 2021
Neural Path Hunter: Reducing Hallucination in Dialogue Systems via Path Grounding
Nouha Dziri
Andrea Madotto
Osmar Zaiane
A. Bose
HILM
94
137
0
17 Apr 2021
Three-level Hierarchical Transformer Networks for Long-sequence and Multiple Clinical Documents Classification
Yuqi Si
Kirk Roberts
84
9
0
17 Apr 2021
A Graph-guided Multi-round Retrieval Method for Conversational Open-domain Question Answering
Chak Tou Leong
Wenjie Li
Liqiang Nie
RALM
37
10
0
17 Apr 2021
Capturing Row and Column Semantics in Transformer Based Question Answering over Tables
Michael R. Glass
Mustafa Canim
A. Gliozzo
Saneem A. Chemmengath
Vishwajeet Kumar
Rishav Chakravarti
Avirup Sil
FeiFei Pan
Samarth Bharadwaj
Nicolas Rodolfo Fauceglia
LMTD
90
54
0
16 Apr 2021
AMMU: A Survey of Transformer-based Biomedical Pretrained Language Models
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
LM&MA
MedIm
115
170
0
16 Apr 2021
TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval
Ioana Croitoru
Simion-Vlad Bogolin
Marius Leordeanu
Hailin Jin
Andrew Zisserman
Samuel Albanie
Yang Liu
VGen
67
125
0
16 Apr 2021
Condenser: a Pre-training Architecture for Dense Retrieval
Luyu Gao
Jamie Callan
AI4CE
61
267
0
16 Apr 2021
$Q^{2}$: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering
Or Honovich
Leshem Choshen
Roee Aharoni
Ella Neeman
Idan Szpektor
Omri Abend
HILM
104
143
0
16 Apr 2021
Back to Square One: Artifact Detection, Training and Commonsense Disentanglement in the Winograd Schema
Yanai Elazar
Hongming Zhang
Yoav Goldberg
Dan Roth
ReLM
LRM
128
44
0
16 Apr 2021
Gradient-based Adversarial Attacks against Text Transformers
Chuan Guo
Alexandre Sablayrolles
Hervé Jégou
Douwe Kiela
SILM
165
248
0
15 Apr 2021
Syntactic Perturbations Reveal Representational Correlates of Hierarchical Phrase Structure in Pretrained Language Models
Matteo Alleman
J. Mamou
Miguel Rio
Hanlin Tang
Yoon Kim
SueYeon Chung
NAI
100
17
0
15 Apr 2021
Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models
Karolina Stańczak
Sagnik Ray Choudhury
Tiago Pimentel
Ryan Cotterell
Isabelle Augenstein
84
24
0
15 Apr 2021
Unmasking the Mask -- Evaluating Social Biases in Masked Language Models
Masahiro Kaneko
Danushka Bollegala
66
72
0
15 Apr 2021
TransferNet: An Effective and Transparent Framework for Multi-hop Question Answering over Relation Graph
Jiaxin Shi
S. Cao
Lei Hou
Juan-Zi Li
Hanwang Zhang
GNN
90
112
0
15 Apr 2021
Text Guide: Improving the quality of long text classification by a text selection method based on feature importance
K. Fiok
W. Karwowski
Edgar Gutierrez-Franco
Mohammad Reza Davahli
Maciej Wilamowski
T. Ahram
Awad M. Aljuaid
Jozef Zurada
VLM
52
34
0
15 Apr 2021
Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese Pre-trained Language Models
Yuxuan Lai
Yijia Liu
Yansong Feng
Songfang Huang
Dongyan Zhao
VLM
AI4CE
70
38
0
15 Apr 2021
TWEAC: Transformer with Extendable QA Agent Classifiers
Gregor Geigle
Nils Reimers
Andreas Rucklé
Iryna Gurevych
ViT
148
27
0
14 Apr 2021
The Surprising Performance of Simple Baselines for Misinformation Detection
Kellin Pelrine
Jacob Danovitch
Reihaneh Rabbany
74
65
0
14 Apr 2021
I Wish I Would Have Loved This One, But I Didn't -- A Multilingual Dataset for Counterfactual Detection in Product Reviews
James O'Neill
Polina Rozenshtein
Ryuichi Kiryo
Motoko Kubota
Danushka Bollegala
76
32
0
14 Apr 2021
AR-LSAT: Investigating Analytical Reasoning of Text
Wanjun Zhong
Siyuan Wang
Duyu Tang
Zenan Xu
Daya Guo
Jiahai Wang
Jian Yin
Ming Zhou
Nan Duan
ELM
137
44
0
14 Apr 2021
Demystifying BERT: Implications for Accelerator Design
Suchita Pati
Shaizeen Aga
Nuwan Jayasena
Matthew D. Sinclair
LLMAG
88
17
0
14 Apr 2021
QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering
Michihiro Yasunaga
Hongyu Ren
Antoine Bosselut
Percy Liang
J. Leskovec
RALM
LMTD
AI4MH
LRM
92
599
0
13 Apr 2021
Lessons on Parameter Sharing across Layers in Transformers
Sho Takase
Shun Kiyono
105
87
0
13 Apr 2021
Restoring and Mining the Records of the Joseon Dynasty via Neural Language Modeling and Machine Translation
Kyeongpil Kang
Kyohoon Jin
Soyoung Yang
Show-Ling Jang
Jaegul Choo
Youngbin Kim
MU
112
18
0
13 Apr 2021
Semantic maps and metrics for science using deep transformer encoders
Brendan Chambers
James A. Evans
MedIm
46
0
0
13 Apr 2021
Discourse Probing of Pretrained Language Models
Fajri Koto
Jey Han Lau
Tim Baldwin
78
53
0
13 Apr 2021
Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews
M. Hadi
Fatemeh H. Fard
61
33
0
12 Apr 2021
SpartQA: A Textual Question Answering Benchmark for Spatial Reasoning
Roshanak Mirzaee
Hossein Rajaby Faghihi
Qiang Ning
Parisa Kordjamshidi
56
83
0
12 Apr 2021