1909.11942
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
Github (3271★)
Papers citing
"ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"
50 / 2,935 papers shown
Title
On the Interplay Between Fine-tuning and Sentence-level Probing for Linguistic Knowledge in Pre-trained Transformers
Marius Mosbach
A. Khokhlova
Michael A. Hedderich
Dietrich Klakow
54
46
0
06 Oct 2020
LEGAL-BERT: The Muppets straight out of Law School
Ilias Chalkidis
Manos Fergadiotis
Prodromos Malakasiotis
Nikolaos Aletras
Ion Androutsopoulos
AILaw
77
264
0
06 Oct 2020
GRUEN for Evaluating Linguistic Quality of Generated Text
Wanzheng Zhu
S. Bhat
122
61
0
06 Oct 2020
Pretrained Language Model Embryology: The Birth of ALBERT
Cheng-Han Chiang
Sung-Feng Huang
Hung-yi Lee
69
42
0
06 Oct 2020
KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation
Wenhu Chen
Yu-Chuan Su
Xifeng Yan
Wenjie Wang
VLM
139
22
0
05 Oct 2020
Pareto Probing: Trading Off Accuracy for Complexity
Tiago Pimentel
Naomi Saphra
Adina Williams
Ryan Cotterell
92
60
0
05 Oct 2020
Second-Order NLP Adversarial Examples
John X. Morris
AAML
50
0
0
05 Oct 2020
LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention
Ikuya Yamada
Akari Asai
Hiroyuki Shindo
Hideaki Takeda
Yuji Matsumoto
135
676
0
02 Oct 2020
Which *BERT? A Survey Organizing Contextualized Encoders
Patrick Xia
Shijie Wu
Benjamin Van Durme
62
50
0
02 Oct 2020
Data Transfer Approaches to Improve Seq-to-Seq Retrosynthesis
Katsuhiko Ishiguro
K. Ujihara
R. Sawada
Hirotaka Akita
Masaaki Kotera
113
6
0
02 Oct 2020
Phonemer at WNUT-2020 Task 2: Sequence Classification Using COVID Twitter BERT and Bagging Ensemble Technique based on Plurality Voting
Anshul Wadhawan
43
7
0
01 Oct 2020
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models
Nikita Nangia
Clara Vania
Rasika Bhalerao
Samuel R. Bowman
161
690
0
30 Sep 2020
NatCat: Weakly Supervised Text Classification with Naturally Annotated Resources
Zewei Chu
K. Stratos
Kevin Gimpel
31
5
0
29 Sep 2020
Attention that does not Explain Away
Nan Ding
Xinjie Fan
Zhenzhong Lan
Dale Schuurmans
Radu Soricut
54
3
0
29 Sep 2020
Contrastive Distillation on Intermediate Representations for Language Model Compression
S. Sun
Zhe Gan
Yu Cheng
Yuwei Fang
Shuohang Wang
Jingjing Liu
VLM
78
73
0
29 Sep 2020
Improve Transformer Models with Better Relative Position Embeddings
Zhiheng Huang
Davis Liang
Peng Xu
Bing Xiang
ViT
79
132
0
28 Sep 2020
What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams
Di Jin
Eileen Pan
Nassim Oufattole
W. Weng
Hanyi Fang
Peter Szolovits
FaML
ELM
LM&MA
132
817
0
28 Sep 2020
TernaryBERT: Distillation-aware Ultra-low Bit BERT
Wei Zhang
Lu Hou
Yichun Yin
Lifeng Shang
Xiao Chen
Xin Jiang
Qun Liu
MQ
101
211
0
27 Sep 2020
Topic-Aware Multi-turn Dialogue Modeling
Yi Xu
Hai Zhao
Zhuosheng Zhang
91
79
0
26 Sep 2020
BET: A Backtranslation Approach for Easy Data Augmentation in Transformer-based Paraphrase Identification Context
Jean-Philippe Corbeil
Hadi Abdi Ghadivel
40
28
0
25 Sep 2020
Attention Meets Perturbations: Robust and Interpretable Attention with Adversarial Training
Shunsuke Kitada
Hitoshi Iyatomi
OOD
AAML
49
26
0
25 Sep 2020
AnchiBERT: A Pre-Trained Model for Ancient Chinese Language Understanding and Generation
Huishuang Tian
Kexin Yang
Dayiheng Liu
Jiancheng Lv
67
31
0
24 Sep 2020
Hierarchical Pre-training for Sequence Labelling in Spoken Dialog
E. Chapuis
Pierre Colombo
Matteo Manica
Matthieu Labeau
Chloé Clavel
170
59
0
23 Sep 2020
Hamming OCR: A Locality Sensitive Hashing Neural Network for Scene Text Recognition
Bing Li
Xin Tang
Xianbiao Qi
Yihao Chen
Rong Xiao
54
8
0
23 Sep 2020
Global-to-Local Neural Networks for Document-Level Relation Extraction
D. Wang
Wei Hu
E. Cao
Weijian Sun
NAI
90
122
0
22 Sep 2020
Preserving Integrity in Online Social Networks
A. Halevy
Cristian Canton Ferrer
Hao Ma
Umut Ozertem
Patrick Pantel
Marzieh Saeidi
Fabrizio Silvestri
Ves Stoyanov
72
59
0
22 Sep 2020
BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition
Usman Naseem
Matloob Khushi
V. Reddy
S. Rajendran
Imran Razzak
Jinman Kim
58
63
0
19 Sep 2020
Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations
Yuan Zang
Bairu Hou
Fanchao Qi
Zhiyuan Liu
Xiaojun Meng
Maosong Sun
60
11
0
19 Sep 2020
The birth of Romanian BERT
Stefan Daniel Dumitrescu
Andrei-Marius Avram
S. Pyysalo
VLM
63
78
0
18 Sep 2020
A Multimodal Memes Classification: A Survey and Open Research Issues
Tariq Habib Afridi
A. Alam
Muhammad Numan Khan
Jawad Khan
Young-Koo Lee
55
41
0
17 Sep 2020
ISCAS at SemEval-2020 Task 5: Pre-trained Transformers for Counterfactual Statement Modeling
Yaojie Lu
Annan Li
Hongyu Lin
Xianpei Han
Le Sun
18
5
0
17 Sep 2020
Efficient Transformer-based Large Scale Language Representations using Hardware-friendly Block Structured Pruning
Bingbing Li
Zhenglun Kong
Tianyun Zhang
Ji Li
Zechao Li
Hang Liu
Caiwen Ding
VLM
192
65
0
17 Sep 2020
Answering Any-hop Open-domain Questions with Iterative Document Reranking
Ping Nie
Yuyu Zhang
Arun Ramamurthy
Le Song
70
20
0
16 Sep 2020
Question Directed Graph Attention Network for Numerical Reasoning over Text
Kunlong Chen
Weidi Xu
Xingyi Cheng
Zou Xiaochuan
Yuyu Zhang
Le Song
Taifeng Wang
Yuan Qi
Wei Chu
AIMat
OOD
85
67
0
16 Sep 2020
Multi-span Style Extraction for Generative Reading Comprehension
Junjie Yang
Zhuosheng Zhang
Hai Zhao
SyDa
51
14
0
15 Sep 2020
It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
Timo Schick
Hinrich Schütze
200
979
0
15 Sep 2020
MLMLM: Link Prediction with Mean Likelihood Masked Language Model
Louis Clouâtre
P. Trempe
Payel Das
Sarath Chandar
112
44
0
15 Sep 2020
Efficient Transformers: A Survey
Yi Tay
Mostafa Dehghani
Dara Bahri
Donald Metzler
VLM
227
1,133
0
14 Sep 2020
Cluster-Former: Clustering-based Sparse Transformer for Long-Range Dependency Encoding
Shuohang Wang
Luowei Zhou
Zhe Gan
Yen-Chun Chen
Yuwei Fang
S. Sun
Yu Cheng
Jingjing Liu
93
29
0
13 Sep 2020
DualDE: Dually Distilling Knowledge Graph Embedding for Faster and Cheaper Reasoning
Yushan Zhu
Wen Zhang
Yin Hua
Hui Chen
Xu-Xin Cheng
Wei Zhang
Huajun Chen
89
28
0
13 Sep 2020
Syntax Role for Neural Semantic Role Labeling
Z. Li
Hai Zhao
Shexia He
Jiaxun Cai
NAI
65
19
0
12 Sep 2020
Compressed Deep Networks: Goodbye SVD, Hello Robust Low-Rank Approximation
M. Tukan
Alaa Maalouf
Matan Weksler
Dan Feldman
77
9
0
11 Sep 2020
UPB at SemEval-2020 Task 6: Pretrained Language Models for Definition Extraction
Andrei-Marius Avram
Dumitru-Clementin Cercel
Costin-Gabriel Chiru
31
7
0
11 Sep 2020
Semantic Relations and Deep Learning
Vivi Nastase
Stan Szpakowicz
GNN
41
0
0
11 Sep 2020
Rank over Class: The Untapped Potential of Ranking in Natural Language Processing
Amir Atapour-Abarghouei
Stephen Bonner
A. Mcgough
55
4
0
10 Sep 2020
Dialogue-adaptive Language Model Pre-training From Quality Estimation
Junlong Li
Zhuosheng Zhang
Hai Zhao
OffRL
68
12
0
10 Sep 2020
Modern Methods for Text Generation
Dimas Muñoz-Montesinos
24
5
0
10 Sep 2020
Learning Universal Representations from Word to Sentence
Yian Li
Hai Zhao
SSL
36
2
0
10 Sep 2020
Comparative Study of Language Models on Cross-Domain Data with Model Agnostic Explainability
Mayank Chhipa
Hrushikesh Mahesh Vazurkar
Abhijeet Kumar
Mridul Mishra
30
0
0
09 Sep 2020
ERNIE at SemEval-2020 Task 10: Learning Word Emphasis Selection by Pre-trained Language Model
Zhengjie Huang
Shikun Feng
Weiyue Su
Xuyi Chen
Shuohuan Wang
Jiaxiang Liu
Ouyang Xuan
Yu Sun
48
8
0
08 Sep 2020