1909.11942
Cited By
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
Github (3271★)
Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"
50 / 2,935 papers shown
Title
Optimizing Inference Performance of Transformers on CPUs
D. Dice
Alex Kogan
64
16
0
12 Feb 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei
Linjie Li
Luowei Zhou
Zhe Gan
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
CLIP
179
665
0
11 Feb 2021
Text Compression-aided Transformer Encoding
Z. Li
Zhuosheng Zhang
Hai Zhao
Rui Wang
Kehai Chen
Masao Utiyama
Eiichiro Sumita
AI4CE
71
45
0
11 Feb 2021
Customizing Contextualized Language Models for Legal Document Reviews
Shohreh Shaghaghian
Luna Feng
Borna Jafarpour
Nicolai Pogrebnyakov
AILaw
100
19
0
10 Feb 2021
Self-supervised learning for fast and scalable time series hyper-parameter tuning
Peiyi Zhang
Xiaodong Jiang
Ginger M Holt
N. Laptev
C. Komurlu
Peng Gao
Yang Yu
AI4TS
49
6
0
10 Feb 2021
Multi-turn Dialogue Reading Comprehension with Pivot Turns and Knowledge
Zhuosheng Zhang
Junlong Li
Hai Zhao
79
24
0
10 Feb 2021
User Engagement Prediction for Clarification in Search
Ivan Sekulić
Mohammad Aliannejadi
Fabio Crestani
58
25
0
08 Feb 2021
Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention
Yunyang Xiong
Zhanpeng Zeng
Rudrasis Chakraborty
Mingxing Tan
G. Fung
Yin Li
Vikas Singh
110
526
0
07 Feb 2021
Memory Augmented Sequential Paragraph Retrieval for Multi-hop Question Answering
Nan Shao
Yiming Cui
Ting Liu
Shijin Wang
Guoping Hu
KELM
50
5
0
07 Feb 2021
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Joey Tianyi Zhou
MLLM
392
546
0
04 Feb 2021
Learning to Select External Knowledge with Multi-Scale Negative Sampling
H. He
Hua Lu
Siqi Bao
Fan Wang
Hua Wu
Zhengyu Niu
Haifeng Wang
63
32
0
03 Feb 2021
AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning
Yuhan Liu
Saurabh Agarwal
Shivaram Venkataraman
OffRL
80
56
0
02 Feb 2021
Do Question Answering Modeling Improvements Hold Across Benchmarks?
Nelson F. Liu
Tony Lee
Robin Jia
Percy Liang
86
13
0
01 Feb 2021
Measuring and Improving Consistency in Pretrained Language Models
Yanai Elazar
Nora Kassner
Shauli Ravfogel
Abhilasha Ravichander
Eduard H. Hovy
Hinrich Schütze
Yoav Goldberg
HILM
337
371
0
01 Feb 2021
Scaling Federated Learning for Fine-tuning of Large Language Models
Agrin Hilmkil
Sebastian Callh
Matteo Barbieri
L. R. Sütfeld
Edvin Listo Zec
Olof Mogren
FedML
56
50
0
01 Feb 2021
Many Hands Make Light Work: Using Essay Traits to Automatically Score Essays
Rahul Kumar
Sandeep Albert Mathias
S. Saha
P. Bhattacharyya
70
30
0
01 Feb 2021
Speech Recognition by Simply Fine-tuning BERT
Wen-Chin Huang
Chia-Hua Wu
Shang-Bao Luo
Kuan-Yu Chen
Hsin-Min Wang
Tomoki Toda
117
28
0
30 Jan 2021
A transformer based approach for fighting COVID-19 fake news
S. M. S. Shifath
Mohammad Faiyaz Khan
Md. Saiful Islam
MedIm
63
23
0
28 Jan 2021
BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation
Jwala Dhamala
Tony Sun
Varun Kumar
Satyapriya Krishna
Yada Pruksachatkun
Kai-Wei Chang
Rahul Gupta
94
403
0
27 Jan 2021
KoreALBERT: Pretraining a Lite BERT Model for Korean Language Understanding
HyunJae Lee
Jaewoong Yoon
Bonggyu Hwang
Seongho Joe
Seungjai Min
Youngjune Gwon
SSeg
58
16
0
27 Jan 2021
Neural Sentence Ordering Based on Constraint Graphs
Yutao Zhu
Kun Zhou
J. Nie
Shengchao Liu
Zhicheng Dou
NAI
89
23
0
27 Jan 2021
A Case Study of Deep Learning Based Multi-Modal Methods for Predicting the Age-Suitability Rating of Movie Trailers
Mahsa Shafaei
C. Smailis
I. Kakadiaris
Thamar Solorio
401
1
0
26 Jan 2021
Evaluation of BERT and ALBERT Sentence Embedding Performance on Downstream NLP Tasks
Hyunjin Choi
Judong Kim
Seongho Joe
Youngjune Gwon
SSeg
79
104
0
26 Jan 2021
Stereotype and Skew: Quantifying Gender Bias in Pre-trained and Fine-tuned Language Models
Daniel de Vassimon Manela
D. Errington
Thomas Fisher
B. V. Breugel
Pasquale Minervini
54
96
0
24 Jan 2021
WangchanBERTa: Pretraining transformer-based Thai Language Models
Lalita Lowphansirikul
Charin Polpanumas
Nawat Jantrakulchai
Sarana Nutanong
56
76
0
24 Jan 2021
Debiasing Pre-trained Contextualised Embeddings
Masahiro Kaneko
Danushka Bollegala
269
143
0
23 Jan 2021
Training Multilingual Pre-trained Language Model with Byte-level Subwords
Junqiu Wei
Qun Liu
Yinpeng Guo
Xin Jiang
63
20
0
23 Jan 2021
Distilling Large Language Models into Tiny and Effective Students using pQRNN
P. Kaliamoorthi
Aditya Siddhant
Edward Li
Melvin Johnson
MQ
60
17
0
21 Jan 2021
PalmTree: Learning an Assembly Language Model for Instruction Embedding
Xuezixiang Li
Qu Yu
Heng Yin
79
155
0
21 Jan 2021
Adv-OLM: Generating Textual Adversaries via OLM
Vijit Malik
A. Bhat
Ashutosh Modi
134
6
0
21 Jan 2021
Towards Confident Machine Reading Comprehension
Rishav Chakravarti
Avirup Sil
73
4
0
20 Jan 2021
Automatic punctuation restoration with BERT models
A. Nagy
Bence Bial
Judit Ács
60
25
0
18 Jan 2021
Model Compression for Domain Adaptation through Causal Effect Estimation
Guy Rotman
Amir Feder
Roi Reichart
CML
92
7
0
18 Jan 2021
Red Alarm for Pre-trained Models: Universal Vulnerability to Neuron-Level Backdoor Attacks
Zhengyan Zhang
Guangxuan Xiao
Yongwei Li
Tian Lv
Fanchao Qi
Zhiyuan Liu
Yasheng Wang
Xin Jiang
Maosong Sun
AAML
153
74
0
18 Jan 2021
Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for Low-resource Speech Recognition
Cheng Yi
Shiyu Zhou
Bo Xu
108
40
0
17 Jan 2021
Transformer-Based Models for Question Answering on COVID19
Hillary Ngai
Yoona Park
John Chen
Mahboobeh Parsapoor
OOD
48
21
0
16 Jan 2021
To Understand Representation of Layer-aware Sequence Encoders as Multi-order-graph
Sufeng Duan
Hai Zhao
MILM
64
0
0
16 Jan 2021
Grid Search Hyperparameter Benchmarking of BERT, ALBERT, and LongFormer on DuoRC
Alex John Quijano
Sam Nguyen
Juanita Ordoñez
55
7
0
15 Jan 2021
Hostility Detection and Covid-19 Fake News Detection in Social Media
Ayush Gupta
Rohan Sukumaran
Kevin John
Sundeep Teki
92
20
0
15 Jan 2021
KDLSQ-BERT: A Quantized Bert Combining Knowledge Distillation with Learned Step Size Quantization
Jing Jin
Cai Liang
Tiancheng Wu
Li Zou
Zhiliang Gan
MQ
59
27
0
15 Jan 2021
Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake News Detection
Ben Chen
Bin Chen
D. Gao
Qijin Chen
Chengfu Huo
Xiaonan Meng
Weijun Ren
Yang Zhou
71
40
0
14 Jan 2021
Of Non-Linearity and Commutativity in BERT
Sumu Zhao
Damian Pascual
Gino Brunner
Roger Wattenhofer
103
17
0
12 Jan 2021
Model Generalization on COVID-19 Fake News Detection
Yejin Bang
Etsuko Ishii
Samuel Cahyawijaya
Ziwei Ji
Pascale Fung
121
37
0
11 Jan 2021
AT-BERT: Adversarial Training BERT for Acronym Identification Winning Solution for SDU@AAAI-21
Danqing Zhu
Wangli Lin
Yang Zhang
Qiwei Zhong
Guanxiong Zeng
Weilin Wu
Jiayu Tang
54
17
0
11 Jan 2021
BERT & Family Eat Word Salad: Experiments with Text Understanding
Ashim Gupta
Giorgi Kvernadze
Vivek Srikumar
260
73
0
10 Jan 2021
Political Depolarization of News Articles Using Attribute-aware Word Embeddings
Ruibo Liu
Lili Wang
Chenyan Jia
Soroush Vosoughi
69
21
0
05 Jan 2021
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
179
354
0
05 Jan 2021
Benchmarking Knowledge-Enhanced Commonsense Question Answering via Knowledge-to-Text Transformation
Ning Bian
Xianpei Han
Bo Chen
Le Sun
ELM
49
43
0
04 Jan 2021
Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting
Wangchunshu Zhou
Tao Ge
Canwen Xu
Ke Xu
Furu Wei
LRM
83
16
0
02 Jan 2021
Which Linguist Invented the Lightbulb? Presupposition Verification for Question-Answering
Najoung Kim
Ellie Pavlick
Burcu Karagol Ayan
Deepak Ramachandran
159
48
0
02 Jan 2021