ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
    SSL
    AIMat

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,913 papers shown
KLUE: Korean Language Understanding Evaluation
Sungjoon Park
Jihyung Moon
Sungdong Kim
Won Ik Cho
Jiyoon Han
...
Seonghyun Kim
Lucy Park
Alice Oh
Jung-Woo Ha
Kyunghyun Cho
ELM
VLM
29
193
0
20 May 2021
Towards Detecting Need for Empathetic Response in Motivational Interviewing
Zixiu "Alex" Wu
Rim Helaoui
Vivek Kumar (Ph.D)
Diego Reforgiato Recupero
Daniele Riboni
16
14
0
20 May 2021
Self-supervised Heterogeneous Graph Neural Network with Co-contrastive Learning
Xiao Wang
Nian Liu
Hui-jun Han
C. Shi
SSL
25
376
0
19 May 2021
Investigating Math Word Problems using Pretrained Multilingual Language Models
Minghuan Tan
Lei Wang
Lingxiao Jiang
Jing Jiang
LRM
27
33
0
19 May 2021
Relative Positional Encoding for Transformers with Linear Complexity
Antoine Liutkus
Ondřej Cífka
Shih-Lun Wu
Umut Simsekli
Yi-Hsuan Yang
Gaël Richard
38
46
0
18 May 2021
SHARE: a System for Hierarchical Assistive Recipe Editing
Shuyang Li
Yufei Li
Jianmo Ni
Julian McAuley
24
19
0
17 May 2021
SeaD: End-to-end Text-to-SQL Generation with Schema-aware Denoising
K. Xuan
Yongbo Wang
Yongliang Wang
Zujie Wen
Yang Dong
VLM
38
52
0
17 May 2021
Self-supervised Learning on Graphs: Contrastive, Generative, or Predictive
Lirong Wu
Haitao Lin
Zhangyang Gao
Cheng Tan
Stan Z. Li
SSL
35
243
0
16 May 2021
A Deep Metric Learning Approach to Account Linking
Aleem Khan
Elizabeth Fleming
N. Schofield
M. Bishop
Nicholas Andrews
24
21
0
15 May 2021
On the Distributional Properties of Adaptive Gradients
Z. Zhiyi
Liu Ziyin
20
4
0
15 May 2021
Distilling BERT for low complexity network training
Bansidhar Mangalwedhekar
24
1
0
13 May 2021
Addressing "Documentation Debt" in Machine Learning Research: A Retrospective Datasheet for BookCorpus
Jack Bandy
Nicholas Vincent
29
57
0
11 May 2021
Benchmarking down-scaled (not so large) pre-trained language models
Matthias Aßenmacher
P. Schulze
C. Heumann
14
1
0
11 May 2021
Improving Factual Consistency of Abstractive Summarization via Question Answering
Feng Nan
Cicero Nogueira dos Santos
Henghui Zhu
Patrick Ng
Kathleen McKeown
Ramesh Nallapati
Dejiao Zhang
Zhiguo Wang
Andrew O. Arnold
Bing Xiang
HILM
14
82
0
10 May 2021
Dispatcher: A Message-Passing Approach To Language Modelling
A. Cetoli
45
0
0
09 May 2021
Which transformer architecture fits my data? A vocabulary bottleneck in self-attention
Noam Wies
Yoav Levine
Daniel Jannai
Amnon Shashua
40
20
0
09 May 2021
Self-Supervised Adversarial Example Detection by Disentangled Representation
Zhaoxi Zhang
L. Zhang
Xufei Zheng
Jinyu Tian
Jiantao Zhou
AAML
DRL
29
8
0
08 May 2021
Logic-Driven Context Extension and Data Augmentation for Logical Reasoning of Text
Siyuan Wang
Wanjun Zhong
Duyu Tang
Zhongyu Wei
Zhihao Fan
Daxin Jiang
Ming Zhou
Nan Duan
NAI
36
70
0
08 May 2021
Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality
Adithya Ganesan
Matthew Matero
Aravind Reddy Ravula
Huy-Hien Vu
H. Andrew Schwartz
30
35
0
07 May 2021
Adapting by Pruning: A Case Study on BERT
Yang Gao
Nicolo Colombo
Wen Wang
29
17
0
07 May 2021
Are Pre-trained Convolutions Better than Pre-trained Transformers?
Yi Tay
Mostafa Dehghani
J. Gupta
Dara Bahri
V. Aribandi
Zhen Qin
Donald Metzler
AI4CE
25
48
0
07 May 2021
VAULT: VAriable Unified Long Text Representation for Machine Reading Comprehension
Haoyang Wen
Anthony Ferritto
Heng Ji
Radu Florian
Avirup Sil
21
3
0
07 May 2021
Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing Regressions In NLP Model Updates
Yuqing Xie
Yi-An Lai
Yuanjun Xiong
Yi Zhang
Stefano Soatto
UQCV
24
16
0
07 May 2021
Do language models learn typicality judgments from text?
Kanishka Misra
Allyson Ettinger
Julia Taylor Rayz
11
33
0
06 May 2021
HerBERT: Efficiently Pretrained Transformer-based Language Model for Polish
Robert Mroczkowski
Piotr Rybak
Alina Wróblewska
Ireneusz Gawlik
36
81
0
04 May 2021
An Estimation of Online Video User Engagement from Features of Continuous Emotions
Lukas Stappen
Alice Baird
Michelle Lienhart
Annalena Batz
Björn Schuller
33
3
0
04 May 2021
When to Fold'em: How to answer Unanswerable questions
Marshall Ho
Zhipeng Zhou
J. He
36
2
0
01 May 2021
Adversarial Example Detection for DNN Models: A Review and Experimental Comparison
Ahmed Aldahdooh
W. Hamidouche
Sid Ahmed Fezza
Olivier Déforges
AAML
24
122
0
01 May 2021
Using Transformers to Provide Teachers with Personalized Feedback on their Classroom Discourse: The TalkMoves Application
Abhijit Suresh
Jennifer Jacobs
Vivian Lai
Chenhao Tan
Wayne H. Ward
James H. Martin
T. Sumner
22
29
0
29 Apr 2021
MOROCCO: Model Resource Comparison Framework
Valentin Malykh
Alexander Kukushkin
Ekaterina Artemova
Vladislav Mikhailov
Maria Tikhonova
Tatiana Shavrina
24
0
0
29 Apr 2021
Teaching a Massive Open Online Course on Natural Language Processing
Ekaterina Artemova
M. Apishev
V. Sarkisyan
Sergey Aksenov
D. Kirjanov
O. Serikov
VLM
19
4
0
26 Apr 2021
Extract then Distill: Efficient and Effective Task-Agnostic BERT Distillation
Cheng Chen
Yichun Yin
Lifeng Shang
Zhi Wang
Xin Jiang
Xiao Chen
Qun Liu
FedML
33
7
0
24 Apr 2021
Learning to Learn to be Right for the Right Reasons
Pride Kavumba
Benjamin Heinzerling
Ana Brassard
Kentaro Inui
OOD
ReLM
LRM
33
3
0
23 Apr 2021
Transfer training from smaller language model
Han Zhang
46
0
0
23 Apr 2021
Improving BERT Pretraining with Syntactic Supervision
Georgios Tziafas
Konstantinos Kogkalidis
G. Wijnholds
M. Moortgat
43
3
0
21 Apr 2021
On the Impact of Word Error Rate on Acoustic-Linguistic Speech Emotion Recognition: An Update for the Deep Learning Era
Shahin Amiriparian
Artem Sokolov
Ilhan Aslan
Lukas Christ
Maurice Gerczuk
...
M. Milling
Sandra Ottl
Ilya Poduremennykh
E. Shuranov
Björn W. Schuller
33
17
0
20 Apr 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
46
2,224
0
20 Apr 2021
WASSA@IITK at WASSA 2021: Multi-task Learning and Transformer Finetuning for Emotion Classification and Empathy Prediction
Jay Mundra
Rohan Gupta
Sagnik Mukherjee
13
14
0
20 Apr 2021
Efficient pre-training objectives for Transformers
Luca Di Liello
Matteo Gabburo
Alessandro Moschitti
8
15
0
20 Apr 2021
NewsEdits: A Dataset of Revision Histories for News Articles (Technical Report: Data Processing)
Alexander Spangher
Jonathan May
KELM
12
3
0
19 Apr 2021
Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training
Chenyi Lei
Shixian Luo
Yong Liu
Wanggui He
Jiamang Wang
Guoxin Wang
Haihong Tang
Chunyan Miao
Houqiang Li
30
41
0
19 Apr 2021
Consistent Accelerated Inference via Confident Adaptive Transformers
Tal Schuster
Adam Fisch
Tommi Jaakkola
Regina Barzilay
AI4TS
203
69
0
18 Apr 2021
SalKG: Learning From Knowledge Graph Explanations for Commonsense Reasoning
Aaron Chan
Lyne Tchapmi
Bo Long
Soumya Sanyal
Tanishq Gupta
Xiang Ren
ReLM
LRM
32
11
0
18 Apr 2021
A Simple and Effective Positional Encoding for Transformers
Pu-Chin Chen
Henry Tsai
Srinadh Bhojanapalli
Hyung Won Chung
Yin-Wen Chang
Chun-Sung Ferng
61
62
0
18 Apr 2021
Self-Supervised Pillar Motion Learning for Autonomous Driving
Chenxu Luo
Xiaodong Yang
Alan Yuille
SSL
3DPC
33
66
0
18 Apr 2021
Worst of Both Worlds: Biases Compound in Pre-trained Vision-and-Language Models
Tejas Srinivasan
Yonatan Bisk
VLM
32
56
0
18 Apr 2021
Competency Problems: On Finding and Removing Artifacts in Language Data
Matt Gardner
William Merrill
Jesse Dodge
Matthew E. Peters
Alexis Ross
Sameer Singh
Noah A. Smith
173
107
0
17 Apr 2021
Vision Transformer Pruning
Mingjian Zhu
Yehui Tang
Kai Han
ViT
19
90
0
17 Apr 2021
Neural Path Hunter: Reducing Hallucination in Dialogue Systems via Path Grounding
Nouha Dziri
Andrea Madotto
Osmar Zaiane
A. Bose
HILM
28
132
0
17 Apr 2021
Three-level Hierarchical Transformer Networks for Long-sequence and Multiple Clinical Documents Classification
Yuqi Si
Kirk Roberts
27
9
0
17 Apr 2021