ResearchTrend.AI
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
    SSL, AIMat
ArXiv (abs) | PDF | HTML | Github (3271★)

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,935 papers shown
Unveiling Transformers with LEGO: a synthetic reasoning task
Yi Zhang
A. Backurs
Sébastien Bubeck
Ronen Eldan
Suriya Gunasekar
Tal Wagner
LRM
138
91
0
09 Jun 2022
VN-Transformer: Rotation-Equivariant Attention for Vector Neurons
Serge Assaad
Carlton Downey
Rami Al-Rfou
Nigamaa Nayakanti
Benjamin Sapp
71
20
0
08 Jun 2022
Dual Decomposition of Convex Optimization Layers for Consistent Attention in Medical Images
Tom Ron
M. Weiler-Sagie
Tamir Hazan
FAtt, MedIm
81
6
0
06 Jun 2022
Learning Speaker-specific Lip-to-Speech Generation
Munender Varshney
Ravindra Yadav
Vinay P. Namboodiri
R. Hegde
98
7
0
04 Jun 2022
ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
Z. Yao
Reza Yazdani Aminabadi
Minjia Zhang
Xiaoxia Wu
Conglong Li
Yuxiong He
VLM, MQ
177
484
0
04 Jun 2022
Extreme Compression for Pre-trained Transformers Made Simple and Efficient
Xiaoxia Wu
Z. Yao
Minjia Zhang
Conglong Li
Yuxiong He
MQ
68
31
0
04 Jun 2022
Kallima: A Clean-label Framework for Textual Backdoor Attacks
Xiaoyi Chen
Yinpeng Dong
Zeyu Sun
Shengfang Zhai
Qingni Shen
Zhonghai Wu
AAML
49
32
0
03 Jun 2022
EMS: Efficient and Effective Massively Multilingual Sentence Embedding Learning
Zhuoyuan Mao
Chenhui Chu
Sadao Kurohashi
80
1
0
31 May 2022
Analysis of Augmentations for Contrastive ECG Representation Learning
S. Soltanieh
Ali Etemad
J. Hashemi
SSL
57
19
0
30 May 2022
VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models
Wangchunshu Zhou
Yan Zeng
Shizhe Diao
Xinsong Zhang
CoGe, VLM
97
13
0
30 May 2022
A Survey in Mathematical Language Processing
Jordan Meadows
André Freitas
AIMat
63
16
0
30 May 2022
From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering
Jiangtong Li
Li Niu
Liqing Zhang
67
53
0
30 May 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
439
2,297
0
27 May 2022
A Survey on Long-Tailed Visual Recognition
Lu Yang
He Jiang
Q. Song
Jun Guo
93
135
0
27 May 2022
Self-supervised Pretraining and Transfer Learning Enable Flu and COVID-19 Predictions in Small Mobile Sensing Datasets
Michael Merrill
Tim Althoff
AI4TS
84
14
0
26 May 2022
Jointly Learning Span Extraction and Sequence Labeling for Information Extraction from Business Documents
Nguyen Hong Son
Hieu M. Vu
Tuan-Anh Dang Nguyen
Minh Le Nguyen
72
6
0
26 May 2022
Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling
Kaitao Song
Yichong Leng
Xu Tan
Yicheng Zou
Tao Qin
Dongsheng Li
107
11
0
25 May 2022
Contrastive Learning with Boosted Memorization
Zhihan Zhou
Jiangchao Yao
Yanfeng Wang
Bo Han
Ya Zhang
SSL
109
31
0
25 May 2022
Rethinking Fano's Inequality in Ensemble Learning
Terufumi Morishita
Gaku Morio
Shota Horiguchi
Hiroaki Ozaki
N. Nukaga
FedML
35
3
0
25 May 2022
Evaluating the Diversity, Equity and Inclusion of NLP Technology: A Case Study for Indian Languages
Simran Khanuja
Sebastian Ruder
Partha P. Talukdar
130
20
0
25 May 2022
TAGPRIME: A Unified Framework for Relational Structure Extraction
I-Hung Hsu
Kuan-Hao Huang
Shuning Zhang
Wen-Huang Cheng
Premkumar Natarajan
Kai-Wei Chang
Nanyun Peng
64
14
0
25 May 2022
VulBERTa: Simplified Source Code Pre-Training for Vulnerability Detection
Hazim Hanif
S. Maffeis
129
113
0
25 May 2022
Toward Understanding Bias Correlations for Mitigation in NLP
Lu Cheng
Suyu Ge
Huan Liu
72
9
0
24 May 2022
When More Data Hurts: A Troubling Quirk in Developing Broad-Coverage Natural Language Understanding Systems
Elias Stengel-Eskin
Emmanouil Antonios Platanios
Adam Pauls
Sam Thomson
Hao Fang
Benjamin Van Durme
J. Eisner
Yu-Chuan Su
73
2
0
24 May 2022
From Easy to Hard: Two-stage Selector and Reader for Multi-hop Question Answering
Xin-Yi Li
Weixian Lei
Yubin Yang
RALM
143
23
0
24 May 2022
SelfReformer: Self-Refined Network with Transformer for Salient Object Detection
Y. Yun
Weisi Lin
ViT
124
29
0
23 May 2022
UnifieR: A Unified Retriever for Large-Scale Retrieval
Tao Shen
Xiubo Geng
Chongyang Tao
Can Xu
Guodong Long
Kai Zhang
Daxin Jiang
RALM
72
29
0
23 May 2022
Prompt Tuning for Discriminative Pre-trained Language Models
Yuan Yao
Bowen Dong
Ao Zhang
Zhengyan Zhang
Ruobing Xie
Zhiyuan Liu
Leyu Lin
Maosong Sun
Jianyong Wang
VLM
80
34
0
23 May 2022
Artificial intelligence for topic modelling in Hindu philosophy: mapping themes between the Upanishads and the Bhagavad Gita
Rohitash Chandra
Mukul Ranjan
AI4CE
58
13
0
23 May 2022
Life after BERT: What do Other Muppets Understand about Language?
Vladislav Lialin
Kevin Zhao
Namrata Shivagunde
Anna Rumshisky
110
6
0
21 May 2022
Revisiting Pre-trained Language Models and their Evaluation for Arabic Natural Language Understanding
Abbas Ghaddar
Yimeng Wu
Sunyam Bagga
Ahmad Rashid
Khalil Bibi
...
Zhefeng Wang
Baoxing Huai
Xin Jiang
Qun Liu
Philippe Langlais
65
7
0
21 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL, AI4TS
287
368
0
21 May 2022
Calibration of Natural Language Understanding Models with Venn-ABERS Predictors
Patrizio Giovannotti
126
7
0
21 May 2022
Improvements to Self-Supervised Representation Learning for Masked Image Modeling
Jia-ju Mao
Xuesong Yin
Yuan Chang
Honggu Zhou
SSL
47
1
0
21 May 2022
Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection
Luca Di Liello
Siddhant Garg
Luca Soldaini
Alessandro Moschitti
71
17
0
20 May 2022
Visually-Augmented Language Modeling
Weizhi Wang
Li Dong
Hao Cheng
Haoyu Song
Xiaodong Liu
Xifeng Yan
Jianfeng Gao
Furu Wei
VLM
89
18
0
20 May 2022
Beyond the Granularity: Multi-Perspective Dialogue Collaborative Selection for Dialogue State Tracking
Jinyu Guo
Kai Shuang
Jijie Li
Zihan Wang
Yixuan Liu
60
17
0
20 May 2022
Exploring Extreme Parameter Compression for Pre-trained Language Models
Yuxin Ren
Benyou Wang
Lifeng Shang
Xin Jiang
Qun Liu
80
19
0
20 May 2022
KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation
Ta-Chung Chi
Ting-Han Fan
Peter J. Ramadge
Alexander I. Rudnicky
98
73
0
20 May 2022
Can Foundation Models Wrangle Your Data?
A. Narayan
Ines Chami
Laurel J. Orr
Simran Arora
Christopher Ré
LMTD, AI4CE
239
231
0
20 May 2022
MiDAS: Multi-integrated Domain Adaptive Supervision for Fake News Detection
Abhijit Suprem
C. Pu
115
7
0
19 May 2022
RankGen: Improving Text Generation with Large Ranking Models
Kalpesh Krishna
Yapei Chang
John Wieting
Mohit Iyyer
AIMat
79
69
0
19 May 2022
Two-Step Question Retrieval for Open-Domain QA
Yeon Seonwoo
Juhee Son
Jiho Jin
Sang-Woo Lee
Ji-Hoon Kim
Jung-Woo Ha
Alice Oh
RALM, LRM
56
5
0
19 May 2022
Transformers as Neural Augmentors: Class Conditional Sentence Generation via Variational Bayes
M. Bilici
M. Amasyalı
ViT
59
2
0
19 May 2022
ERNIE-Search: Bridging Cross-Encoder with Dual-Encoder via Self On-the-fly Distillation for Dense Passage Retrieval
Yuxiang Lu
Yiding Liu
Jiaxiang Liu
Yunsheng Shi
Zhengjie Huang
...
Hao Tian
Hua Wu
Shuaiqiang Wang
D. Yin
Haifeng Wang
166
60
0
18 May 2022
Regex in a Time of Deep Learning: The Role of an Old Technology in Age Discrimination Detection in Job Advertisements
Ann Pillar
Kyrill Poelmans
Martha Larson
13
4
0
18 May 2022
LogiGAN: Learning Logical Reasoning via Adversarial Pre-training
Xinyu Pi
Wanjun Zhong
Yan Gao
Nan Duan
Jian-Guang Lou
NAI, GAN, LRM, AI4CE
87
16
0
18 May 2022
PASH at TREC 2021 Deep Learning Track: Generative Enhanced Model for Multi-stage Ranking
Yixuan Qiao
Hao Chen
Jun Wang
Yongquan Lai
Tuozhen Liu
...
Xin Tang
Rui Fang
Peng Gao
Wenfeng Xie
Guotong Xie
51
1
0
18 May 2022
An Evaluation Framework for Legal Document Summarization
Ankan Mullick
Abhilash Nandy
M. Kapadnis
Sohan Patnaik
R. Raghav
Roshni Kar
AILaw, ELM
18
7
0
17 May 2022
A Fast Attention Network for Joint Intent Detection and Slot Filling on Edge Devices
Liang Huang
Senjie Liang
Feiyang Ye
Nan Gao
93
4
0
16 May 2022