
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL AIMat
ArXiv (abs) | PDF | HTML | Github (3271★)

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,935 papers shown
Robustly Optimized and Distilled Training for Natural Language Understanding
Haytham ElFadeel
Stanislav Peshterliev
VLM OffRL
37
1
0
16 Mar 2021
LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval
Siqi Sun
Yen-Chun Chen
Linjie Li
Shuohang Wang
Yuwei Fang
Jingjing Liu
VLM
89
84
0
16 Mar 2021
How Many Data Points is a Prompt Worth?
Teven Le Scao
Alexander M. Rush
VLM
205
303
0
15 Mar 2021
SemVLP: Vision-Language Pre-training by Aligning Semantics at Multiple Levels
Chenliang Li
Ming Yan
Haiyang Xu
Fuli Luo
Wei Wang
Bin Bi
Songfang Huang
VLM
74
36
0
14 Mar 2021
Text Mining of Stocktwits Data for Predicting Stock Prices
Mukul Jaggi
Priyanka Mandal
Shreya Narang
Usman Naseem
Matloob Khushi
AIFin
73
41
0
13 Mar 2021
CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review
Dan Hendrycks
Collin Burns
Anya Chen
Spencer Ball
ELM AILaw
81
195
0
10 Mar 2021
Team Phoenix at WASSA 2021: Emotion Analysis on News Stories with Pre-Trained Language Models
Yash Butala
Kanishk Singh
Adarsh Kumar
Shrey Shrivastava
57
10
0
10 Mar 2021
Wav2vec-C: A Self-supervised Model for Speech Representation Learning
Samik Sadhu
Di He
Che-Wei Huang
Sri Harish Reddy Mallidi
Minhua Wu
Ariya Rastrow
A. Stolcke
J. Droppo
Roland Maas
SSL
68
49
0
09 Mar 2021
Beyond Nyströmformer -- Approximation of self-attention by Spectral Shifting
Madhusudan Verma
48
1
0
09 Mar 2021
BERTese: Learning to Speak to BERT
Adi Haviv
Jonathan Berant
Amir Globerson
123
124
0
09 Mar 2021
Self-supervised Regularization for Text Classification
Meng Zhou
Zechen Li
P. Xie
60
16
0
09 Mar 2021
Improving Document-Level Sentiment Classification Using Importance of Sentences
Gihyeon Choi
Shinhyeok Oh
H. Kim
61
27
0
09 Mar 2021
MCR-Net: A Multi-Step Co-Interactive Relation Network for Unanswerable Questions on Machine Reading Comprehension
Wei Peng
Yue Hu
Jiahao Yu
Luxi Xing
Yuqiang Xie
Zihao Zhu
Yajing Sun
53
2
0
08 Mar 2021
Split Computing and Early Exiting for Deep Learning Applications: Survey and Research Challenges
Yoshitomo Matsubara
Marco Levorato
Francesco Restuccia
120
215
0
08 Mar 2021
Pufferfish: Communication-efficient Models At No Extra Cost
Hongyi Wang
Saurabh Agarwal
Dimitris Papailiopoulos
85
59
0
05 Mar 2021
Rissanen Data Analysis: Examining Dataset Characteristics via Description Length
Ethan Perez
Douwe Kiela
Kyunghyun Cho
82
24
0
05 Mar 2021
Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth
Yihe Dong
Jean-Baptiste Cordonnier
Andreas Loukas
163
388
0
05 Mar 2021
Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices
Max Ryabinin
Eduard A. Gorbunov
Vsevolod Plokhotnyuk
Gennady Pekhimenko
133
35
0
04 Mar 2021
Perceiver: General Perception with Iterative Attention
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM ViT MDE
214
1,029
0
04 Mar 2021
Weakly-Supervised Open-Retrieval Conversational Question Answering
Chen Qu
Liu Yang
Cen Chen
W. Bruce Croft
Kalpesh Krishna
Mohit Iyyer
RALM
73
13
0
03 Mar 2021
Disentangling Syntax and Semantics in the Brain with Deep Networks
Charlotte Caucheteux
Alexandre Gramfort
J. King
129
74
0
02 Mar 2021
A Brief Summary of Interactions Between Meta-Learning and Self-Supervised Learning
Huimin Peng
SSL
23
4
0
01 Mar 2021
M6: A Chinese Multimodal Pretrainer
Junyang Lin
Rui Men
An Yang
Chan Zhou
Ming Ding
...
Yong Li
Wei Lin
Jingren Zhou
J. Tang
Hongxia Yang
VLM MoE
152
134
0
01 Mar 2021
Unbiased Sentence Encoder For Large-Scale Multi-lingual Search Engines
Mahdi Hajiaghayi
Monir Hajiaghayi
Mark R. Bolin
39
0
0
01 Mar 2021
Detecting Harmful Content On Online Platforms: What Platforms Need Vs. Where Research Efforts Go
Arnav Arora
Preslav Nakov
Momchil Hardalov
Sheikh Muhammad Sarwar
Vibha Nayak
...
Dimitrina Zlatkova
Kyle Dent
Ameya Bhatawdekar
Guillaume Bouchard
Isabelle Augenstein
90
53
0
27 Feb 2021
Automated essay scoring using efficient transformer-based language models
C. Ormerod
Akanksha Malhotra
Amir Jafari
46
31
0
25 Feb 2021
ZJUKLAB at SemEval-2021 Task 4: Negative Augmentation with Language Model for Reading Comprehension of Abstract Meaning
Xin Xie
Xiangnan Chen
Xiang Chen
Yong Wang
Ningyu Zhang
Shumin Deng
Huajun Chen
63
2
0
25 Feb 2021
LazyFormer: Self Attention with Lazy Update
Chengxuan Ying
Guolin Ke
Di He
Tie-Yan Liu
81
16
0
25 Feb 2021
When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute
Tao Lei
RALM VLM
149
49
0
24 Feb 2021
LRG at SemEval-2021 Task 4: Improving Reading Comprehension with Abstract Words using Augmentation, Linguistic Features and Voting
Abheesht Sharma
Harshit Pandey
Gunjan Chhablani
Yash Bhartia
T. Dash
50
1
0
24 Feb 2021
Hopeful_Men@LT-EDI-EACL2021: Hope Speech Detection Using Indic Transliteration and Transformers
I. S. Upadhyay
E. Nikhil
Anshul Wadhawan
R. Mamidi
54
14
0
24 Feb 2021
Do Transformer Modifications Transfer Across Implementations and Applications?
Sharan Narang
Hyung Won Chung
Yi Tay
W. Fedus
Thibault Févry
...
Wei Li
Nan Ding
Jake Marcus
Adam Roberts
Colin Raffel
100
128
0
23 Feb 2021
LogME: Practical Assessment of Pre-trained Models for Transfer Learning
Kaichao You
Yong Liu
Jianmin Wang
Mingsheng Long
99
188
0
22 Feb 2021
Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks
Tingyu Xia
Yue Wang
Yuan Tian
Yi-Ju Chang
65
51
0
22 Feb 2021
UniT: Multimodal Multitask Learning with a Unified Transformer
Ronghang Hu
Amanpreet Singh
ViT
106
301
0
22 Feb 2021
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning
Jun Chen
Han Guo
Kai Yi
Boyang Albert Li
Mohamed Elhoseiny
VLM
164
227
0
20 Feb 2021
Multilingual Answer Sentence Reranking via Automatically Translated Data
Thuy Vu
Alessandro Moschitti
66
5
0
20 Feb 2021
Analyzing Curriculum Learning for Sentiment Analysis along Task Difficulty, Pacing and Visualization Axes
Anvesh Rao Vijjini
Kaveri Anuranjana
R. Mamidi
70
3
0
19 Feb 2021
Towards Emotion Recognition in Hindi-English Code-Mixed Data: A Transformer Based Approach
Anshul Wadhawan
Akshita Aggarwal
59
32
0
19 Feb 2021
MUDES: Multilingual Detection of Offensive Spans
Tharindu Ranasinghe
Marcos Zampieri
83
41
0
18 Feb 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
517
1,143
0
17 Feb 2021
Highly Fast Text Segmentation With Pairwise Markov Chains
E. Azeraf
E. Monfrini
Emmanuel Vignon
W. Pieczynski
56
5
0
17 Feb 2021
COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Yu Meng
Chenyan Xiong
Payal Bajaj
Saurabh Tiwary
Paul N. Bennett
Jiawei Han
Xia Song
182
206
0
16 Feb 2021
Conversations Gone Alright: Quantifying and Predicting Prosocial Outcomes in Online Conversations
Jiajun Bao
J. Wu
Yiming Zhang
Eshwar Chandrasekharan
David Jurgens
117
49
0
16 Feb 2021
Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet
M. O. Topal
Anil Bas
Imke van Heerden
LLMAG AI4CE
73
91
0
16 Feb 2021
Improving speech recognition models with small samples for air traffic control systems
Yi Lin
Qin Li
Bo Yang
Zhen Yan
Huachun Tan
Zhengmao Chen
104
32
0
16 Feb 2021
Have Attention Heads in BERT Learned Constituency Grammar?
Ziyang Luo
56
6
0
16 Feb 2021
Improved Customer Transaction Classification using Semi-Supervised Knowledge Distillation
Rohan Sukumaran
29
2
0
15 Feb 2021
PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them
Patrick Lewis
Yuxiang Wu
Linqing Liu
Pasquale Minervini
Heinrich Küttler
Aleksandra Piktus
Pontus Stenetorp
Sebastian Riedel
RALM
138
234
0
13 Feb 2021
Capturing Label Distribution: A Case Study in NLI
Shujian Zhang
Chengyue Gong
Eunsol Choi
77
8
0
13 Feb 2021