ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
    SSL
    AIMat

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,913 papers shown
Contextual Text Embeddings for Twi
P. Azunre
Salomey Osei
S. Addo
Lawrence Asamoah Adu-Gyamfi
Stephen E. Moore
...
Standylove Birago Mensah
Lucien Mensah
Mark Amoako Marcel
A. Amponsah
J. B. Hayfron-Acquah
18
6
0
29 Mar 2021
Retraining DistilBERT for a Voice Shopping Assistant by Using Universal Dependencies
P. Jayarao
Arpit Sharma
21
2
0
29 Mar 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
33
2,098
0
29 Mar 2021
Machine Learning Meets Natural Language Processing -- The story so far
N. Galanis
P. Vafiadis
K.-G. Mirzaev
G. Papakostas
43
7
0
27 Mar 2021
A Practical Survey on Faster and Lighter Transformers
Quentin Fournier
G. Caron
Daniel Aloise
19
93
0
26 Mar 2021
Unsupervised Document Embedding via Contrastive Augmentation
Dongsheng Luo
Wei Cheng
Jingchao Ni
Wenchao Yu
Xuchao Zhang
...
Yanchi Liu
Zhengzhang Chen
Dongjin Song
Haifeng Chen
Xiang Zhang
SSL
31
11
0
26 Mar 2021
AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting
Ye Yuan
Xinshuo Weng
Yanglan Ou
Kris Kitani
AI4TS
45
442
0
25 Mar 2021
Visual Grounding Strategies for Text-Only Natural Language Processing
Damien Sileo
29
8
0
25 Mar 2021
Bertinho: Galician BERT Representations
David Vilares
Marcos Garcia
Carlos Gómez-Rodríguez
70
22
0
25 Mar 2021
Pruning-then-Expanding Model for Domain Adaptation of Neural Machine Translation
Shuhao Gu
Yang Feng
Wanying Xie
CLL
AI4CE
25
27
0
25 Mar 2021
Predicting Directionality in Causal Relations in Text
Pedram Hosseini
David A. Broniatowski
Mona T. Diab
CML
27
11
0
25 Mar 2021
Finetuning Pretrained Transformers into RNNs
Jungo Kasai
Hao Peng
Yizhe Zhang
Dani Yogatama
Gabriel Ilharco
Nikolaos Pappas
Yi Mao
Weizhu Chen
Noah A. Smith
46
63
0
24 Mar 2021
Czert -- Czech BERT-like Model for Language Representation
Jakub Sido
O. Pražák
P. Pribán
Jan Pasek
Michal Seják
Miloslav Konopík
31
43
0
24 Mar 2021
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark
Nicholas Lourie
Ronan Le Bras
Chandra Bhagavatula
Yejin Choi
LRM
30
137
0
24 Mar 2021
The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures
Sushant Singh
A. Mahmood
AI4TS
60
94
0
23 Mar 2021
Tiny Transformers for Environmental Sound Classification at the Edge
David Elliott
Carlos E. Otero
Steven Wyatt
Evan Martino
26
15
0
22 Mar 2021
End-to-End Trainable Multi-Instance Pose Estimation with Transformers
Lucas Stoffl
Maxime Vidal
Alexander Mathis
ViT
25
49
0
22 Mar 2021
Improving and Simplifying Pattern Exploiting Training
Derek Tam
Rakesh R Menon
Joey Tianyi Zhou
Shashank Srivastava
Colin Raffel
21
149
0
22 Mar 2021
Identifying Machine-Paraphrased Plagiarism
Jan Philip Wahle
Terry Ruas
Tomáš Foltýnek
Norman Meuschke
Bela Gipp
13
30
0
22 Mar 2021
ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques
Yuanxin Liu
Zheng Lin
Fengcheng Yuan
VLM
MQ
10
18
0
21 Mar 2021
AdaptSum: Towards Low-Resource Domain Adaptation for Abstractive Summarization
Tiezheng Yu
Zihan Liu
Pascale Fung
CLL
51
81
0
21 Mar 2021
Pretraining the Noisy Channel Model for Task-Oriented Dialogue
Qi Liu
Lei Yu
Laura Rimell
Phil Blunsom
47
26
0
18 Mar 2021
Refining Language Models with Compositional Explanations
Huihan Yao
Ying Chen
Qinyuan Ye
Xisen Jin
Xiang Ren
20
35
0
18 Mar 2021
GLM: General Language Model Pretraining with Autoregressive Blank Infilling
Zhengxiao Du
Yujie Qian
Xiao Liu
Ming Ding
J. Qiu
Zhilin Yang
Jie Tang
BDL
AI4CE
53
1,496
0
18 Mar 2021
Structure Inducing Pre-Training
Matthew B. A. McDermott
Brendan Yap
Peter Szolovits
Marinka Zitnik
42
18
0
18 Mar 2021
Large-Scale Zero-Shot Image Classification from Rich and Diverse Textual Descriptions
Sebastian Bujwid
Josephine Sullivan
VLM
23
28
0
17 Mar 2021
Robustly Optimized and Distilled Training for Natural Language Understanding
Haytham ElFadeel
Stanislav Peshterliev
VLM
OffRL
25
1
0
16 Mar 2021
LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval
Siqi Sun
Yen-Chun Chen
Linjie Li
Shuohang Wang
Yuwei Fang
Jingjing Liu
VLM
41
82
0
16 Mar 2021
How Many Data Points is a Prompt Worth?
Teven Le Scao
Alexander M. Rush
VLM
66
296
0
15 Mar 2021
SemVLP: Vision-Language Pre-training by Aligning Semantics at Multiple Levels
Chenliang Li
Ming Yan
Haiyang Xu
Fuli Luo
Wei Wang
Bin Bi
Songfang Huang
VLM
34
36
0
14 Mar 2021
Text Mining of Stocktwits Data for Predicting Stock Prices
Mukul Jaggi
Priyanka Mandal
Shreya Narang
Usman Naseem
Matloob Khushi
AIFin
24
41
0
13 Mar 2021
CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review
Dan Hendrycks
Collin Burns
Anya Chen
Spencer Ball
ELM
AILaw
25
185
0
10 Mar 2021
Team Phoenix at WASSA 2021: Emotion Analysis on News Stories with Pre-Trained Language Models
Yash Butala
Kanishk Singh
Adarsh Kumar
Shrey Shrivastava
17
10
0
10 Mar 2021
Wav2vec-C: A Self-supervised Model for Speech Representation Learning
Samik Sadhu
Di He
Che-Wei Huang
Sri Harish Reddy Mallidi
Minhua Wu
Ariya Rastrow
A. Stolcke
J. Droppo
Roland Maas
SSL
20
48
0
09 Mar 2021
Beyond Nyströmformer -- Approximation of self-attention by Spectral Shifting
Madhusudan Verma
22
1
0
09 Mar 2021
BERTese: Learning to Speak to BERT
Adi Haviv
Jonathan Berant
Amir Globerson
30
123
0
09 Mar 2021
Self-supervised Regularization for Text Classification
Meng Zhou
Zechen Li
P. Xie
26
16
0
09 Mar 2021
Improving Document-Level Sentiment Classification Using Importance of Sentences
Gihyeon Choi
Shinhyeok Oh
H. Kim
31
27
0
09 Mar 2021
MCR-Net: A Multi-Step Co-Interactive Relation Network for Unanswerable Questions on Machine Reading Comprehension
Wei Peng
Yue Hu
Jiahao Yu
Luxi Xing
Yuqiang Xie
Zihao Zhu
Yajing Sun
22
2
0
08 Mar 2021
Split Computing and Early Exiting for Deep Learning Applications: Survey and Research Challenges
Yoshitomo Matsubara
Marco Levorato
Francesco Restuccia
35
199
0
08 Mar 2021
Pufferfish: Communication-efficient Models At No Extra Cost
Hongyi Wang
Saurabh Agarwal
Dimitris Papailiopoulos
19
56
0
05 Mar 2021
Rissanen Data Analysis: Examining Dataset Characteristics via Description Length
Ethan Perez
Douwe Kiela
Kyunghyun Cho
32
24
0
05 Mar 2021
Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth
Yihe Dong
Jean-Baptiste Cordonnier
Andreas Loukas
57
373
0
05 Mar 2021
Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices
Max Ryabinin
Eduard A. Gorbunov
Vsevolod Plokhotnyuk
Gennady Pekhimenko
42
33
0
04 Mar 2021
Perceiver: General Perception with Iterative Attention
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM
ViT
MDE
91
978
0
04 Mar 2021
Weakly-Supervised Open-Retrieval Conversational Question Answering
Chen Qu
Liu Yang
Cen Chen
W. Bruce Croft
Kalpesh Krishna
Mohit Iyyer
RALM
14
13
0
03 Mar 2021
Disentangling Syntax and Semantics in the Brain with Deep Networks
Charlotte Caucheteux
Alexandre Gramfort
J. King
36
70
0
02 Mar 2021
A Brief Summary of Interactions Between Meta-Learning and Self-Supervised Learning
Huimin Peng
SSL
11
4
0
01 Mar 2021
M6: A Chinese Multimodal Pretrainer
Junyang Lin
Rui Men
An Yang
Chan Zhou
Ming Ding
...
Yong Li
Wei Lin
Jingren Zhou
J. Tang
Hongxia Yang
VLM
MoE
37
133
0
01 Mar 2021
Unbiased Sentence Encoder For Large-Scale Multi-lingual Search Engines
Mahdi Hajiaghayi
Monir Hajiaghayi
Mark R. Bolin
26
0
0
01 Mar 2021