ResearchTrend.AI

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
arXiv: 1909.11942 (v6, latest)
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
Topics: SSL, AIMat
Links: arXiv (abs) · PDF · HTML · GitHub (3,271★)

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

Showing 50 of 2,935 citing papers.
You Are What You Write: Preserving Privacy in the Era of Large Language Models
Richard Plant
V. Giuffrida
Dimitra Gkatzia
PILM
97
19
0
20 Apr 2022
ALBETO and DistilBETO: Lightweight Spanish Language Models
J. Canete
S. Donoso
Felipe Bravo-Marquez
Andrés Carvallo
Vladimir Araujo
74
21
0
19 Apr 2022
Optimize_Prime@DravidianLangTech-ACL2022: Abusive Comment Detection in Tamil
Shantanu Patankar
Omkar Gokhale
Onkar Litake
Aditya Mandke
Dipali M. Kadam
49
6
0
19 Apr 2022
Optimize_Prime@DravidianLangTech-ACL2022: Emotion Analysis in Tamil
Omkar Gokhale
Shantanu Patankar
Onkar Litake
Aditya Mandke
Dipali M. Kadam
48
1
0
19 Apr 2022
On The Cross-Modal Transfer from Natural Language to Code through Adapter Modules
Divyam Goel
Raman Grover
Fatemeh H. Fard
76
19
0
19 Apr 2022
Self Supervised Adversarial Domain Adaptation for Cross-Corpus and Cross-Language Speech Emotion Recognition
S. Latif
R. Rana
Sara Khalifa
Raja Jurdak
Björn Schuller
109
50
0
19 Apr 2022
Imagination-Augmented Natural Language Understanding
Yujie Lu
Wanrong Zhu
Xinze Wang
Miguel P. Eckstein
William Yang Wang
62
24
0
18 Apr 2022
L3Cube-HingCorpus and HingBERT: A Code Mixed Hindi-English Dataset and BERT Language Models
Ravindra Nayak
Raviraj Joshi
62
42
0
18 Apr 2022
Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling
Yiyang Li
Hai Zhao
Zhuosheng Zhang
57
11
0
18 Apr 2022
RLens: A Computer-aided Visualization System for Supporting Reflection on Language Learning under Distributed Tutorship
Menglin Xia
Yankun Zhao
Jihyeong Hong
Mehmet Hamza Erol
Taewook Kim
Juho Kim
20
0
0
17 Apr 2022
A Survivor in the Era of Large-Scale Pretraining: An Empirical Study of One-Stage Referring Expression Comprehension
Gen Luo
Yiyi Zhou
Jiamu Sun
Xiaoshuai Sun
Rongrong Ji
ObjD
78
10
0
17 Apr 2022
XDBERT: Distilling Visual Information to BERT from Cross-Modal Systems to Improve Language Understanding
Chan-Jan Hsu
Hung-yi Lee
Yu Tsao
VLM
42
3
0
15 Apr 2022
MiniViT: Compressing Vision Transformers with Weight Multiplexing
Jinnian Zhang
Houwen Peng
Kan Wu
Mengchen Liu
Bin Xiao
Jianlong Fu
Lu Yuan
ViT
114
127
0
14 Apr 2022
METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals
Payal Bajaj
Chenyan Xiong
Guolin Ke
Xiaodong Liu
Di He
Saurabh Tiwary
Tie-Yan Liu
Paul N. Bennett
Xia Song
Jianfeng Gao
118
32
0
13 Apr 2022
L3Cube-MahaNER: A Marathi Named Entity Recognition Dataset and BERT models
Parth Patil
Aparna Ranade
Maithili Sabane
Onkar Litake
Raviraj Joshi
94
20
0
12 Apr 2022
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
Thomas Wang
Adam Roberts
Daniel Hesslow
Teven Le Scao
Hyung Won Chung
Iz Beltagy
Julien Launay
Colin Raffel
143
175
0
12 Apr 2022
What do Toothbrushes do in the Kitchen? How Transformers Think our World is Structured
Alexander Henlein
Alexander Mehler
70
6
0
12 Apr 2022
Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling
Nuo Chen
Linjun Shou
Ming Gong
Jian Pei
Daxin Jiang
65
10
0
11 Apr 2022
TRUE: Re-evaluating Factual Consistency Evaluation
Or Honovich
Roee Aharoni
Jonathan Herzig
Hagai Taitelbaum
Doron Kukliansy
Vered Cohen
Thomas Scialom
Idan Szpektor
Avinatan Hassidim
Yossi Matias
HILM
81
4
0
11 Apr 2022
A Comparative Study of Pre-trained Encoders for Low-Resource Named Entity Recognition
Yuxuan Chen
Jonas Mikkelsen
Arne Binder
Christoph Alt
Leonhard Hennig
72
2
0
11 Apr 2022
Data Augmentation for Biomedical Factoid Question Answering
Dimitris Pappas
Prodromos Malakasiotis
Ion Androutsopoulos
MedIm
72
12
0
10 Apr 2022
Benchmarking for Public Health Surveillance tasks on Social Media with a Domain-Specific Pretrained Language Model
Usman Naseem
Byoung Chan Lee
Matloob Khushi
Jinman Kim
A. Dunn
AI4MH, LM&MA, VLM
42
33
0
09 Apr 2022
MINER: Improving Out-of-Vocabulary Named Entity Recognition from an Information Theoretic Perspective
Xiao Wang
Shihan Dou
Li Xiong
Yicheng Zou
Qi Zhang
Tao Gui
Liang Qiao
Zhanzhan Cheng
Xuanjing Huang
71
27
0
09 Apr 2022
Improving Tokenisation by Alternative Treatment of Spaces
Edward Gow-Smith
Harish Tayyar Madabushi
Carolina Scarton
Aline Villavicencio
89
21
0
08 Apr 2022
Are We Really Making Much Progress in Text Classification? A Comparative Review
Lukas Galke
Andor Diera
Bao Xin Lin
Bhakti Khera
Tim Meuser
Tushar Singhal
Fabian Karl
A. Scherp
VLM
89
4
0
08 Apr 2022
PharmMT: A Neural Machine Translation Approach to Simplify Prescription Directions
Jiazhao Li
Corey A. Lester
Xinyan Zhao
Yuting Ding
Yun Jiang
V. Vydiswaran
MedIm
57
13
0
08 Apr 2022
Autoencoding Language Model Based Ensemble Learning for Commonsense Validation and Explanation
Ngo Quang Huy
Tu Minh Phuong
Ngo Xuan Bach
LRM
63
5
0
07 Apr 2022
PALBERT: Teaching ALBERT to Ponder
Nikita Balagansky
Daniil Gavrilov
MoE
49
6
0
07 Apr 2022
Accelerating Attention through Gradient-Based Learned Runtime Pruning
Zheng Li
Soroush Ghodrati
Amir Yazdanbakhsh
H. Esmaeilzadeh
Mingu Kang
83
18
0
07 Apr 2022
The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems
Caleb Ziems
Jane A. Yu
Yi-Chia Wang
A. Halevy
Diyi Yang
87
97
0
06 Apr 2022
Paying More Attention to Self-attention: Improving Pre-trained Language Models via Attention Guiding
Shanshan Wang
Zhumin Chen
Zhaochun Ren
Huasheng Liang
Qiang Yan
Fajie Yuan
57
9
0
06 Apr 2022
Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency
Yanyang Li
Fuli Luo
Runxin Xu
Songfang Huang
Fei Huang
Liwei Wang
69
3
0
06 Apr 2022
Improved and Efficient Conversational Slot Labeling through Question Answering
Gabor Fuisz
Ivan Vulić
Samuel Gibbons
I. Casanueva
Paweł Budzianowski
91
11
0
05 Apr 2022
Fact Checking with Insufficient Evidence
Pepa Atanasova
J. Simonsen
Christina Lioma
Isabelle Augenstein
116
15
0
05 Apr 2022
MaxViT: Multi-Axis Vision Transformer
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
ViT
163
676
0
04 Apr 2022
Efficient comparison of sentence embeddings
Spyros Zoupanos
Stratis Kolovos
Athanasios Kanavos
Orestis Papadimitriou
M. Maragoudakis
35
12
0
02 Apr 2022
Multifaceted Improvements for Conversational Open-Domain Question Answering
Tingting Liang
Yixuan Jiang
Congying Xia
Ziqiang Zhao
Yuyu Yin
Philip S. Yu
39
4
0
01 Apr 2022
Syntax-informed Question Answering with Heterogeneous Graph Transformer
Fangyi Zhu
Lok You Tan
See-Kiong Ng
S. Bressan
99
3
0
01 Apr 2022
COOL, a Context Outlooker, and its Application to Question Answering and other Natural Language Processing Tasks
Fangyi Zhu
See-Kiong Ng
S. Bressan
LRM
58
1
0
01 Apr 2022
Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning
Ziyun Xu
Chengyu Wang
Minghui Qiu
Fuli Luo
Runxin Xu
Songfang Huang
Jun Huang
VLM
103
34
0
01 Apr 2022
Domain Adaptation for Sparse-Data Settings: What Do We Gain by Not Using Bert?
Marina Sedinkina
Martin Schmitt
Hinrich Schutze
36
1
0
31 Mar 2022
indic-punct: An automatic punctuation restoration and inverse text normalization framework for Indic languages
Anirudh Gupta
Neeraj Chhimwal
Ankur Dhuriya
Rishabh Gaur
Priyanshi Shah
Harveen Singh Chadha
Vivek Raghavan
24
4
0
31 Mar 2022
How Pre-trained Language Models Capture Factual Knowledge? A Causal-Inspired Analysis
Shaobo Li
Xiaoguang Li
Lifeng Shang
Zhenhua Dong
Chengjie Sun
Bingquan Liu
Zhenzhou Ji
Xin Jiang
Qun Liu
KELM
94
55
0
31 Mar 2022
Vakyansh: ASR Toolkit for Low Resource Indic languages
Harveen Singh Chadha
Anirudh Gupta
Priyanshi Shah
Neeraj Chhimwal
Ankur Dhuriya
Rishabh Gaur
Vivek Raghavan
29
17
0
30 Mar 2022
Dual Temperature Helps Contrastive Learning Without Many Negative Samples: Towards Understanding and Simplifying MoCo
Chaoning Zhang
Kang Zhang
T. Pham
Axi Niu
Zhinan Qiao
Chang D. Yoo
In So Kweon
117
57
0
30 Mar 2022
How Does SimSiam Avoid Collapse Without Negative Samples? A Unified Understanding with Self-supervised Contrastive Learning
Chaoning Zhang
Kang Zhang
Chenshuang Zhang
T. Pham
Chang D. Yoo
In So Kweon
SSL
106
74
0
30 Mar 2022
A Fast Transformer-based General-Purpose Lossless Compressor
Yushun Mao
Yufei Cui
Tei-Wei Kuo
Chun Jason Xue
ViT, AI4CE
89
34
0
30 Mar 2022
A Fast Post-Training Pruning Framework for Transformers
Woosuk Kwon
Sehoon Kim
Michael W. Mahoney
Joseph Hassoun
Kurt Keutzer
A. Gholami
113
157
0
29 Mar 2022
Discovering material information using hierarchical Reformer model on financial regulatory filings
Francois Mercier
Makesh Narsimhan
AIFin, AI4TS
22
0
0
28 Mar 2022
ANNA: Enhanced Language Representation for Question Answering
Changwook Jun
Hansol Jang
Myoseop Sim
Hyun Kim
Jooyoung Choi
Kyungkoo Min
Kyunghoon Bae
73
8
0
28 Mar 2022
Page 31 of 59