ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1909.11942
  4. Cited By
ALBERT: A Lite BERT for Self-supervised Learning of Language
  Representations

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
    SSL
    AIMat
ArXivPDFHTML

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,913 papers shown
Title
WangchanBERTa: Pretraining transformer-based Thai Language Models
WangchanBERTa: Pretraining transformer-based Thai Language Models
Lalita Lowphansirikul
Charin Polpanumas
Nawat Jantrakulchai
Sarana Nutanong
13
74
0
24 Jan 2021
Debiasing Pre-trained Contextualised Embeddings
Debiasing Pre-trained Contextualised Embeddings
Masahiro Kaneko
Danushka Bollegala
218
137
0
23 Jan 2021
Training Multilingual Pre-trained Language Model with Byte-level
  Subwords
Training Multilingual Pre-trained Language Model with Byte-level Subwords
Junqiu Wei
Qun Liu
Yinpeng Guo
Xin Jiang
33
19
0
23 Jan 2021
Distilling Large Language Models into Tiny and Effective Students using
  pQRNN
Distilling Large Language Models into Tiny and Effective Students using pQRNN
P. Kaliamoorthi
Aditya Siddhant
Edward Li
Melvin Johnson
MQ
21
17
0
21 Jan 2021
PalmTree: Learning an Assembly Language Model for Instruction Embedding
PalmTree: Learning an Assembly Language Model for Instruction Embedding
Xuezixiang Li
Qu Yu
Heng Yin
24
144
0
21 Jan 2021
Adv-OLM: Generating Textual Adversaries via OLM
Adv-OLM: Generating Textual Adversaries via OLM
Vijit Malik
A. Bhat
Ashutosh Modi
35
6
0
21 Jan 2021
Towards Confident Machine Reading Comprehension
Towards Confident Machine Reading Comprehension
Rishav Chakravarti
Avirup Sil
30
4
0
20 Jan 2021
Automatic punctuation restoration with BERT models
Automatic punctuation restoration with BERT models
A. Nagy
Bence Bial
Judit Ács
18
25
0
18 Jan 2021
Model Compression for Domain Adaptation through Causal Effect Estimation
Model Compression for Domain Adaptation through Causal Effect Estimation
Guy Rotman
Amir Feder
Roi Reichart
CML
14
7
0
18 Jan 2021
Red Alarm for Pre-trained Models: Universal Vulnerability to
  Neuron-Level Backdoor Attacks
Red Alarm for Pre-trained Models: Universal Vulnerability to Neuron-Level Backdoor Attacks
Zhengyan Zhang
Guangxuan Xiao
Yongwei Li
Tian Lv
Fanchao Qi
Zhiyuan Liu
Yasheng Wang
Xin Jiang
Maosong Sun
AAML
23
68
0
18 Jan 2021
Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for
  Low-resource Speech Recognition
Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for Low-resource Speech Recognition
Cheng Yi
Shiyu Zhou
Bo Xu
51
40
0
17 Jan 2021
Transformer-Based Models for Question Answering on COVID19
Transformer-Based Models for Question Answering on COVID19
Hillary Ngai
Yoona Park
John Chen
Mahboobeh Parsapoor
OOD
27
21
0
16 Jan 2021
To Understand Representation of Layer-aware Sequence Encoders as
  Multi-order-graph
To Understand Representation of Layer-aware Sequence Encoders as Multi-order-graph
Sufeng Duan
Hai Zhao
MILM
22
0
0
16 Jan 2021
Grid Search Hyperparameter Benchmarking of BERT, ALBERT, and LongFormer
  on DuoRC
Grid Search Hyperparameter Benchmarking of BERT, ALBERT, and LongFormer on DuoRC
Alex John Quijano
Sam Nguyen
Juanita Ordoñez
29
7
0
15 Jan 2021
Hostility Detection and Covid-19 Fake News Detection in Social Media
Hostility Detection and Covid-19 Fake News Detection in Social Media
Ayush Gupta
Rohan Sukumaran
Kevin John
Sundeep Teki
14
20
0
15 Jan 2021
KDLSQ-BERT: A Quantized Bert Combining Knowledge Distillation with
  Learned Step Size Quantization
KDLSQ-BERT: A Quantized Bert Combining Knowledge Distillation with Learned Step Size Quantization
Jing Jin
Cai Liang
Tiancheng Wu
Li Zou
Zhiliang Gan
MQ
19
26
0
15 Jan 2021
Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake
  News Detection
Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake News Detection
Ben Chen
Bin Chen
D. Gao
Qijin Chen
Chengfu Huo
Xiaonan Meng
Weijun Ren
Yang Zhou
41
40
0
14 Jan 2021
Of Non-Linearity and Commutativity in BERT
Of Non-Linearity and Commutativity in BERT
Sumu Zhao
Damian Pascual
Gino Brunner
Roger Wattenhofer
38
16
0
12 Jan 2021
Model Generalization on COVID-19 Fake News Detection
Model Generalization on COVID-19 Fake News Detection
Yejin Bang
Etsuko Ishii
Samuel Cahyawijaya
Ziwei Ji
Pascale Fung
55
36
0
11 Jan 2021
AT-BERT: Adversarial Training BERT for Acronym Identification Winning
  Solution for SDU@AAAI-21
AT-BERT: Adversarial Training BERT for Acronym Identification Winning Solution for SDU@AAAI-21
Danqing Zhu
Wangli Lin
Yang Zhang
Qiwei Zhong
Guanxiong Zeng
Weilin Wu
Jiayu Tang
31
17
0
11 Jan 2021
BERT & Family Eat Word Salad: Experiments with Text Understanding
BERT & Family Eat Word Salad: Experiments with Text Understanding
Ashim Gupta
Giorgi Kvernadze
Vivek Srikumar
211
73
0
10 Jan 2021
Political Depolarization of News Articles Using Attribute-aware Word
  Embeddings
Political Depolarization of News Articles Using Attribute-aware Word Embeddings
Ruibo Liu
Lili Wang
Chenyan Jia
Soroush Vosoughi
27
20
0
05 Jan 2021
I-BERT: Integer-only BERT Quantization
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
107
345
0
05 Jan 2021
Benchmarking Knowledge-Enhanced Commonsense Question Answering via
  Knowledge-to-Text Transformation
Benchmarking Knowledge-Enhanced Commonsense Question Answering via Knowledge-to-Text Transformation
Ning Bian
Xianpei Han
Bo Chen
Le Sun
ELM
11
43
0
04 Jan 2021
Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting
Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting
Wangchunshu Zhou
Tao Ge
Canwen Xu
Ke Xu
Furu Wei
LRM
16
15
0
02 Jan 2021
Which Linguist Invented the Lightbulb? Presupposition Verification for
  Question-Answering
Which Linguist Invented the Lightbulb? Presupposition Verification for Question-Answering
Najoung Kim
Ellie Pavlick
Burcu Karagol Ayan
Deepak Ramachandran
81
43
0
02 Jan 2021
RiddleSense: Reasoning about Riddle Questions Featuring Linguistic
  Creativity and Commonsense Knowledge
RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge
Bill Yuchen Lin
Ziyi Wu
Yichi Yang
Dong-Ho Lee
Xiang Ren
ReLM
LRM
249
64
0
02 Jan 2021
Subformer: Exploring Weight Sharing for Parameter Efficiency in
  Generative Transformers
Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers
Machel Reid
Edison Marrese-Taylor
Y. Matsuo
MoE
22
48
0
01 Jan 2021
BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource
  Language Understanding Evaluation in Bangla
BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla
Abhik Bhattacharjee
Tahmid Hasan
Wasi Uddin Ahmad
Kazi Samin Mubasshir
Md. Saiful Islam
Anindya Iqbal
M. Rahman
Rifat Shahriyar
SSL
VLM
33
166
0
01 Jan 2021
Transformer based Automatic COVID-19 Fake News Detection System
Transformer based Automatic COVID-19 Fake News Detection System
Sunil Gundapu
R. Mamidi
32
70
0
01 Jan 2021
NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons
  Learned
NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned
Sewon Min
Jordan L. Boyd-Graber
Chris Alberti
Danqi Chen
Eunsol Choi
...
Dmytro Okhonko
M. Schlichtkrull
Sonal Gupta
Yashar Mehdad
Wen-tau Yih
28
61
0
01 Jan 2021
WARP: Word-level Adversarial ReProgramming
WARP: Word-level Adversarial ReProgramming
Karen Hambardzumyan
Hrant Khachatrian
Jonathan May
AAML
254
342
0
01 Jan 2021
Towards Modelling Coherence in Spoken Discourse
Towards Modelling Coherence in Spoken Discourse
Rajaswa Patil
Yaman Kumar Singla
R. Shah
Mika Hama
Roger Zimmermann
AuLLM
15
8
0
31 Dec 2020
BinaryBERT: Pushing the Limit of BERT Quantization
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
145
221
0
31 Dec 2020
Better Robustness by More Coverage: Adversarial Training with Mixup
  Augmentation for Robust Fine-tuning
Better Robustness by More Coverage: Adversarial Training with Mixup Augmentation for Robust Fine-tuning
Chenglei Si
Zhengyan Zhang
Fanchao Qi
Zhiyuan Liu
Yasheng Wang
Qun Liu
Maosong Sun
AAML
SILM
25
68
0
31 Dec 2020
Seeing is Knowing! Fact-based Visual Question Answering using Knowledge
  Graph Embeddings
Seeing is Knowing! Fact-based Visual Question Answering using Knowledge Graph Embeddings
Kiran Ramnath
M. Hasegawa-Johnson
19
9
0
31 Dec 2020
CLEAR: Contrastive Learning for Sentence Representation
CLEAR: Contrastive Learning for Sentence Representation
Zhuofeng Wu
Sinong Wang
Jiatao Gu
Madian Khabsa
Fei Sun
Hao Ma
SSL
33
320
0
31 Dec 2020
An Experimental Evaluation of Transformer-based Language Models in the
  Biomedical Domain
An Experimental Evaluation of Transformer-based Language Models in the Biomedical Domain
Paul Grouchy
Shobhit Jain
Michael Liu
Kuhan Wang
Max Tian
Nidhi Arora
Hillary Ngai
Faiza Khan Khattak
Elham Dolatabadi
S. Kocak
LM&MA
MedIm
19
4
0
31 Dec 2020
Optimizing Deeper Transformers on Small Datasets
Optimizing Deeper Transformers on Small Datasets
Peng Xu
Dhruv Kumar
Wei Yang
Wenjie Zi
Keyi Tang
Chenyang Huang
Jackie C.K. Cheung
S. Prince
Yanshuai Cao
AI4CE
24
69
0
30 Dec 2020
SemGloVe: Semantic Co-occurrences for GloVe from BERT
SemGloVe: Semantic Co-occurrences for GloVe from BERT
Leilei Gan
Zhiyang Teng
Yue Zhang
Linchao Zhu
Fei Wu
Yi Yang
14
16
0
30 Dec 2020
Out of Order: How Important Is The Sequential Order of Words in a
  Sentence in Natural Language Understanding Tasks?
Out of Order: How Important Is The Sequential Order of Words in a Sentence in Natural Language Understanding Tasks?
Thang M. Pham
Trung Bui
Long Mai
Anh Totti Nguyen
220
122
0
30 Dec 2020
ERICA: Improving Entity and Relation Understanding for Pre-trained
  Language Models via Contrastive Learning
ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning
Yujia Qin
Yankai Lin
Ryuichi Takanobu
Zhiyuan Liu
Peng Li
Heng Ji
Minlie Huang
Maosong Sun
Jie Zhou
60
125
0
30 Dec 2020
CMV-BERT: Contrastive multi-vocab pretraining of BERT
Wei-wei Zhu
Daniel Cheung
SSL
VLM
19
0
0
29 Dec 2020
Code Summarization with Structure-induced Transformer
Code Summarization with Structure-induced Transformer
Hongqiu Wu
Hai Zhao
Min Zhang
41
84
0
29 Dec 2020
Universal Sentence Representation Learning with Conditional Masked
  Language Model
Universal Sentence Representation Learning with Conditional Masked Language Model
Ziyi Yang
Yinfei Yang
Daniel Cer
Jax Law
Eric F. Darve
SSL
24
57
0
28 Dec 2020
BURT: BERT-inspired Universal Representation from Learning Meaningful
  Segment
BURT: BERT-inspired Universal Representation from Learning Meaningful Segment
Yian Li
Hai Zhao
SSL
21
0
0
28 Dec 2020
TransPose: Keypoint Localization via Transformer
TransPose: Keypoint Localization via Transformer
Sen Yang
Zhibin Quan
Mu Nie
Wankou Yang
ViT
145
263
0
28 Dec 2020
SG-Net: Syntax Guided Transformer for Language Representation
SG-Net: Syntax Guided Transformer for Language Representation
Zhuosheng Zhang
Yuwei Wu
Junru Zhou
Sufeng Duan
Hai Zhao
Rui Wang
51
36
0
27 Dec 2020
ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic
ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic
Muhammad Abdul-Mageed
AbdelRahim Elmadany
El Moatez Billah Nagoudi
VLM
62
451
0
27 Dec 2020
Towards a Universal Continuous Knowledge Base
Towards a Universal Continuous Knowledge Base
Gang Chen
Maosong Sun
Yang Liu
28
3
0
25 Dec 2020
Previous
123...464748...575859
Next