ResearchTrend.AI
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL, AIMat
ArXiv (abs) | PDF | HTML | GitHub (3271★)

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,935 papers shown
Title
RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge
Bill Yuchen Lin
Ziyi Wu
Yichi Yang
Dong-Ho Lee
Xiang Ren
ReLM, LRM
284
68
0
02 Jan 2021
Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers
Machel Reid
Edison Marrese-Taylor
Y. Matsuo
MoE
108
48
0
01 Jan 2021
BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla
Abhik Bhattacharjee
Tahmid Hasan
Wasi Uddin Ahmad
Kazi Samin Mubasshir
Md. Saiful Islam
Anindya Iqbal
M. Rahman
Rifat Shahriyar
SSL, VLM
101
180
0
01 Jan 2021
Transformer based Automatic COVID-19 Fake News Detection System
Sunil Gundapu
R. Mamidi
91
71
0
01 Jan 2021
NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned
Sewon Min
Jordan L. Boyd-Graber
Chris Alberti
Danqi Chen
Eunsol Choi
...
Dmytro Okhonko
Michael Schlichtkrull
Sonal Gupta
Yashar Mehdad
Wen-tau Yih
81
62
0
01 Jan 2021
WARP: Word-level Adversarial ReProgramming
Karen Hambardzumyan
Hrant Khachatrian
Jonathan May
AAML
349
354
0
01 Jan 2021
Towards Modelling Coherence in Spoken Discourse
Rajaswa Patil
Yaman Kumar Singla
R. Shah
Mika Hama
Roger Zimmermann
AuLLM
123
8
0
31 Dec 2020
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
230
227
0
31 Dec 2020
Better Robustness by More Coverage: Adversarial Training with Mixup Augmentation for Robust Fine-tuning
Chenglei Si
Zhengyan Zhang
Fanchao Qi
Zhiyuan Liu
Yasheng Wang
Qun Liu
Maosong Sun
AAML, SILM
113
69
0
31 Dec 2020
Seeing is Knowing! Fact-based Visual Question Answering using Knowledge Graph Embeddings
Kiran Ramnath
M. Hasegawa-Johnson
66
9
0
31 Dec 2020
CLEAR: Contrastive Learning for Sentence Representation
Zhuofeng Wu
Sinong Wang
Jiatao Gu
Madian Khabsa
Fei Sun
Hao Ma
SSL
82
324
0
31 Dec 2020
An Experimental Evaluation of Transformer-based Language Models in the Biomedical Domain
Paul Grouchy
Shobhit Jain
Michael Liu
Kuhan Wang
Max Tian
Nidhi Arora
Hillary Ngai
Faiza Khan Khattak
Elham Dolatabadi
S. Kocak
LM&MA, MedIm
109
4
0
31 Dec 2020
Optimizing Deeper Transformers on Small Datasets
Peng Xu
Dhruv Kumar
Wei Yang
Wenjie Zi
Keyi Tang
Chenyang Huang
Jackie C.K. Cheung
S. Prince
Yanshuai Cao
AI4CE
113
69
0
30 Dec 2020
SemGloVe: Semantic Co-occurrences for GloVe from BERT
Leilei Gan
Zhiyang Teng
Yue Zhang
Linchao Zhu
Fei Wu
Yi Yang
60
17
0
30 Dec 2020
Out of Order: How Important Is The Sequential Order of Words in a Sentence in Natural Language Understanding Tasks?
Thang M. Pham
Trung Bui
Long Mai
Anh Totti Nguyen
289
123
0
30 Dec 2020
CMV-BERT: Contrastive multi-vocab pretraining of BERT
Wei-wei Zhu
Daniel Cheung
SSL, VLM
72
0
0
29 Dec 2020
Code Summarization with Structure-induced Transformer
Hongqiu Wu
Hai Zhao
Min Zhang
75
88
0
29 Dec 2020
Universal Sentence Representation Learning with Conditional Masked Language Model
Ziyi Yang
Yinfei Yang
Daniel Cer
Jax Law
Eric F. Darve
SSL
84
58
0
28 Dec 2020
BURT: BERT-inspired Universal Representation from Learning Meaningful Segment
Yian Li
Hai Zhao
SSL
37
0
0
28 Dec 2020
TransPose: Keypoint Localization via Transformer
Sen Yang
Zhibin Quan
Mu Nie
Wankou Yang
ViT
205
270
0
28 Dec 2020
SG-Net: Syntax Guided Transformer for Language Representation
Zhuosheng Zhang
Yuwei Wu
Junru Zhou
Sufeng Duan
Hai Zhao
Rui Wang
125
38
0
27 Dec 2020
ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic
Muhammad Abdul-Mageed
AbdelRahim Elmadany
El Moatez Billah Nagoudi
VLM
131
465
0
27 Dec 2020
Towards a Universal Continuous Knowledge Base
Gang Chen
Maosong Sun
Yang Liu
55
3
0
25 Dec 2020
QUACKIE: A NLP Classification Task With Ground Truth Explanations
Yves Rychener
X. Renard
Djamé Seddah
P. Frossard
Marcin Detyniecki
34
3
0
24 Dec 2020
A Survey on Visual Transformer
Kai Han
Yunhe Wang
Hanting Chen
Xinghao Chen
Jianyuan Guo
...
Chunjing Xu
Yixing Xu
Zhaohui Yang
Yiman Zhang
Dacheng Tao
ViT
233
2,278
0
23 Dec 2020
Multi-Head Self-Attention with Role-Guided Masks
Dongsheng Wang
Casper Hansen
Lucas Chaves Lima
Christian B. Hansen
Maria Maistro
J. Simonsen
Christina Lioma
53
2
0
22 Dec 2020
Confronting Abusive Language Online: A Survey from the Ethical and Human Rights Perspective
S. Kiritchenko
I. Nejadgholi
Kathleen C. Fraser
AILaw
113
89
0
22 Dec 2020
Undivided Attention: Are Intermediate Layers Necessary for BERT?
S. N. Sridhar
Anthony Sarah
66
15
0
22 Dec 2020
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning
Armen Aghajanyan
Luke Zettlemoyer
Sonal Gupta
110
577
1
22 Dec 2020
RealFormer: Transformer Likes Residual Attention
Ruining He
Anirudh Ravula
Bhargav Kanagal
Joshua Ainslie
76
110
0
21 Dec 2020
Sub-Linear Memory: How to Make Performers SLiM
Valerii Likhosherstov
K. Choromanski
Jared Davis
Xingyou Song
Adrian Weller
68
19
0
21 Dec 2020
A Graph Reasoning Network for Multi-turn Response Selection via Customized Pre-training
Yongkang Liu
Shi Feng
Daling Wang
Kaisong Song
Feiliang Ren
Yifei Zhang
LRM
50
21
0
21 Dec 2020
Adaptive Bi-directional Attention: Exploring Multi-Granularity Representations for Machine Reading Comprehension
Nuo Chen
Fenglin Liu
Chenyu You
Peilin Zhou
Yuexian Zou
77
31
0
20 Dec 2020
Exploring Fluent Query Reformulations with Text-to-Text Transformers and Reinforcement Learning
Jerry Zikun Chen
S. Yu
Haoran Wang
444
5
0
18 Dec 2020
BERT Goes Shopping: Comparing Distributional Models for Product Representations
Federico Bianchi
Bingqing Yu
Jacopo Tagliabue
60
15
0
17 Dec 2020
MASKER: Masked Keyword Regularization for Reliable Text Classification
S. Moon
Sangwoo Mo
Kimin Lee
Jaeho Lee
Jinwoo Shin
120
38
0
17 Dec 2020
Costs to Consider in Adopting NLP for Your Business
Made Nindyatama Nityasya
Haryo Akbarianto Wibowo
Radityo Eko Prasojo
Alham Fikri Aji
VLM
48
3
0
16 Dec 2020
A Lightweight Neural Model for Biomedical Entity Linking
Lihu Chen
Gaël Varoquaux
Fabian M. Suchanek
MedIm
62
32
0
16 Dec 2020
Pre-Training Transformers as Energy-Based Cloze Models
Kevin Clark
Minh-Thang Luong
Quoc V. Le
Christopher D. Manning
77
80
0
15 Dec 2020
StackRec: Efficient Training of Very Deep Sequential Recommender Models by Iterative Stacking
Jiachun Wang
Fajie Yuan
Jian Chen
Qingyao Wu
Min Yang
Yang Sun
Guoxiao Zhang
BDL
99
26
0
14 Dec 2020
Parameter-Efficient Transfer Learning with Diff Pruning
Demi Guo
Alexander M. Rush
Yoon Kim
92
406
0
14 Dec 2020
MiniVLM: A Smaller and Faster Vision-Language Model
Jianfeng Wang
Xiaowei Hu
Pengchuan Zhang
Xiujun Li
Lijuan Wang
Lefei Zhang
Jianfeng Gao
Zicheng Liu
VLM, MLLM
133
60
0
13 Dec 2020
Reinforced Multi-Teacher Selection for Knowledge Distillation
Fei Yuan
Linjun Shou
J. Pei
Wutao Lin
Ming Gong
Yan Fu
Daxin Jiang
71
124
0
11 Dec 2020
Improving Task-Agnostic BERT Distillation with Layer Mapping Search
Xiaoqi Jiao
Huating Chang
Yichun Yin
Lifeng Shang
Xin Jiang
Xiao Chen
Linlin Li
Fang Wang
Qun Liu
49
12
0
11 Dec 2020
GNN-XML: Graph Neural Networks for Extreme Multi-label Text Classification
Daoming Zong
Shiliang Sun
41
9
0
10 Dec 2020
Know Your Limits: Uncertainty Estimation with ReLU Classifiers Fails at Reliable OOD Detection
Dennis Ulmer
Giovanni Cina
OODD
141
33
0
09 Dec 2020
Label Confusion Learning to Enhance Text Classification Models
Biyang Guo
Songqiao Han
Xiao Han
Hailiang Huang
Ting Lu
108
69
0
09 Dec 2020
Fusing Context Into Knowledge Graph for Commonsense Question Answering
Yichong Xu
Chenguang Zhu
Ruochen Xu
Yang Liu
Michael Zeng
Xuedong Huang
82
72
0
09 Dec 2020
Unsupervised Label Refinement Improves Dataless Text Classification
Zewei Chu
K. Stratos
Kevin Gimpel
67
15
0
08 Dec 2020
Parameter Efficient Multimodal Transformers for Video Representation Learning
Sangho Lee
Youngjae Yu
Gunhee Kim
Thomas Breuel
Jan Kautz
Yale Song
ViT
104
78
0
08 Dec 2020