ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,920 papers shown
Decomposing Complex Questions Makes Multi-Hop QA Easier and More Interpretable
Ruiliu Fu
Han Wang
Xuejun Zhang
Jun Zhou
Yonghong Yan
ReLM
22
32
0
26 Oct 2021
Understanding the Role of Self-Supervised Learning in Out-of-Distribution Detection Task
Jiuhai Chen
Chen Zhu
Bin Dai
OODD
54
3
0
26 Oct 2021
The Efficiency Misnomer
Daoyuan Chen
Liuyi Yao
Dawei Gao
Ashish Vaswani
Yaliang Li
62
100
0
25 Oct 2021
CLLD: Contrastive Learning with Label Distance for Text Classification
Jinhe Lan
Qingyuan Zhan
Chenhao Jiang
Kunping Yuan
Desheng Wang
VLM
39
2
0
25 Oct 2021
Alignment Attention by Matching Key and Query Distributions
Shujian Zhang
Xinjie Fan
Huangjie Zheng
Korawat Tanwisuth
Mingyuan Zhou
OOD
47
10
0
25 Oct 2021
ListReader: Extracting List-form Answers for Opinion Questions
Peng Cui
Dongyao Hu
Le Hu
RALM
21
2
0
22 Oct 2021
Vis-TOP: Visual Transformer Overlay Processor
Wei Hu
Dian Xu
Zimeng Fan
Fang Liu
Yanxiang He
BDL
ViT
59
5
0
21 Oct 2021
Overview of the 2021 Key Point Analysis Shared Task
Roni Friedman
Lena Dankin
Yufang Hou
R. Aharonov
Yoav Katz
Noam Slonim
24
22
0
20 Oct 2021
Ranking and Tuning Pre-trained Models: A New Paradigm for Exploiting Model Hubs
Kaichao You
Yong Liu
Ziyang Zhang
Jianmin Wang
Michael I. Jordan
Mingsheng Long
124
32
0
20 Oct 2021
Interpreting Deep Learning Models in Natural Language Processing: A Review
Xiaofei Sun
Diyi Yang
Xiaoya Li
Tianwei Zhang
Yuxian Meng
Han Qiu
Guoyin Wang
Eduard H. Hovy
Jiwei Li
26
45
0
20 Oct 2021
JavaBERT: Training a transformer-based model for the Java programming language
Nelson Tavares de Sousa
Wilhelm Hasselbring
11
12
0
20 Oct 2021
Distributionally Robust Classifiers in Sentiment Analysis
Shilun Li
Renee Li
Carina Zhang
OOD
9
0
0
20 Oct 2021
Ensemble ALBERT on SQuAD 2.0
Shilun Li
Renee Li
Veronica Peng
MoE
19
6
0
19 Oct 2021
Comparing Deep Neural Nets with UMAP Tour
Mingwei Li
C. Scheidegger
FAtt
39
1
0
18 Oct 2021
Contextual Hate Speech Detection in Code Mixed Text using Transformer Based Approaches
Ravindra Nayak
Raviraj Joshi
30
20
0
18 Oct 2021
Energon: Towards Efficient Acceleration of Transformers Using Dynamic Sparse Attention
Zhe Zhou
Junling Liu
Zhenyu Gu
Guangyu Sun
64
43
0
18 Oct 2021
Deep Transfer Learning & Beyond: Transformer Language Models in Information Systems Research
Ross Gruetzemacher
D. Paradice
41
31
0
18 Oct 2021
Quantifying the Task-Specific Information in Text-Based Classifications
Zining Zhu
Aparna Balagopalan
Marzyeh Ghassemi
Frank Rudzicz
49
4
0
17 Oct 2021
Analyzing Dynamic Adversarial Training Data in the Limit
Eric Wallace
Adina Williams
Robin Jia
Douwe Kiela
202
30
0
16 Oct 2021
Leveraging Knowledge in Multilingual Commonsense Reasoning
Yuwei Fang
Shuohang Wang
Yichong Xu
Ruochen Xu
Siqi Sun
Chenguang Zhu
Michael Zeng
LRM
245
17
0
16 Oct 2021
Knowledge Enhanced Pretrained Language Models: A Compreshensive Survey
Xiaokai Wei
Shen Wang
Dejiao Zhang
Parminder Bhatia
Andrew O. Arnold
KELM
36
46
0
16 Oct 2021
Detecting Gender Bias in Transformer-based Models: A Case Study on BERT
Bingbing Li
Hongwu Peng
Rajat Sainju
Junhuan Yang
Lei Yang
Yueying Liang
Weiwen Jiang
Binghui Wang
Hang Liu
Caiwen Ding
32
12
0
15 Oct 2021
The World of an Octopus: How Reporting Bias Influences a Language Model's Perception of Color
Cory Paik
Stéphane Aroca-Ouellette
Alessandro Roncone
Katharina Kann
16
35
0
15 Oct 2021
mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models
Ryokan Ri
Ikuya Yamada
Yoshimasa Tsuruoka
41
30
0
15 Oct 2021
Structural Characterization for Dialogue Disentanglement
Xinbei Ma
Zhuosheng Zhang
Hai Zhao
28
16
0
15 Oct 2021
Transformer-based Multi-task Learning for Disaster Tweet Categorisation
Congcong Wang
P. Nulty
David Lillis
44
15
0
15 Oct 2021
Tracing Origins: Coreference-aware Machine Reading Comprehension
Baorong Huang
Zhuosheng Zhang
Hai Zhao
44
5
0
15 Oct 2021
SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer
Tu Vu
Brian Lester
Noah Constant
Rami Al-Rfou
Daniel Cer
VLM
LRM
145
279
0
15 Oct 2021
Fusing Heterogeneous Factors with Triaffine Mechanism for Nested Named Entity Recognition
Zheng Yuan
Chuanqi Tan
Songfang Huang
Fei Huang
81
46
0
14 Oct 2021
Building Chinese Biomedical Language Models via Multi-Level Text Discrimination
Quan Wang
Songtai Dai
Benfeng Xu
Yajuan Lyu
Yong Zhu
Hua Wu
Haifeng Wang
71
14
0
14 Oct 2021
Transformer over Pre-trained Transformer for Neural Text Segmentation with Enhanced Topic Coherence
Kelvin Lo
Yuan Jin
Weicong Tan
Ming Liu
Lan Du
Wray Buntine
24
39
0
14 Oct 2021
bert2BERT: Towards Reusable Pretrained Language Models
Cheng Chen
Yichun Yin
Lifeng Shang
Xin Jiang
Yujia Qin
Fengyu Wang
Zhi Wang
Xiao Chen
Zhiyuan Liu
Qun Liu
VLM
34
60
0
14 Oct 2021
Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer
Fanchao Qi
Yangyi Chen
Xurui Zhang
Mukai Li
Zhiyuan Liu
Maosong Sun
AAML
SILM
84
177
0
14 Oct 2021
Towards Efficient NLP: A Standard Evaluation and A Strong Baseline
Xiangyang Liu
Tianxiang Sun
Junliang He
Jiawen Wu
Lingling Wu
Xinyu Zhang
Hao Jiang
Bo Zhao
Xuanjing Huang
Xipeng Qiu
ELM
33
46
0
13 Oct 2021
Automated Essay Scoring Using Transformer Models
Sabrina Ludwig
Christian W. F. Mayer
Christopher Hansen
Kerstin Eilers
Steffen Brandt
27
40
0
13 Oct 2021
Leveraging redundancy in attention with Reuse Transformers
Srinadh Bhojanapalli
Ayan Chakrabarti
Andreas Veit
Michal Lukasik
Himanshu Jain
Frederick Liu
Yin-Wen Chang
Sanjiv Kumar
31
23
0
13 Oct 2021
Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese
Zhuosheng Zhang
Hanqing Zhang
Keming Chen
Yuhang Guo
Jingyun Hua
Yulong Wang
Ming Zhou
VLM
59
71
0
13 Oct 2021
EventBERT: A Pre-Trained Model for Event Correlation Reasoning
Yucheng Zhou
Xiubo Geng
Tao Shen
Guodong Long
Daxin Jiang
44
48
0
13 Oct 2021
MMIU: Dataset for Visual Intent Understanding in Multimodal Assistants
Alkesh Patel
Joel Ruben Antony Moniz
R. Nguyen
Nicholas Tzou
Hadas Kotek
Vincent Renkens
VGen
21
1
0
13 Oct 2021
Attention-guided Generative Models for Extractive Question Answering
Peng Xu
Davis Liang
Zhiheng Huang
Bing Xiang
43
18
0
12 Oct 2021
Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition
Li-Wei Chen
Alexander I. Rudnicky
VLM
33
124
0
12 Oct 2021
Rome was built in 1776: A Case Study on Factual Correctness in Knowledge-Grounded Response Generation
Sashank Santhanam
Behnam Hedayatnia
Spandana Gella
Aishwarya Padmakumar
Seokhwan Kim
Yang Liu
Dilek Z. Hakkani-Tür
49
35
0
11 Oct 2021
Pre-trained Language Models in Biomedical Domain: A Systematic Survey
Benyou Wang
Qianqian Xie
Jiahuan Pei
Zhihong Chen
Prayag Tiwari
Zhao Li
Jie Fu
LM&MA
AI4CE
42
166
0
11 Oct 2021
Advances in Multi-turn Dialogue Comprehension: A Survey
Zhuosheng Zhang
Hai Zhao
47
21
0
11 Oct 2021
CoRGi: Content-Rich Graph Neural Networks with Attention
Jooyeon Kim
A. Lamb
Simon Woodhead
Simon L. Peyton Jones
Cheng Zheng
Miltiadis Allamanis
44
6
0
10 Oct 2021
A Framework for Rationale Extraction for Deep QA models
Sahana Ramnath
Preksha Nema
Deep Sahni
Mitesh M. Khapra
AAML
FAtt
30
0
0
09 Oct 2021
Paperswithtopic: Topic Identification from Paper Title Only
Daehyun Cho
C. Wallraven
29
0
0
09 Oct 2021
Local and Global Context-Based Pairwise Models for Sentence Ordering
R. Manku
A. Paul
47
3
0
08 Oct 2021
Active learning for interactive satellite image change detection
H. Sahbi
Sebastien Deschamps
Andrei Stoian
53
6
0
08 Oct 2021
M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining
Junyang Lin
An Yang
Jinze Bai
Chang Zhou
Le Jiang
...
Jie Zhang
Yong Li
Wei Lin
Jingren Zhou
Hongxia Yang
MoE
92
43
0
08 Oct 2021