ResearchTrend.AI

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
    SSL, AIMat
ArXiv (abs) · PDF · HTML · Github (3271★)

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,935 papers shown
Unitary Multi-Margin BERT for Robust Natural Language Processing
Hao-Yuan Chang
Kang L. Wang
AAML
49
0
0
16 Oct 2024
FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression
Zhenheng Tang
Xueze Kang
Yiming Yin
Xinglin Pan
Yuxin Wang
...
Shaohuai Shi
Amelie Chi Zhou
Bo Li
Bingsheng He
Xiaowen Chu
AI4CE
109
8
0
16 Oct 2024
Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models
Kai Yao
P. Gao
Lichun Li
Yuan Zhao
Xiaofeng Wang
Wei Wang
Jianke Zhu
56
2
0
15 Oct 2024
TSDS: Data Selection for Task-Specific Model Finetuning
Zifan Liu
Amin Karbasi
Theodoros Rekatsinas
78
6
0
15 Oct 2024
Arrhythmia Classification Using Graph Neural Networks Based on Correlation Matrix
Seungwoo Han
72
0
0
14 Oct 2024
Lambda-Skip Connections: the architectural component that prevents Rank Collapse
Federico Arangath Joseph
Jerome Sieber
Melanie Zeilinger
Carmen Amo Alonso
222
0
0
14 Oct 2024
Mental Disorders Detection in the Era of Large Language Models
Gleb Kuzmin
Petr Strepetov
Maksim Stankevich
Artem Shelmanov
Ivan Smirnov
43
1
0
09 Oct 2024
Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy
Tong Wu
Shujian Zhang
Kaiqiang Song
Silei Xu
Sanqiang Zhao
Ravi Agrawal
Sathish Indurthi
Chong Xiang
Prateek Mittal
Wenxuan Zhou
112
14
0
09 Oct 2024
Towards the generation of hierarchical attack models from cybersecurity vulnerabilities using language models
Kacper Sowka
Vasile Palade
Xiaorui Jiang
Hesam Jadidbonab
87
1
0
07 Oct 2024
Computational design of target-specific linear peptide binders with TransformerBeta
Haowen Zhao
Francesco A. Aprile
Barbara Bravi
77
0
0
07 Oct 2024
Regularized Neural Ensemblers
Sebastian Pineda Arango
Maciej Janowski
Lennart Purucker
Arber Zela
Frank Hutter
Josif Grabocka
UQCV
95
0
0
06 Oct 2024
Variational Language Concepts for Interpreting Foundation Language Models
Hengyi Wang
Shiwei Tan
Zhiqing Hong
Desheng Zhang
Hao Wang
144
3
0
04 Oct 2024
Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding with LLMs
Wei Wu
Chao Wang
L. Chen
Mingze Yin
Yiheng Zhu
Kun Fu
Jieping Ye
Hui Xiong
Zheng Wang
141
1
0
04 Oct 2024
Demystifying the Token Dynamics of Deep Selective State Space Models
Thieu N. Vo
Tung D. Pham
Xin T. Tong
Tan Minh Nguyen
Mamba
131
0
0
04 Oct 2024
Geometry is All You Need: A Unified Taxonomy of Matrix and Tensor Factorization for Compression of Generative Language Models
Mingxue Xu
Sadia Sharmin
Danilo Mandic
72
2
0
03 Oct 2024
Morphological evaluation of subwords vocabulary used by BETO language model
Óscar García-Sierra
Ana Fernández-Pampillón Cesteros
Miguel Ortega-Martín
61
0
0
03 Oct 2024
DeIDClinic: A Multi-Layered Framework for De-identification of Clinical Free-text Data
Angel Paul
Dhivin Shaji
Lifeng Han
Warren Del-Pinto
Goran Nenadic
OOD
62
0
0
02 Oct 2024
DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models
Yuxuan Zhang
Ruizhe Li
MoMe
199
2
0
02 Oct 2024
On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding
Kevin Xu
Issei Sato
120
4
0
02 Oct 2024
Depression detection in social media posts using transformer-based models and auxiliary features
Marios Kerasiotis
Loukas Ilias
D. Askounis
79
6
0
30 Sep 2024
FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models
Yucheng Xie
Fu Feng
Ruixiao Shi
Jing Wang
Xin Geng
AI4CE
70
3
0
28 Sep 2024
On the Inductive Bias of Stacking Towards Improving Reasoning
Nikunj Saunshi
Stefani Karp
Shankar Krishnan
Sobhan Miryoosefi
Sashank J. Reddi
Sanjiv Kumar
LRM, AI4CE
86
7
0
27 Sep 2024
Meta-RTL: Reinforcement-Based Meta-Transfer Learning for Low-Resource Commonsense Reasoning
Yu Fu
Jie He
Yifan Yang
Qun Liu
Deyi Xiong
OffRL, LRM
105
0
0
27 Sep 2024
DisGeM: Distractor Generation for Multiple Choice Questions with Span Masking
Devrim Cavusoglu
Secil Sen
Ulas Sert
63
0
0
26 Sep 2024
Integrating Hierarchical Semantic into Iterative Generation Model for Entailment Tree Explanation
Qin Wang
Jianzhou Feng
Yiming Xu
55
0
0
26 Sep 2024
SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion
Ming Dai
Lingfeng Yang
Yihao Xu
Zhenhua Feng
Wankou Yang
ObjD
125
13
0
26 Sep 2024
Pre-trained Language Models Return Distinguishable Probability Distributions to Unfaithfully Hallucinated Texts
Taehun Cha
Donghun Lee
HILM
62
1
0
25 Sep 2024
dnaGrinder: a lightweight and high-capacity genomic foundation model
Qihang Zhao
Chi Zhang
Weixiong Zhang
53
0
0
24 Sep 2024
ToxiCraft: A Novel Framework for Synthetic Generation of Harmful Information
Zheng Hui
Zhaoxiao Guo
Hang Zhao
Juanyong Duan
Congrui Huang
145
7
0
23 Sep 2024
Data-centric NLP Backdoor Defense from the Lens of Memorization
Zhenting Wang
Zhizhi Wang
Mingyu Jin
Mengnan Du
Juan Zhai
Shiqing Ma
86
3
0
21 Sep 2024
Normalized Narrow Jump To Conclusions: Normalized Narrow Shortcuts for Parameter Efficient Early Exit Transformer Prediction
Amrit Diggavi Seshadri
50
1
0
21 Sep 2024
FAMOUS: Flexible Accelerator for the Attention Mechanism of Transformer on UltraScale+ FPGAs
Ehsan Kabir
Md. Arafat Kabir
Austin R. J. Downey
Jason D. Bakos
David Andrews
Miaoqing Huang
GNN
66
0
0
21 Sep 2024
Profiling Patient Transcript Using Large Language Model Reasoning Augmentation for Alzheimer's Disease Detection
Chin-Po Chen
Jeng-Lin Li
LM&MA
28
1
0
19 Sep 2024
Evaluation of pretrained language models on music understanding
Yannis Vasilakis
Rachel M. Bittner
Johan Pauwels
99
1
0
17 Sep 2024
OneEncoder: A Lightweight Framework for Progressive Alignment of Modalities
Bilal Faye
Hanane Azzag
M. Lebbah
ObjD
105
0
0
17 Sep 2024
Towards Data-Centric RLHF: Simple Metrics for Preference Dataset Comparison
Judy Hanwen Shen
Archit Sharma
Jun Qin
70
5
0
15 Sep 2024
Deep Fast Machine Learning Utils: A Python Library for Streamlined Machine Learning Prototyping
Fabi Prezja
AI4CE
68
0
0
14 Sep 2024
Multi-intent Aware Contrastive Learning for Sequential Recommendation
Junshu Huang
Zi Long
Xianghua Fu
Yin Chen
HAI
34
0
0
13 Sep 2024
A BERT-Based Summarization approach for depression detection
Hossein Salahshoor Gavalan
Mohmmad Naim Rastgoo
Bahareh Nakisa
63
2
0
13 Sep 2024
TheraGen: Therapy for Every Generation
Kartikey Doshi
Jimit Shah
Narendra Shekokar
AI4MH
53
0
0
12 Sep 2024
Enhancing adversarial robustness in Natural Language Inference using explanations
Alexandros Koulakos
Maria Lymperaiou
Giorgos Filandrianos
Giorgos Stamou
SILM, AAML
132
2
0
11 Sep 2024
DA-MoE: Towards Dynamic Expert Allocation for Mixture-of-Experts Models
Maryam Akhavan Aghdam
Hongpeng Jin
Yanzhao Wu
MoE
63
3
0
10 Sep 2024
DetoxBench: Benchmarking Large Language Models for Multitask Fraud & Abuse Detection
Joymallya Chakraborty
Wei Xia
Anirban Majumder
Dan Ma
Walid Chaabene
Naveed Janvekar
49
3
0
09 Sep 2024
Application Specific Compression of Deep Learning Models
Rohit Raj Rai
Angana Borah
Amit Awekar
48
0
0
09 Sep 2024
Driving with Prior Maps: Unified Vector Prior Encoding for Autonomous Vehicle Mapping
Shuang Zeng
Xinyuan Chang
Xinran Liu
Zheng Pan
Xing Wei
126
3
0
09 Sep 2024
Expanding Expressivity in Transformer Models with MöbiusAttention
Anna-Maria Halacheva
M. Nayyeri
Steffen Staab
78
1
0
08 Sep 2024
Achieving Peak Performance for Large Language Models: A Systematic Review
Z. R. K. Rostam
Sándor Szénási
Gábor Kertész
95
5
0
07 Sep 2024
An Effective Deployment of Diffusion LM for Data Augmentation in Low-Resource Sentiment Classification
Zhuowei Chen
Lianxi Wang
Yuben Wu
Xinfeng Liao
Yujia Tian
Junyang Zhong
DiffM
107
1
0
05 Sep 2024
Pre-Trained Language Models for Keyphrase Prediction: A Review
Muhammad Umair
Tangina Sultana
Young-Koo Lee
80
4
0
02 Sep 2024
From Prediction to Application: Language Model-based Code Knowledge Tracing with Domain Adaptive Pre-Training and Automatic Feedback System with Pedagogical Prompting for Comprehensive Programming Education
Unggi Lee
Jiyeong Bae
Yeonji Jung
Minji Kang
Gyuri Byun
...
Sookbun Lee
Jaekwon Park
Taekyung Ahn
Gunho Lee
Hyeoncheol Kim
AI4Ed, KELM
83
1
0
31 Aug 2024