Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1909.11942
Cited By
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
Re-assign community
ArXiv
PDF
HTML
Papers citing
"ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"
50 / 2,912 papers shown
Title
DynRank: Improving Passage Retrieval with Dynamic Zero-Shot Prompting Based on Question Classification
Abdelrahman Abdallah
Jamshid Mozafari
Bhawna Piryani
Mohammed M. Abdelgwad
Adam Jatowt
74
1
0
30 Nov 2024
Turing Representational Similarity Analysis (RSA): A Flexible Method for Measuring Alignment Between Human and Artificial Intelligence
Mattson Ogg
Ritwik Bose
Jamie Scharf
Christopher Ratto
Michael Wolmetz
91
2
0
30 Nov 2024
Can bidirectional encoder become the ultimate winner for downstream applications of foundation models?
Lewen Yang
Xuanyu Zhou
Juao Fan
Xinyi Xie
Shengxin Zhu
AI4CE
64
0
0
27 Nov 2024
What Differentiates Educational Literature? A Multimodal Fusion Approach of Transformers and Computational Linguistics
Jordan J. Bird
68
0
0
26 Nov 2024
Dynamic Self-Distillation via Previous Mini-batches for Fine-tuning Small Language Models
Y. Fu
Yin Yu
Xiaotian Han
Runchao Li
Xianxuan Long
Haotian Yu
Pan Li
SyDa
67
0
0
25 Nov 2024
Profiling Bias in LLMs: Stereotype Dimensions in Contextual Word Embeddings
Carolin M. Schuster
Maria-Alexandra Dinisor
Shashwat Ghatiwala
Georg Groh
79
1
0
25 Nov 2024
A Comparative Analysis of Transformer and LSTM Models for Detecting Suicidal Ideation on Reddit
Khalid Hasan
Jamil Saquer
AI4MH
70
1
0
23 Nov 2024
FLARE: FP-Less PTQ and Low-ENOB ADC Based AMS-PiM for Error-Resilient, Fast, and Efficient Transformer Acceleration
Donghyeon Yi
Seoyoung Lee
Jongho Kim
Junyoung Kim
Sohmyung Ha
Ik Joon Chang
Minkyu Je
72
0
0
22 Nov 2024
BERT-Based Approach for Automating Course Articulation Matrix Construction with Explainable AI
Natenaile Asmamaw Shiferaw
Simpenzwe Honore Leandre
Aman Sinha
Dillip Rout
60
0
0
21 Nov 2024
Forecasting Future International Events: A Reliable Dataset for Text-Based Event Modeling
Daehoon Gwak
Junwoo Park
Minho Park
C. Park
Hyunchan Lee
E. Choi
Jaegul Choo
76
0
0
21 Nov 2024
Mitigating Gender Bias in Contextual Word Embeddings
Navya Yarrabelly
Vinay Damodaran
Feng-Guang Su
67
0
0
18 Nov 2024
New Emerged Security and Privacy of Pre-trained Model: a Survey and Outlook
Meng Yang
Tianqing Zhu
Chi Liu
Wanlei Zhou
Shui Yu
Philip S. Yu
AAML
ELM
PILM
61
1
0
12 Nov 2024
Clustering in Causal Attention Masking
Nikita Karagodin
Yury Polyanskiy
Philippe Rigollet
60
5
0
07 Nov 2024
PSformer: Parameter-efficient Transformer with Segment Attention for Time Series Forecasting
Yanlong Wang
J. Xu
Fei Ma
Shao-Lun Huang
Danny Dongning Sun
Xiao-Ping Zhang
AI4TS
45
1
0
03 Nov 2024
Human-inspired Perspectives: A Survey on AI Long-term Memory
Zihong He
Weizhe Lin
Hao Zheng
Fan Zhang
Matt Jones
Laurence Aitchison
X. Xu
Miao Liu
Per Ola Kristensson
Junxiao Shen
77
2
0
01 Nov 2024
ProTransformer: Robustify Transformers via Plug-and-Play Paradigm
Zhichao Hou
Weizhi Gao
Yuchen Shen
Feiyi Wang
Xiaorui Liu
VLM
30
2
0
30 Oct 2024
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
Sangmin Bae
Adam Fisch
Hrayr Harutyunyan
Ziwei Ji
Seungyeon Kim
Tal Schuster
KELM
81
5
0
28 Oct 2024
Ensembling Finetuned Language Models for Text Classification
Sebastian Pineda Arango
Maciej Janowski
Lennart Purucker
Arber Zela
Frank Hutter
Josif Grabocka
33
0
0
25 Oct 2024
Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques
David Ortiz-Perez
Manuel Benavent-Lledo
José García Rodríguez
David Tomás
M. Flores Vizcaya-Moreno
34
0
0
24 Oct 2024
MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers
Zebin Yang
Renze Chen
Taiqiang Wu
Ngai Wong
Yun Liang
Runsheng Wang
R. Huang
Meng Li
MQ
27
1
0
23 Oct 2024
Quantifying the Risks of Tool-assisted Rephrasing to Linguistic Diversity
Mengying Wang
Andreas Spitz
23
0
0
23 Oct 2024
Acoustic Model Optimization over Multiple Data Sources: Merging and Valuation
Victor Junqiu Wei
Weicheng Wang
Di Jiang
Conghui Tan
Rongzhong Lian
MoMe
30
0
0
21 Oct 2024
Causality for Large Language Models
Anpeng Wu
Kun Kuang
Minqin Zhu
Yingrong Wang
Yujia Zheng
Kairong Han
Yangqiu Song
Guangyi Chen
Fei Wu
Anton van den Hengel
LRM
46
7
0
20 Oct 2024
Fine-Tuning Pre-trained Language Models for Robust Causal Representation Learning
Jialin Yu
Yuxiang Zhou
Yulan He
Nevin L. Zhang
Ricardo Silva
33
0
0
18 Oct 2024
Pseudo-label Refinement for Improving Self-Supervised Learning Systems
Zia-ur-Rehman
Arif Mahmood
Wenxiong Kang
33
1
0
18 Oct 2024
From Babbling to Fluency: Evaluating the Evolution of Language Models in Terms of Human Language Acquisition
Qiyuan Yang
Pengda Wang
Luke D. Plonsky
Frederick L. Oswald
Hanjie Chen
ELM
23
2
0
17 Oct 2024
Unitary Multi-Margin BERT for Robust Natural Language Processing
Hao-Yuan Chang
Kang L. Wang
AAML
21
0
0
16 Oct 2024
FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression
Zhenheng Tang
Xueze Kang
Yiming Yin
Xinglin Pan
Yuxin Wang
...
Shaohuai Shi
Amelie Chi Zhou
Bo Li
Bingsheng He
Xiaowen Chu
AI4CE
78
8
0
16 Oct 2024
Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models
Kai Yao
P. Gao
Lichun Li
Yuan Zhao
Xiaofeng Wang
Wei Wang
Jianke Zhu
26
1
0
15 Oct 2024
TSDS: Data Selection for Task-Specific Model Finetuning
Zifan Liu
Amin Karbasi
Theodoros Rekatsinas
31
4
0
15 Oct 2024
Lambda-Skip Connections: the architectural component that prevents Rank Collapse
Federico Arangath Joseph
Jerome Sieber
M. Zeilinger
Carmen Amo Alonso
33
0
0
14 Oct 2024
Arrhythmia Classification Using Graph Neural Networks Based on Correlation Matrix
Seungwoo Han
18
0
0
14 Oct 2024
Mental Disorders Detection in the Era of Large Language Models
Gleb Kuzmin
Petr Strepetov
Maksim Stankevich
Artem Shelmanov
Ivan Smirnov
31
1
0
09 Oct 2024
Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy
Tong Wu
Shujian Zhang
Kaiqiang Song
Silei Xu
Sanqiang Zhao
Ravi Agrawal
Sathish Indurthi
Chong Xiang
Prateek Mittal
Wenxuan Zhou
42
8
0
09 Oct 2024
Towards the generation of hierarchical attack models from cybersecurity vulnerabilities using language models
Kacper Sowka
Vasile Palade
Xiaorui Jiang
Hesam Jadidbonab
21
1
0
07 Oct 2024
Computational design of target-specific linear peptide binders with TransformerBeta
Haowen Zhao
Francesco A. Aprile
Barbara Bravi
31
0
0
07 Oct 2024
Dynamic Post-Hoc Neural Ensemblers
Sebastian Pineda Arango
Maciej Janowski
Lennart Purucker
Arber Zela
Frank Hutter
Josif Grabocka
UQCV
36
0
0
06 Oct 2024
Variational Language Concepts for Interpreting Foundation Language Models
Hengyi Wang
Shiwei Tan
Zhiqing Hong
Desheng Zhang
Hao Wang
34
3
0
04 Oct 2024
Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding
Wei Yu Wu
Chao Wang
Liyi Chen
Mingze Yin
Yiheng Zhu
Kun Fu
Jieping Ye
Hui Xiong
Zheng Wang
30
1
0
04 Oct 2024
Demystifying the Token Dynamics of Deep Selective State Space Models
Thieu N. Vo
Tung D. Pham
Xin T. Tong
Tan Minh Nguyen
Mamba
49
0
0
04 Oct 2024
Geometry is All You Need: A Unified Taxonomy of Matrix and Tensor Factorization for Compression of Generative Language Models
Mingxue Xu
Sadia Sharmin
Danilo P. Mandic
29
2
0
03 Oct 2024
Morphological evaluation of subwords vocabulary used by BETO language model
Óscar García-Sierra
Ana Fernández-Pampillón Cesteros
Miguel Ortega-Martín
36
0
0
03 Oct 2024
DeIDClinic: A Multi-Layered Framework for De-identification of Clinical Free-text Data
Angel Paul
Dhivin Shaji
Lifeng Han
Warren Del-Pinto
Goran Nenadic
OOD
34
0
0
02 Oct 2024
DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models
Yuxuan Zhang
Ruizhe Li
MoMe
55
0
0
02 Oct 2024
On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding
Kevin Xu
Issei Sato
39
3
0
02 Oct 2024
Depression detection in social media posts using transformer-based models and auxiliary features
Marios Kerasiotis
Loukas Ilias
D. Askounis
23
4
0
30 Sep 2024
FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models
Yucheng Xie
Fu Feng
Ruixiao Shi
Jing Wang
Xin Geng
AI4CE
36
2
0
28 Sep 2024
On the Inductive Bias of Stacking Towards Improving Reasoning
Nikunj Saunshi
Stefani Karp
Shankar Krishnan
Sobhan Miryoosefi
Sashank J. Reddi
Sanjiv Kumar
LRM
AI4CE
36
4
0
27 Sep 2024
Meta-RTL: Reinforcement-Based Meta-Transfer Learning for Low-Resource Commonsense Reasoning
Yu Fu
Jie He
Yifan Yang
Qun Liu
Deyi Xiong
OffRL
LRM
45
0
0
27 Sep 2024
DisGeM: Distractor Generation for Multiple Choice Questions with Span Masking
Devrim Cavusoglu
Secil Sen
Ulas Sert
29
0
0
26 Sep 2024
Previous
1
2
3
4
5
6
...
57
58
59
Next