
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019

ArXiv (abs) | PDF | HTML | GitHub (3271★)

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,935 papers shown

Title
Unitary Multi-Margin BERT for Robust Natural Language Processing Hao-Yuan Chang Kang L. Wang AAML 49 0 0 16 Oct 2024
FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression Zhenheng Tang Xueze Kang Yiming Yin Xinglin Pan Yuxin Wang ... Shaohuai Shi Amelie Chi Zhou Bo Li Bingsheng He Xiaowen Chu AI4CE 109 8 0 16 Oct 2024
Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models Kai Yao P. Gao Lichun Li Yuan Zhao Xiaofeng Wang Wei Wang Jianke Zhu 56 2 0 15 Oct 2024
TSDS: Data Selection for Task-Specific Model Finetuning Zifan Liu Amin Karbasi Theodoros Rekatsinas 78 6 0 15 Oct 2024
Arrhythmia Classification Using Graph Neural Networks Based on Correlation Matrix Seungwoo Han 72 0 0 14 Oct 2024
Lambda-Skip Connections: the architectural component that prevents Rank Collapse Federico Arangath Joseph Jerome Sieber Melanie Zeilinger Carmen Amo Alonso 222 0 0 14 Oct 2024
Mental Disorders Detection in the Era of Large Language Models Gleb Kuzmin Petr Strepetov Maksim Stankevich Artem Shelmanov Ivan Smirnov 43 1 0 09 Oct 2024
Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy Tong Wu Shujian Zhang Kaiqiang Song Silei Xu Sanqiang Zhao Ravi Agrawal Sathish Indurthi Chong Xiang Prateek Mittal Wenxuan Zhou 112 14 0 09 Oct 2024
Towards the generation of hierarchical attack models from cybersecurity vulnerabilities using language models Kacper Sowka Vasile Palade Xiaorui Jiang Hesam Jadidbonab 87 1 0 07 Oct 2024
Computational design of target-specific linear peptide binders with TransformerBeta Haowen Zhao Francesco A. Aprile Barbara Bravi 77 0 0 07 Oct 2024
Regularized Neural Ensemblers Sebastian Pineda Arango Maciej Janowski Lennart Purucker Arber Zela Frank Hutter Josif Grabocka UQCV 95 0 0 06 Oct 2024
Variational Language Concepts for Interpreting Foundation Language Models Hengyi Wang Shiwei Tan Zhiqing Hong Desheng Zhang Hao Wang 144 3 0 04 Oct 2024
Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding with LLMs Wei Wu Chao Wang L. Chen Mingze Yin Yiheng Zhu Kun Fu Jieping Ye Hui Xiong Zheng Wang 143 1 0 04 Oct 2024
Demystifying the Token Dynamics of Deep Selective State Space Models Thieu N. Vo Tung D. Pham Xin T. Tong Tan Minh Nguyen Mamba 131 0 0 04 Oct 2024
Geometry is All You Need: A Unified Taxonomy of Matrix and Tensor Factorization for Compression of Generative Language Models Mingxue Xu Sadia Sharmin Danilo Mandic 72 2 0 03 Oct 2024
Morphological evaluation of subwords vocabulary used by BETO language model Óscar García-Sierra Ana Fernández-Pampillón Cesteros Miguel Ortega-Martín 61 0 0 03 Oct 2024
DeIDClinic: A Multi-Layered Framework for De-identification of Clinical Free-text Data Angel Paul Dhivin Shaji Lifeng Han Warren Del-Pinto Goran Nenadic OOD 62 0 0 02 Oct 2024
DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models Yuxuan Zhang Ruizhe Li MoMe 199 2 0 02 Oct 2024
On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding Kevin Xu Issei Sato 120 4 0 02 Oct 2024
Depression detection in social media posts using transformer-based models and auxiliary features Marios Kerasiotis Loukas Ilias D. Askounis 79 6 0 30 Sep 2024
FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models Yucheng Xie Fu Feng Ruixiao Shi Jing Wang Xin Geng AI4CE 70 3 0 28 Sep 2024
On the Inductive Bias of Stacking Towards Improving Reasoning Nikunj Saunshi Stefani Karp Shankar Krishnan Sobhan Miryoosefi Sashank J. Reddi Sanjiv Kumar LRM AI4CE 86 7 0 27 Sep 2024
Meta-RTL: Reinforcement-Based Meta-Transfer Learning for Low-Resource Commonsense Reasoning Yu Fu Jie He Yifan Yang Qun Liu Deyi Xiong OffRL LRM 105 0 0 27 Sep 2024
DisGeM: Distractor Generation for Multiple Choice Questions with Span Masking Devrim Cavusoglu Secil Sen Ulas Sert 63 0 0 26 Sep 2024
Integrating Hierarchical Semantic into Iterative Generation Model for Entailment Tree Explanation Qin Wang Jianzhou Feng Yiming Xu 55 0 0 26 Sep 2024
SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion Ming Dai Lingfeng Yang Yihao Xu Zhenhua Feng Wankou Yang ObjD 125 13 0 26 Sep 2024
Pre-trained Language Models Return Distinguishable Probability Distributions to Unfaithfully Hallucinated Texts Taehun Cha Donghun Lee HILM 62 1 0 25 Sep 2024
dnaGrinder: a lightweight and high-capacity genomic foundation model Qihang Zhao Chi Zhang Weixiong Zhang 53 0 0 24 Sep 2024
ToxiCraft: A Novel Framework for Synthetic Generation of Harmful Information Zheng Hui Zhaoxiao Guo Hang Zhao Juanyong Duan Congrui Huang 145 7 0 23 Sep 2024
Data-centric NLP Backdoor Defense from the Lens of Memorization Zhenting Wang Zhizhi Wang Mingyu Jin Mengnan Du Juan Zhai Shiqing Ma 86 3 0 21 Sep 2024
Normalized Narrow Jump To Conclusions: Normalized Narrow Shortcuts for Parameter Efficient Early Exit Transformer Prediction Amrit Diggavi Seshadri 50 1 0 21 Sep 2024
FAMOUS: Flexible Accelerator for the Attention Mechanism of Transformer on UltraScale+ FPGAs Ehsan Kabir Md. Arafat Kabir Austin R. J. Downey Jason D. Bakos David Andrews Miaoqing Huang GNN 66 0 0 21 Sep 2024
Profiling Patient Transcript Using Large Language Model Reasoning Augmentation for Alzheimer's Disease Detection Chin-Po Chen Jeng-Lin Li LM&MA 28 1 0 19 Sep 2024
Evaluation of pretrained language models on music understanding Yannis Vasilakis Rachel M. Bittner Johan Pauwels 99 1 0 17 Sep 2024
OneEncoder: A Lightweight Framework for Progressive Alignment of Modalities Bilal Faye Hanane Azzag M. Lebbah ObjD 105 0 0 17 Sep 2024
Towards Data-Centric RLHF: Simple Metrics for Preference Dataset Comparison Judy Hanwen Shen Archit Sharma Jun Qin 70 5 0 15 Sep 2024
Deep Fast Machine Learning Utils: A Python Library for Streamlined Machine Learning Prototyping Fabi Prezja AI4CE 73 0 0 14 Sep 2024
Multi-intent Aware Contrastive Learning for Sequential Recommendation Junshu Huang Zi Long Xianghua Fu Yin Chen HAI 34 0 0 13 Sep 2024
A BERT-Based Summarization approach for depression detection Hossein Salahshoor Gavalan Mohmmad Naim Rastgoo Bahareh Nakisa 63 2 0 13 Sep 2024
TheraGen: Therapy for Every Generation Kartikey Doshi Jimit Shah Narendra Shekokar AI4MH 53 0 0 12 Sep 2024
Enhancing adversarial robustness in Natural Language Inference using explanations Alexandros Koulakos Maria Lymperaiou Giorgos Filandrianos Giorgos Stamou SILM AAML 132 2 0 11 Sep 2024
DA-MoE: Towards Dynamic Expert Allocation for Mixture-of-Experts Models Maryam Akhavan Aghdam Hongpeng Jin Yanzhao Wu MoE 63 3 0 10 Sep 2024
DetoxBench: Benchmarking Large Language Models for Multitask Fraud & Abuse Detection Joymallya Chakraborty Wei Xia Anirban Majumder Dan Ma Walid Chaabene Naveed Janvekar 49 3 0 09 Sep 2024
Application Specific Compression of Deep Learning Models Rohit Raj Rai Angana Borah Amit Awekar 48 0 0 09 Sep 2024
Driving with Prior Maps: Unified Vector Prior Encoding for Autonomous Vehicle Mapping Shuang Zeng Xinyuan Chang Xinran Liu Zheng Pan Xing Wei 126 3 0 09 Sep 2024
Expanding Expressivity in Transformer Models with MöbiusAttention Anna-Maria Halacheva M. Nayyeri Steffen Staab 78 1 0 08 Sep 2024
Achieving Peak Performance for Large Language Models: A Systematic Review Z. R. K. Rostam Sándor Szénási Gábor Kertész 95 5 0 07 Sep 2024
An Effective Deployment of Diffusion LM for Data Augmentation in Low-Resource Sentiment Classification Zhuowei Chen Lianxi Wang Yuben Wu Xinfeng Liao Yujia Tian Junyang Zhong DiffM 107 1 0 05 Sep 2024
Pre-Trained Language Models for Keyphrase Prediction: A Review Muhammad Umair Tangina Sultana Young-Koo Lee 80 4 0 02 Sep 2024
From Prediction to Application: Language Model-based Code Knowledge Tracing with Domain Adaptive Pre-Training and Automatic Feedback System with Pedagogical Prompting for Comprehensive Programming Education Unggi Lee Jiyeong Bae Yeonji Jung Minji Kang Gyuri Byun ... Sookbun Lee Jaekwon Park Taekyung Ahn Gunho Lee Hyeoncheol Kim AI4Ed KELM 83 1 0 31 Aug 2024