
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019


Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,935 papers shown

BERT Lost Patience Won't Be Robust to Adversarial Slowdown Zachary Coalson Gabriel Ritter Rakesh Bobba Sanghyun Hong AAML 47 2 0 29 Oct 2023
Stacking the Odds: Transformer-Based Ensemble for AI-Generated Text Detection Duke Nguyen Khaing Myat Noe Naing Aditya Joshi 61 7 0 29 Oct 2023
Multi-grained Evidence Inference for Multi-choice Reading Comprehension Yilin Zhao Hai Zhao Sufeng Duan 62 2 0 27 Oct 2023
Outlier Dimensions Encode Task-Specific Knowledge William Rudman Catherine Chen Carsten Eickhoff 65 5 0 26 Oct 2023
PETA: Evaluating the Impact of Protein Transfer Learning with Sub-word Tokenization on Downstream Applications Yang Tan Mingchen Li P. Tan Ziyi Zhou Huiqun Yu Guisheng Fan Liang Hong 65 0 0 26 Oct 2023
Understanding the Role of Input Token Characters in Language Models: How Does Information Loss Affect Performance? Ahmed Alajrami Katerina Margatina Nikolaos Aletras AAML 65 1 0 26 Oct 2023
Joint Entity and Relation Extraction with Span Pruning and Hypergraph Neural Networks Zhaohui Yan Aaron Courville Wei Liu Kewei Tu 109 13 0 26 Oct 2023
Apollo: Zero-shot MultiModal Reasoning with Multiple Experts Daniela Ben-David Tzuf Paz-Argaman Reut Tsarfaty MoE 68 0 0 25 Oct 2023
Kiki or Bouba? Sound Symbolism in Vision-and-Language Models Morris Alper Hadar Averbuch-Elor 110 10 0 25 Oct 2023
FedTherapist: Mental Health Monitoring with User-Generated Linguistic Expressions on Smartphones via Federated Learning Jaemin Shin Hyungjun Yoon Seungjoo Lee Sungjoon Park Yunxin Liu Jinho D. Choi Sung-Ju Lee 70 6 0 25 Oct 2023
Subspace Chronicles: How Linguistic Information Emerges, Shifts and Interacts during Language Model Training Max Müller-Eberstein Rob van der Goot Barbara Plank Ivan Titov 129 10 0 25 Oct 2023
URL-BERT: Training Webpage Representations via Social Media Engagements A. Qamar Chetan Verma Ahmed El-Kishky Sumit Binnani Sneha Mehta Taylor Berg-Kirkpatrick 60 0 0 25 Oct 2023
CR-COPEC: Causal Rationale of Corporate Performance Changes to Learn from Financial Reports Ye Eun Chun Sunjae Kwon Kyung-Woo Sohn Nakwon Sung Junyoup Lee Byungki Seo Kevin Compher Seung-won Hwang Jaesik Choi 76 1 0 24 Oct 2023
Retrieval-based Knowledge Transfer: An Effective Approach for Extreme Large Language Model Compression Jiduan Liu Jiahao Liu Qifan Wang Jingang Wang Xunliang Cai Dongyan Zhao Ran Wang Rui Yan 61 4 0 24 Oct 2023
TRAMS: Training-free Memory Selection for Long-range Language Modeling Haofei Yu Cunxiang Wang Yue Zhang Wei Bi RALM 100 6 0 24 Oct 2023
CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model Kaiyan Zhang Ning Ding Biqing Qi Xuekai Zhu Xinwei Long Bowen Zhou 95 5 0 24 Oct 2023
PartialFormer: Modeling Part Instead of Whole for Machine Translation Tong Zheng Bei Li Huiwen Bao Jiale Wang Weiqiao Shan Tong Xiao Jingbo Zhu MoE AI4CE 45 0 0 23 Oct 2023
Unveiling the Multi-Annotation Process: Examining the Influence of Annotation Quantity and Instance Difficulty on Model Performance Pritam Kadasi Mayank Singh 57 3 0 23 Oct 2023
PromptCBLUE: A Chinese Prompt Tuning Benchmark for the Medical Domain Wei-wei Zhu Xiaoling Wang Huanran Zheng Mosha Chen Buzhou Tang ELM LM&MA 69 36 0 22 Oct 2023
Transductive Learning for Textual Few-Shot Classification in API-based Embedding Models Pierre Colombo Victor Pellegrain Malik Boudiaf Victor Storchan Myriam Tami Ismail Ben Ayed Céline Hudelot Pablo Piantanida 99 8 0 21 Oct 2023
A Novel Information-Theoretic Objective to Disentangle Representations for Fair Classification Pierre Colombo Nathan Noiry Guillaume Staerman Pablo Piantanida FaML DRL 78 1 0 21 Oct 2023
Plausibility Processing in Transformer Language Models: Focusing on the Role of Attention Heads in GPT Soo Hyun Ryu 45 0 0 20 Oct 2023
The Less the Merrier? Investigating Language Representation in Multilingual Models H. Nigatu A. Tonja Jugal Kalita 76 1 0 20 Oct 2023
Unsupervised Candidate Answer Extraction through Differentiable Masker-Reconstructor Model Zhuoer Wang Yicheng Wang Ziwei Zhu James Caverlee 80 0 0 19 Oct 2023
A Predictive Factor Analysis of Social Biases and Task-Performance in Pretrained Masked Language Models Yi Zhou Jose Camacho-Collados Danushka Bollegala 153 6 0 19 Oct 2023
Boosting Inference Efficiency: Unleashing the Power of Parameter-Shared Pre-trained Language Models Weize Chen Xiaoyue Xu Xu Han Yankai Lin Ruobing Xie Zhiyuan Liu Maosong Sun Jie Zhou 39 0 0 19 Oct 2023
Character-level Chinese Backpack Language Models Hao Sun John Hewitt 59 0 0 19 Oct 2023
Time-Aware Representation Learning for Time-Sensitive Question Answering Jungbin Son Alice Oh 70 6 0 19 Oct 2023
Pretraining Language Models with Text-Attributed Heterogeneous Graphs Tao Zou Le Yu Yifei Huang Leilei Sun Bo Du AI4CE 62 17 0 19 Oct 2023
DepWiGNN: A Depth-wise Graph Neural Network for Multi-hop Spatial Reasoning in Text Shuaiyi Li Yang Deng Wai Lam 93 2 0 19 Oct 2023
SPEED: Speculative Pipelined Execution for Efficient Decoding Coleman Hooper Sehoon Kim Hiva Mohammadzadeh Hasan Genç Kurt Keutzer A. Gholami Y. Shao 77 41 0 18 Oct 2023
DesignQuizzer: A Community-Powered Conversational Agent for Learning Visual Design Zhenhui Peng Qiaoyi Chen Zhiyu Shen Xiaojuan Ma Antti Oulasvirta 54 5 0 18 Oct 2023
Improving Long Document Topic Segmentation Models With Enhanced Coherence Modeling Hai Yu Chong Deng Qinglin Zhang Jiaqing Liu Qian Chen Wen Wang AI4TS 90 10 0 18 Oct 2023
Chain-of-Thought Tuning: Masked Language Models can also Think Step By Step in Natural Language Understanding Caoyun Fan Jidong Tian Yitian Li Wenqing Chen Hao He Yaohui Jin LRM 71 4 0 18 Oct 2023
Disentangling the Linguistic Competence of Privacy-Preserving BERT Stefan Arnold Nils Kemmerzell Annika Schreiner 77 0 0 17 Oct 2023
QADYNAMICS: Training Dynamics-Driven Synthetic QA Diagnostic for Zero-Shot Commonsense Question Answering Haochen Shi Weiqi Wang Tianqing Fang Baixuan Xu Wenxuan Ding Xin Liu Yangqiu Song 113 7 0 17 Oct 2023
Survey of Vulnerabilities in Large Language Models Revealed by Adversarial Attacks Erfan Shayegani Md Abdullah Al Mamun Yu Fu Pedram Zaree Yue Dong Nael B. Abu-Ghazaleh AAML 238 163 0 16 Oct 2023
PELA: Learning Parameter-Efficient Models with Low-Rank Approximation Yangyang Guo Guangzhi Wang Mohan S. Kankanhalli 38 3 0 16 Oct 2023
Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer Boan Liu Liang Ding Li Shen Keqin Peng Yu Cao Dazhao Cheng Dacheng Tao MoE 80 9 0 15 Oct 2023
CarExpert: Leveraging Large Language Models for In-Car Conversational Question Answering Md. Rony Christian Suess Sinchana Ramakanth Bhat Viju Sudhi Julia Schneider Maximilian Vogel Roman Teucher Ken E. Friedl S. Sahoo 71 11 0 14 Oct 2023
Low-Resource Clickbait Spoiling for Indonesian via Question Answering Ni Putu Intan Maharani Ayu Purwarianti Alham Fikri Aji 65 2 0 12 Oct 2023
To token or not to token: A Comparative Study of Text Representations for Cross-Lingual Transfer Md. Mushfiqur Rahman Fardin Ahsan Sakib Fahim Faisal Antonios Anastasopoulos 60 3 0 12 Oct 2023
Pit One Against Many: Leveraging Attention-head Embeddings for Parameter-efficient Multi-head Attention Huiyin Xue Nikolaos Aletras 102 0 0 11 Oct 2023
On the Relationship between Sentence Analogy Identification and Sentence Structure Encoding in Large Language Models Thilini Wijesiriwardene Ruwan Wickramarachchi Aishwarya N. Reganti Vinija Jain Aman Chadha Amit P. Sheth Amitava Das 56 1 0 11 Oct 2023
The Temporal Structure of Language Processing in the Human Brain Corresponds to The Layered Hierarchy of Deep Language Models Ariel Goldstein Eric Ham Mariano Schain Samuel A. Nastase Zaid Zada ... Avinatan Hassidim O. Devinsky A. Flinker Omer Levy Uri Hasson AI4CE 68 10 0 11 Oct 2023
Sparse Universal Transformer Shawn Tan Songlin Yang Zhenfang Chen Aaron Courville Chuang Gan MoE 82 15 0 11 Oct 2023
A Comparative Study of Transformer-based Neural Text Representation Techniques on Bug Triaging Atish Kumar Dipongkor Kevin Moran 23 8 0 10 Oct 2023
P5: Plug-and-Play Persona Prompting for Personalized Response Selection Joosung Lee Min Sik Oh Donghun Lee 62 3 0 10 Oct 2023
Model Tuning or Prompt Tuning? A Study of Large Language Models for Clinical Concept and Relation Extraction C.A.I. Peng Xi Yang Kaleb E. Smith Zehao Yu Aokun Chen Jiang Bian Yonghui Wu VLM LRM 85 32 0 10 Oct 2023
Evolution of Natural Language Processing Technology: Not Just Language Processing Towards General Purpose AI Masahiro Yamamoto 54 1 0 10 Oct 2023