1909.11942
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
Github (3271★)
Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"
50 / 2,935 papers shown
Title
The emergence of clusters in self-attention dynamics
Borjan Geshkovski
Cyril Letrouit
Yury Polyanskiy
Philippe Rigollet
111
56
0
09 May 2023
Alleviating Over-smoothing for Unsupervised Sentence Representation
Nuo Chen
Linjun Shou
Ming Gong
Jian Pei
Bowen Cao
Jianhui Chang
Daxin Jiang
Jia Li
SSL
71
18
0
09 May 2023
Who Needs Decoders? Efficient Estimation of Sequence-level Attributes
Yassir Fathullah
Puria Radmard
Adian Liusie
Mark Gales
OODD
70
1
0
09 May 2023
ANALOGICAL -- A Novel Benchmark for Long Text Analogy Evaluation in Large Language Models
Thilini Wijesiriwardene
Ruwan Wickramarachchi
Bimal Gajera
Shreeyash Mukul Gowaikar
Chandan Gupta
Aman Chadha
Aishwarya N. Reganti
Amit P. Sheth
Amitava Das
ELM
81
14
0
08 May 2023
The EarlyBIRD Catches the Bug: On Exploiting Early Layers of Encoder Models for More Efficient Code Classification
Anastasiia Grishina
Max Hort
Leon Moonen
62
6
0
08 May 2023
Toward Adversarial Training on Contextualized Language Representation
Hongqiu Wu
Yang Liu
Han Shi
Haizhen Zhao
Hao Fei
AAML
49
14
0
08 May 2023
RFR-WWANet: Weighted Window Attention-Based Recovery Feature Resolution Network for Unsupervised Image Registration
Mingrui Ma
Tao Wang
Lei Song
Weijie Wang
Gui-Xian Liu
ViT
MedIm
51
2
0
07 May 2023
A Minimal Approach for Natural Language Action Space in Text-based Games
Dongwon Kelvin Ryu
Meng Fang
Shirui Pan
Gholamreza Haffari
Ehsan Shareghi
LLMAG
71
2
0
06 May 2023
Transformer Working Memory Enables Regular Language Reasoning and Natural Language Length Extrapolation
Ta-Chung Chi
Ting-Han Fan
Alexander I. Rudnicky
Peter J. Ramadge
LRM
75
13
0
05 May 2023
LMs stand their Ground: Investigating the Effect of Embodiment in Figurative Language Interpretation by Language Models
Philipp Wicke
53
4
0
05 May 2023
Efficient k-NN Search with Cross-Encoders using Adaptive Multi-Round CUR Decomposition
Nishant Yadav
Nicholas Monath
Manzil Zaheer
Andrew McCallum
55
1
0
04 May 2023
Masked Structural Growth for 2x Faster Language Model Pre-training
Yiqun Yao
Zheng Zhang
Jing Li
Yequan Wang
OffRL
AI4CE
LRM
95
16
0
04 May 2023
Cuttlefish: Low-Rank Model Training without All the Tuning
Hongyi Wang
Saurabh Agarwal
Pongsakorn U-chupala
Yoshiki Tanaka
Eric P. Xing
Dimitris Papailiopoulos
OffRL
154
23
0
04 May 2023
PTP: Boosting Stability and Performance of Prompt Tuning with Perturbation-Based Regularizer
Lichang Chen
Heng-Chiao Huang
Varun Madhavan
AAML
176
12
0
03 May 2023
GPT-RE: In-context Learning for Relation Extraction using Large Language Models
Michele Focchi
Fei Cheng
Zhuoyuan Mao
Qianying Liu
Haiyue Song
Jiwei Li
Sadao Kurohashi
LRM
117
94
0
03 May 2023
Pre-train and Search: Efficient Embedding Table Sharding with Pre-trained Neural Cost Models
Daochen Zha
Louis Feng
Liangchen Luo
Bhargav Bhushanam
Zirui Liu
...
J. McMahon
Yuzhen Huang
Bryan Clarke
A. Kejariwal
Helen Zhou
113
7
0
03 May 2023
A Data-Driven Approach for Finding Requirements Relevant Feedback from TikTok and YouTube
Manish Sihag
Ze Shi Li
Amanda Dash
Nowshin Nawar Arony
Kezia Devathasan
Neil A. Ernst
A. Albu
D. Damian
43
7
0
02 May 2023
BrainNPT: Pre-training of Transformer networks for brain network classification
Jinlong Hu
Ya-Lin Huang
Nan Wang
Shoubin Dong
ViT
MedIm
96
8
0
02 May 2023
SLSG: Industrial Image Anomaly Detection by Learning Better Feature Embeddings and One-Class Classification
Minghui Yang
Jing Liu
Zhiwei Yang
Zhaoyang Wu
71
10
0
30 Apr 2023
Empirical Analysis of the Strengths and Weaknesses of PEFT Techniques for LLMs
George Pu
Anirudh Jain
Jihan Yin
Russell Kaplan
75
43
0
28 Apr 2023
Information Redundancy and Biases in Public Document Information Extraction Benchmarks
S. Laatiri
Pirashanth Ratnamogan
Joel Tang
Laurent Lam
William Vanhuffel
Fabien Caspani
35
1
0
28 Apr 2023
ResiDual: Transformer with Dual Residual Connections
Shufang Xie
Huishuai Zhang
Junliang Guo
Xu Tan
Jiang Bian
Hany Awadalla
Arul Menezes
Tao Qin
Rui Yan
99
19
0
28 Apr 2023
SweCTRL-Mini: a data-transparent Transformer-based large language model for controllable text generation in Swedish
Dmytro Kalpakchi
Johan Boye
SyDa
49
3
0
27 Apr 2023
Towards ethical multimodal systems
Alexis Roger
Esma Aïmeur
Irina Rish
54
3
0
26 Apr 2023
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
Jingfeng Yang
Hongye Jin
Ruixiang Tang
Xiaotian Han
Qizhang Feng
Haoming Jiang
Bing Yin
Helen Zhou
LM&MA
214
682
0
26 Apr 2023
PEFT-Ref: A Modular Reference Architecture and Typology for Parameter-Efficient Finetuning Techniques
Mohammed Sabry
Anya Belz
104
8
0
24 Apr 2023
Semantic Tokenizer for Enhanced Natural Language Processing
Sandeep Mehta
Darpan Shah
Ravindra Kulkarni
Cornelia Caragea
VLM
26
3
0
24 Apr 2023
Hierarchical Contrastive Learning Enhanced Heterogeneous Graph Neural Network
Nian Liu
Tianlin Li
Hui-jun Han
Chuan Shi
SSL
39
5
0
24 Apr 2023
Pre-trained Embeddings for Entity Resolution: An Experimental Analysis [Experiment, Analysis & Benchmark]
Alexandros Zeakis
G. Papadakis
Dimitrios Skoutas
Manolis Koubarakis
78
39
0
24 Apr 2023
IslamicPCQA: A Dataset for Persian Multi-hop Complex Question Answering in Islamic Text Resources
A. Ghafouri
Hasan Naderi
MohammadMahdi Aghajani
Mahdi Firouzmandi
58
2
0
23 Apr 2023
Differentiate ChatGPT-generated and Human-written Medical Texts
Wenxiong Liao
Zheng Liu
Haixing Dai
Shaochen Xu
Zihao Wu
...
Xiaoke Huang
Dajiang Zhu
Hongmin Cai
Tianming Liu
Xiang Li
LM&MA
DeLMO
MedIm
AI4MH
57
60
0
23 Apr 2023
Processing Natural Language on Embedded Devices: How Well Do Transformer Models Perform?
Souvik Sarkar
Mohammad Fakhruddin Babar
Md. Mahadi Hassan
M. Hasan
Shubhra (Santu) Karmaker
32
1
0
23 Apr 2023
GPT-NER: Named Entity Recognition via Large Language Models
Shuhe Wang
Xiaofei Sun
Xiaoya Li
Rongbin Ouyang
Leilei Gan
Tianwei Zhang
Jiwei Li
Guoyin Wang
106
201
0
20 Apr 2023
Clover: Toward Sustainable AI with Carbon-Aware Machine Learning Inference Service
Baolin Li
S. Samsi
V. Gadepally
Devesh Tiwari
69
32
0
19 Apr 2023
SemEval 2023 Task 6: LegalEval - Understanding Legal Texts
Ashutosh Modi
Prathamesh Kalamkar
S. Karn
Aman Tiwari
Abhinav Joshi
Sai Kiran Tanikella
S. Guha
Sachin Malhan
Vivek Raghavan
ELM
AILaw
55
42
0
19 Apr 2023
MixPro: Simple yet Effective Data Augmentation for Prompt-based Learning
Bohan Li
Longxu Dou
Yutai Hou
Yunlong Feng
Honglin Mu
Qingfu Zhu
Qinghua Sun
Wanxiang Che
VLM
74
4
0
19 Apr 2023
Shuffle & Divide: Contrastive Learning for Long Text
Joonseok Lee
Seongho Joe
Kyoungwon Park
Bogun Kim
Ho. Kang
Jaeseon Park
Youngjune Gwon
42
0
0
19 Apr 2023
Exploring the Trade-Offs: Unified Large Language Models vs Local Fine-Tuned Models for Highly-Specific Radiology NLI Task
Zihao Wu
Lu Zhang
Chao-Yang Cao
Xiao-Xing Yu
Haixing Dai
...
Quanzheng Li
Dinggang Shen
Xiang Li
Dajiang Zhu
Tianming Liu
LM&MA
66
39
0
18 Apr 2023
MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning
Zheng Lian
Haiyang Sun
Guoying Zhao
Kang Chen
Mingyu Xu
...
Meng Wang
Min Zhang
Guoying Zhao
Björn W. Schuller
Jianhua Tao
96
51
0
18 Apr 2023
Masked Language Model Based Textual Adversarial Example Detection
Xiaomei Zhang
Zhaoxi Zhang
Qi Zhong
Xufei Zheng
Yanjun Zhang
Shengshan Hu
L. Zhang
AAML
101
2
0
18 Apr 2023
A Survey for Biomedical Text Summarization: From Pre-trained to Large Language Models
Qianqian Xie
Zheheng Luo
Benyou Wang
Sophia Ananiadou
LM&MA
VLM
53
11
0
18 Apr 2023
Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation
Yunjie Ji
Yan Gong
Yong Deng
Yiping Peng
Qiang Niu
Baochang Ma
Xiangang Li
ALM
ELM
102
25
0
16 Apr 2023
MisRoBÆRTa: Transformers versus Misinformation
Ciprian-Octavian Truică
Elena Simona Apostol
55
39
0
16 Apr 2023
STen: Productive and Efficient Sparsity in PyTorch
Andrei Ivanov
Nikoli Dryden
Tal Ben-Nun
Saleh Ashkboos
Torsten Hoefler
64
4
0
15 Apr 2023
Learn What Is Possible, Then Choose What Is Best: Disentangling One-To-Many Relations in Language Through Text-based Games
Benjamin Towle
Ke Zhou
SyDa
56
4
0
14 Apr 2023
TimelyFL: Heterogeneity-aware Asynchronous Federated Learning with Adaptive Partial Training
Tuo Zhang
Lei Gao
Sunwoo Lee
Mi Zhang
Salman Avestimehr
FedML
96
30
0
14 Apr 2023
Evaluation of Social Biases in Recent Large Pre-Trained Models
Swapnil Sharma
Nikita Anand
V. KranthiKiranG.
Alind Jain
50
0
0
13 Apr 2023
In-Distribution and Out-of-Distribution Self-supervised ECG Representation Learning for Arrhythmia Detection
S. Soltanieh
J. Hashemi
Ali Etemad
85
12
0
13 Apr 2023
MoMo: A shared encoder Model for text, image and multi-Modal representations
Rakesh Chada
Zhao-Heng Zheng
P. Natarajan
ViT
59
4
0
11 Apr 2023
Neural Delay Differential Equations: System Reconstruction and Image Classification
Qunxi Zhu
Yao Guo
Wei Lin
71
33
0
11 Apr 2023