2207.00112
Language model compression with weighted low-rank factorization
30 June 2022
Yen-Chang Hsu
Ting Hua
Sung-En Chang
Qiang Lou
Yilin Shen
Hongxia Jin
Papers citing "Language model compression with weighted low-rank factorization"
28 / 78 papers shown
Title
A phase transition between positional and semantic learning in a solvable model of dot-product attention
Hugo Cui
Freya Behrens
Florent Krzakala
Lenka Zdeborová
MLT
98
16
0
06 Feb 2024
A Comprehensive Survey of Compression Algorithms for Language Models
Seungcheol Park
Jaehyeon Choi
Sojin Lee
U. Kang
MQ
118
16
0
27 Jan 2024
Harnessing Orthogonality to Train Low-Rank Neural Networks
D. Coquelin
Katharina Flügel
Marie Weiel
Nicholas Kiefer
Charlotte Debus
Achim Streit
Markus Goetz
86
1
0
16 Jan 2024
Fast Inference of Mixture-of-Experts Language Models with Offloading
Artyom Eliseev
Denis Mazur
MoE
121
44
0
28 Dec 2023
Towards Message Brokers for Generative AI: Survey, Challenges, and Opportunities
Alaa Saleh
Roberto Morabito
Sasu Tarkoma
Susanna Pirttikangas
Lauri Lovén
120
4
0
22 Dec 2023
DSFormer: Effective Compression of Text-Transformers by Dense-Sparse Weight Factorization
Rahul Chand
Yashoteja Prabhu
Pratyush Kumar
64
3
0
20 Dec 2023
A Survey of Reasoning with Foundation Models
Jiankai Sun
Chuanyang Zheng
Enze Xie
Zhengying Liu
Ruihang Chu
...
Xipeng Qiu
Yi-Chen Guo
Hui Xiong
Qun Liu
Zhenguo Li
ReLM
LRM
AI4CE
209
85
0
17 Dec 2023
ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models
Zhihang Yuan
Yuzhang Shang
Yue Song
Qiang Wu
Yan Yan
Guangyu Sun
MQ
127
61
0
10 Dec 2023
LQ-LoRA: Low-rank Plus Quantized Matrix Decomposition for Efficient Language Model Finetuning
Han Guo
P. Greengard
Eric P. Xing
Yoon Kim
MQ
148
57
0
20 Nov 2023
Differentiable Learning of Generalized Structured Matrices for Efficient Deep Neural Networks
Changwoo Lee
Hun-Seok Kim
59
3
0
29 Oct 2023
Neural Language Model Pruning for Automatic Speech Recognition
Leonardo Emili
Thiago Fraga-Silva
Ernest Pusateri
M. Nußbaum-Thom
Youssef Oualil
83
1
0
05 Oct 2023
LORD: Low Rank Decomposition Of Monolingual Code LLMs For One-Shot Compression
Ayush Kaushal
Tejas Vaidhya
Irina Rish
129
16
0
25 Sep 2023
Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs
Or Sharir
Anima Anandkumar
60
0
0
27 Jul 2023
Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla
Tom Lieberum
Matthew Rahtz
János Kramár
Neel Nanda
G. Irving
Rohin Shah
Vladimir Mikulik
103
115
0
18 Jul 2023
Low-Rank Prune-And-Factorize for Language Model Compression
Siyu Ren
Kenny Q. Zhu
95
9
0
25 Jun 2023
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
Yixiao Li
Yifan Yu
Qingru Zhang
Chen Liang
Pengcheng He
Weizhu Chen
Tuo Zhao
132
76
0
20 Jun 2023
Efficient Alternating Minimization with Applications to Weighted Low Rank Approximation
Zhao Song
Mingquan Ye
Junze Yin
Licheng Zhang
59
7
0
07 Jun 2023
COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models
Jinqi Xiao
Miao Yin
Yu Gong
Xiao Zang
Jian Ren
Bo Yuan
VLM
ViT
134
9
0
26 May 2023
Sharing Low Rank Conformer Weights for Tiny Always-On Ambient Speech Recognition Models
Steven M. Hernandez
Ding Zhao
Shaojin Ding
A. Bruguier
Rohit Prabhavalkar
Tara N. Sainath
Yanzhang He
Ian McGraw
112
9
0
15 Mar 2023
TrojText: Test-time Invisible Textual Trojan Insertion
Qian Lou
Ye Liu
Bo Feng
145
27
0
03 Mar 2023
Revisiting Offline Compression: Going Beyond Factorization-based Methods for Transformer Language Models
Mohammadreza Banaei
Klaudia Bałazy
Artur Kasymov
R. Lebret
Jacek Tabor
Karl Aberer
OffRL
52
0
0
08 Feb 2023
Tensor Networks Meet Neural Networks: A Survey and Future Perspectives
Maolin Wang
Yu Pan
Zenglin Xu
Xiangli Yang
Guangxi Li
Andrzej Cichocki
210
22
0
22 Jan 2023
Is Smaller Always Faster? Tradeoffs in Compressing Self-Supervised Speech Transformers
Tzu-Quan Lin
Tsung-Huan Yang
Chun-Yao Chang
Kuang-Ming Chen
Tzu-hsun Feng
Hung-yi Lee
Hao Tang
84
6
0
17 Nov 2022
Numerical Optimizations for Weighted Low-rank Estimation on Language Model
Ting Hua
Yen-Chang Hsu
Felicity Wang
Qiang Lou
Yilin Shen
Hongxia Jin
89
15
0
02 Nov 2022
TrojViT: Trojan Insertion in Vision Transformers
Mengxin Zheng
Qian Lou
Lei Jiang
178
56
0
27 Aug 2022
Survey on Evolutionary Deep Learning: Principles, Algorithms, Applications and Open Issues
Nan Li
Lianbo Ma
Guo-Ding Yu
Bing Xue
Mengjie Zhang
Yaochu Jin
83
79
0
23 Aug 2022
Rank Diminishing in Deep Neural Networks
Ruili Feng
Kecheng Zheng
Yukun Huang
Deli Zhao
Michael I. Jordan
Zhengjun Zha
100
33
0
13 Jun 2022
Nonlinear Initialization Methods for Low-Rank Neural Networks
Kiran Vodrahalli
Rakesh Shivanna
M. Sathiamoorthy
Sagar Jain
Ed H. Chi
83
4
0
02 Feb 2022