ResearchTrend.AI
Language model compression with weighted low-rank factorization

30 June 2022
Yen-Chang Hsu
Ting Hua
Sung-En Chang
Qiang Lou
Yilin Shen
Hongxia Jin

Papers citing "Language model compression with weighted low-rank factorization"

28 / 78 papers shown
A phase transition between positional and semantic learning in a solvable model of dot-product attention
Hugo Cui
Freya Behrens
Florent Krzakala
Lenka Zdeborová
MLT
98
16
0
06 Feb 2024
A Comprehensive Survey of Compression Algorithms for Language Models
Seungcheol Park
Jaehyeon Choi
Sojin Lee
U. Kang
MQ
118
16
0
27 Jan 2024
Harnessing Orthogonality to Train Low-Rank Neural Networks
D. Coquelin
Katharina Flügel
Marie Weiel
Nicholas Kiefer
Charlotte Debus
Achim Streit
Markus Goetz
86
1
0
16 Jan 2024
Fast Inference of Mixture-of-Experts Language Models with Offloading
Artyom Eliseev
Denis Mazur
MoE
121
44
0
28 Dec 2023
Towards Message Brokers for Generative AI: Survey, Challenges, and Opportunities
Alaa Saleh
Roberto Morabito
Sasu Tarkoma
Susanna Pirttikangas
Lauri Lovén
120
4
0
22 Dec 2023
DSFormer: Effective Compression of Text-Transformers by Dense-Sparse Weight Factorization
Rahul Chand
Yashoteja Prabhu
Pratyush Kumar
64
3
0
20 Dec 2023
A Survey of Reasoning with Foundation Models
Jiankai Sun
Chuanyang Zheng
Enze Xie
Zhengying Liu
Ruihang Chu
...
Xipeng Qiu
Yi-Chen Guo
Hui Xiong
Qun Liu
Zhenguo Li
ReLM LRM AI4CE
209
85
0
17 Dec 2023
ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models
Zhihang Yuan
Yuzhang Shang
Yue Song
Qiang Wu
Yan Yan
Guangyu Sun
MQ
127
61
0
10 Dec 2023
LQ-LoRA: Low-rank Plus Quantized Matrix Decomposition for Efficient Language Model Finetuning
Han Guo
P. Greengard
Eric P. Xing
Yoon Kim
MQ
148
57
0
20 Nov 2023
Differentiable Learning of Generalized Structured Matrices for Efficient Deep Neural Networks
Changwoo Lee
Hun-Seok Kim
59
3
0
29 Oct 2023
Neural Language Model Pruning for Automatic Speech Recognition
Leonardo Emili
Thiago Fraga-Silva
Ernest Pusateri
M. Nußbaum-Thom
Youssef Oualil
83
1
0
05 Oct 2023
LORD: Low Rank Decomposition Of Monolingual Code LLMs For One-Shot Compression
Ayush Kaushal
Tejas Vaidhya
Irina Rish
129
16
0
25 Sep 2023
Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs
Or Sharir
Anima Anandkumar
60
0
0
27 Jul 2023
Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla
Tom Lieberum
Matthew Rahtz
János Kramár
Neel Nanda
G. Irving
Rohin Shah
Vladimir Mikulik
103
115
0
18 Jul 2023
Low-Rank Prune-And-Factorize for Language Model Compression
Siyu Ren
Kenny Q. Zhu
95
9
0
25 Jun 2023
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
Yixiao Li
Yifan Yu
Qingru Zhang
Chen Liang
Pengcheng He
Weizhu Chen
Tuo Zhao
132
76
0
20 Jun 2023
Efficient Alternating Minimization with Applications to Weighted Low Rank Approximation
Zhao Song
Mingquan Ye
Junze Yin
Licheng Zhang
59
7
0
07 Jun 2023
COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models
Jinqi Xiao
Miao Yin
Yu Gong
Xiao Zang
Jian Ren
Bo Yuan
VLM ViT
134
9
0
26 May 2023
Sharing Low Rank Conformer Weights for Tiny Always-On Ambient Speech Recognition Models
Steven M. Hernandez
Ding Zhao
Shaojin Ding
A. Bruguier
Rohit Prabhavalkar
Tara N. Sainath
Yanzhang He
Ian McGraw
112
9
0
15 Mar 2023
TrojText: Test-time Invisible Textual Trojan Insertion
Qian Lou
Yepeng Liu
Bo Feng
145
27
0
03 Mar 2023
Revisiting Offline Compression: Going Beyond Factorization-based Methods for Transformer Language Models
Mohammadreza Banaei
Klaudia Bałazy
Artur Kasymov
R. Lebret
Jacek Tabor
Karl Aberer
OffRL
52
0
0
08 Feb 2023
Tensor Networks Meet Neural Networks: A Survey and Future Perspectives
Maolin Wang
Yu Pan
Zenglin Xu
Xiangli Yang
Guangxi Li
Andrzej Cichocki
210
22
0
22 Jan 2023
Is Smaller Always Faster? Tradeoffs in Compressing Self-Supervised Speech Transformers
Tzu-Quan Lin
Tsung-Huan Yang
Chun-Yao Chang
Kuang-Ming Chen
Tzu-hsun Feng
Hung-yi Lee
Hao Tang
84
6
0
17 Nov 2022
Numerical Optimizations for Weighted Low-rank Estimation on Language Model
Ting Hua
Yen-Chang Hsu
Felicity Wang
Qiang Lou
Yilin Shen
Hongxia Jin
89
15
0
02 Nov 2022
TrojViT: Trojan Insertion in Vision Transformers
Mengxin Zheng
Qian Lou
Lei Jiang
178
56
0
27 Aug 2022
Survey on Evolutionary Deep Learning: Principles, Algorithms, Applications and Open Issues
Nan Li
Lianbo Ma
Guo-Ding Yu
Bing Xue
Mengjie Zhang
Yaochu Jin
83
79
0
23 Aug 2022
Rank Diminishing in Deep Neural Networks
Ruili Feng
Kecheng Zheng
Yukun Huang
Deli Zhao
Michael I. Jordan
Zhengjun Zha
100
33
0
13 Jun 2022
Nonlinear Initialization Methods for Low-Rank Neural Networks
Kiran Vodrahalli
Rakesh Shivanna
M. Sathiamoorthy
Sagar Jain
Ed H. Chi
83
4
0
02 Feb 2022