ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2308.07633
  4. Cited By
A Survey on Model Compression for Large Language Models

A Survey on Model Compression for Large Language Models

15 August 2023
Xunyu Zhu
Jian Li
Yong Liu
Can Ma
Weiping Wang
ArXivPDFHTML

Papers citing "A Survey on Model Compression for Large Language Models"

43 / 143 papers shown
Title
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Saleh Ashkboos
Maximilian L. Croci
Marcelo Gennari do Nascimento
Torsten Hoefler
James Hensman
VLM
132
145
0
26 Jan 2024
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric
  Algorithm-System Co-Design
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
Haojun Xia
Zhen Zheng
Xiaoxia Wu
Shiyang Chen
Zhewei Yao
...
Donglin Zhuang
Zhongzhu Zhou
Olatunji Ruwase
Yuxiong He
Shuaiwen Leon Song
MQ
35
14
0
25 Jan 2024
Distilling Mathematical Reasoning Capabilities into Small Language
  Models
Distilling Mathematical Reasoning Capabilities into Small Language Models
Xunyu Zhu
Jian Li
Yong Liu
Can Ma
Weiping Wang
LRM
32
9
0
22 Jan 2024
IoT in the Era of Generative AI: Vision and Challenges
IoT in the Era of Generative AI: Vision and Challenges
Xin Wang
Zhongwei Wan
Arvin Hekmati
M. Zong
Samiul Alam
Mi Zhang
Bhaskar Krishnamachari
32
15
0
03 Jan 2024
LLMs with User-defined Prompts as Generic Data Operators for Reliable
  Data Processing
LLMs with User-defined Prompts as Generic Data Operators for Reliable Data Processing
Luyi Ma
Nikhil Thakurdesai
Jiaoayan Chen
Jianpeng Xu
Evren Körpeoglu
Sushant Kumar
Kannan Achan
AI4CE
19
0
0
26 Dec 2023
Towards Message Brokers for Generative AI: Survey, Challenges, and
  Opportunities
Towards Message Brokers for Generative AI: Survey, Challenges, and Opportunities
Alaa Saleh
Roberto Morabito
Sasu Tarkoma
Susanna Pirttikangas
Lauri Lovén
58
3
0
22 Dec 2023
A Performance Evaluation of a Quantized Large Language Model on Various
  Smartphones
A Performance Evaluation of a Quantized Large Language Model on Various Smartphones
Tolga Çöplü
Marc Loedi
Arto Bendiken
Mykhailo Makohin
Joshua J. Bouw
Stephen Cobb
MQ
16
5
0
19 Dec 2023
Large Language Models Empowered Agent-based Modeling and Simulation: A
  Survey and Perspectives
Large Language Models Empowered Agent-based Modeling and Simulation: A Survey and Perspectives
Chen Gao
Xiaochong Lan
Nian Li
Yuan Yuan
Jingtao Ding
Zhilun Zhou
Fengli Xu
Yong Li
LLMAG
AI4CE
LM&Ro
41
103
0
19 Dec 2023
Urban Generative Intelligence (UGI): A Foundational Platform for Agents
  in Embodied City Environment
Urban Generative Intelligence (UGI): A Foundational Platform for Agents in Embodied City Environment
Fengli Xu
Jun Zhang
Chen Gao
J. Feng
Yong Li
AI4CE
LLMAG
24
28
0
19 Dec 2023
ASVD: Activation-aware Singular Value Decomposition for Compressing
  Large Language Models
ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models
Zhihang Yuan
Yuzhang Shang
Yue Song
Qiang Wu
Yan Yan
Guangyu Sun
MQ
32
42
0
10 Dec 2023
Green Edge AI: A Contemporary Survey
Green Edge AI: A Contemporary Survey
Yuyi Mao
X. Yu
Kaibin Huang
Ying-Jun Angela Zhang
Jun Zhang
33
17
0
01 Dec 2023
Robot Learning in the Era of Foundation Models: A Survey
Robot Learning in the Era of Foundation Models: A Survey
Xuan Xiao
Jiahang Liu
Zhipeng Wang
Yanmin Zhou
Yong Qi
Qian Cheng
Bin He
Shuo Jiang
AI4CE
LM&Ro
23
26
0
24 Nov 2023
Applications of Large Scale Foundation Models for Autonomous Driving
Applications of Large Scale Foundation Models for Autonomous Driving
Yu Huang
Yue Chen
Zhu Li
ELM
AI4CE
LRM
ALM
LM&Ro
61
15
0
20 Nov 2023
A Speed Odyssey for Deployable Quantization of LLMs
A Speed Odyssey for Deployable Quantization of LLMs
Qingyuan Li
Ran Meng
Yiduo Li
Bo-Wen Zhang
Liang Li
Yifan Lu
Xiangxiang Chu
Yerui Sun
Yuchen Xie
MQ
56
7
0
16 Nov 2023
Sequencing Matters: A Generate-Retrieve-Generate Model for Building
  Conversational Agents
Sequencing Matters: A Generate-Retrieve-Generate Model for Building Conversational Agents
Quinn Patwardhan
Grace Hui Yang
13
0
0
16 Nov 2023
Towards the Law of Capacity Gap in Distilling Language Models
Towards the Law of Capacity Gap in Distilling Language Models
Chen Zhang
Dawei Song
Zheyu Ye
Yan Gao
ELM
30
20
0
13 Nov 2023
Attribution Patching Outperforms Automated Circuit Discovery
Attribution Patching Outperforms Automated Circuit Discovery
Aaquib Syed
Can Rager
Arthur Conmy
68
56
0
16 Oct 2023
ReLU Strikes Back: Exploiting Activation Sparsity in Large Language
  Models
ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models
Iman Mirzadeh
Keivan Alizadeh-Vahid
Sachin Mehta
C. C. D. Mundo
Oncel Tuzel
Golnoosh Samei
Mohammad Rastegari
Mehrdad Farajtabar
123
60
0
06 Oct 2023
PB-LLM: Partially Binarized Large Language Models
PB-LLM: Partially Binarized Large Language Models
Yuzhang Shang
Zhihang Yuan
Qiang Wu
Zhen Dong
MQ
13
43
0
29 Sep 2023
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
Yuhui Xu
Lingxi Xie
Xiaotao Gu
Xin Chen
Heng Chang
Hengheng Zhang
Zhensu Chen
Xiaopeng Zhang
Qi Tian
MQ
13
89
0
26 Sep 2023
LORD: Low Rank Decomposition Of Monolingual Code LLMs For One-Shot
  Compression
LORD: Low Rank Decomposition Of Monolingual Code LLMs For One-Shot Compression
Ayush Kaushal
Tejas Vaidhya
Irina Rish
52
15
0
25 Sep 2023
A Survey on Image-text Multimodal Models
A Survey on Image-text Multimodal Models
Ruifeng Guo
Jingxuan Wei
Linzhuang Sun
Khai Le-Duc
Guiyong Chang
Dawei Liu
Sibo Zhang
Zhengbing Yao
Mingjun Xu
Liping Bu
VLM
31
5
0
23 Sep 2023
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large
  Language Models for Dynamic Inference
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference
Parsa Kavehzadeh
Mojtaba Valipour
Marzieh S. Tahaei
Ali Ghodsi
Boxing Chen
Mehdi Rezagholizadeh
33
6
0
16 Sep 2023
MixBCT: Towards Self-Adapting Backward-Compatible Training
MixBCT: Towards Self-Adapting Backward-Compatible Training
Yuefeng Liang
Yufeng Zhang
Shiliang Zhang
Yaowei Wang
Shengze Xiao
KenLi Li
Xiaoyu Wang
24
1
0
14 Aug 2023
Large Language Models and Foundation Models in Smart Agriculture:
  Basics, Opportunities, and Challenges
Large Language Models and Foundation Models in Smart Agriculture: Basics, Opportunities, and Challenges
Jiajia Li
Mingle Xu
Lirong Xiang
Dong Chen
Weichao Zhuang
Xunyuan Yin
Zhao Li
36
3
0
13 Aug 2023
A Comprehensive Overview of Large Language Models
A Comprehensive Overview of Large Language Models
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Ajmal Saeed Mian
OffRL
70
525
0
12 Jul 2023
Structural Pruning for Diffusion Models
Structural Pruning for Diffusion Models
Gongfan Fang
Xinyin Ma
Xinchao Wang
25
127
0
18 May 2023
Distilling Step-by-Step! Outperforming Larger Language Models with Less
  Training Data and Smaller Model Sizes
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Lokesh Nagalapatti
Chun-Liang Li
Chih-Kuan Yeh
Hootan Nakhost
Yasuhisa Fujii
Alexander Ratner
Ranjay Krishna
Chen-Yu Lee
Tomas Pfister
ALM
220
499
0
03 May 2023
SCOTT: Self-Consistent Chain-of-Thought Distillation
SCOTT: Self-Consistent Chain-of-Thought Distillation
Jamie Yap
Zhengyang Wang
Zheng Li
K. Lynch
Bing Yin
Xiang Ren
LRM
61
93
0
03 May 2023
Towards Automated Circuit Discovery for Mechanistic Interpretability
Towards Automated Circuit Discovery for Mechanistic Interpretability
Arthur Conmy
Augustine N. Mavor-Parker
Aengus Lynch
Stefan Heimersheim
Adrià Garriga-Alonso
23
279
0
28 Apr 2023
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale
  Instructions
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
Minghao Wu
Abdul Waheed
Chiyu Zhang
Muhammad Abdul-Mageed
Alham Fikri Aji
ALM
132
119
0
27 Apr 2023
ZeroQuant-V2: Exploring Post-training Quantization in LLMs from
  Comprehensive Study to Low Rank Compensation
ZeroQuant-V2: Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation
Z. Yao
Xiaoxia Wu
Cheng-rong Li
Stephen Youn
Yuxiong He
MQ
63
57
0
15 Mar 2023
DepGraph: Towards Any Structural Pruning
DepGraph: Towards Any Structural Pruning
Gongfan Fang
Xinyin Ma
Mingli Song
Michael Bi Mi
Xinchao Wang
GNN
91
257
0
30 Jan 2023
Language Models are Multilingual Chain-of-Thought Reasoners
Language Models are Multilingual Chain-of-Thought Reasoners
Freda Shi
Mirac Suzgun
Markus Freitag
Xuezhi Wang
Suraj Srivats
...
Yi Tay
Sebastian Ruder
Denny Zhou
Dipanjan Das
Jason W. Wei
ReLM
LRM
172
327
0
06 Oct 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
314
3,248
0
21 Mar 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
313
11,953
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
367
8,495
0
28 Jan 2022
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for
  Code Understanding and Generation
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
Yue Wang
Weishi Wang
Shafiq R. Joty
S. Hoi
235
1,489
0
02 Sep 2021
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit
  Reasoning Strategies
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
250
673
0
06 Jan 2021
Scaling Laws for Neural Language Models
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
243
4,469
0
23 Jan 2020
Language Models as Knowledge Bases?
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktaschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
415
2,586
0
03 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018
Neural Architecture Search with Reinforcement Learning
Neural Architecture Search with Reinforcement Learning
Barret Zoph
Quoc V. Le
271
5,329
0
05 Nov 2016
Previous
123