2402.01799
Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward
2 February 2024
Arnav Chavan
Raghav Magazine
Shubham Kushwaha
M. Debbah
Deepak Gupta
Papers citing "Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward"
17 / 17 papers shown
Title
From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap
Gopi Krishnan Rajbahadur
G. Oliva
Dayi Lin
Ahmed E. Hassan
96
1
0
28 Jan 2025
Grammar-based Game Description Generation using Large Language Models
Tsunehiko Tanaka
Edgar Simo-Serra
92
2
0
24 Jul 2024
Fluctuation-based Adaptive Structured Pruning for Large Language Models
Yongqi An
Xu Zhao
Tao Yu
Ming Tang
Jinqiao Wang
91
52
0
19 Dec 2023
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models
Wenqi Shao
Mengzhao Chen
Zhaoyang Zhang
Peng Xu
Lirui Zhao
Zhiqiang Li
Kaipeng Zhang
Peng Gao
Yu Qiao
Ping Luo
MQ
76
193
0
25 Aug 2023
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
Yixiao Li
Yifan Yu
Qingru Zhang
Chen Liang
Pengcheng He
Weizhu Chen
Tuo Zhao
98
73
0
20 Jun 2023
Lion: Adversarial Distillation of Proprietary Large Language Models
Yuxin Jiang
Chunkit Chan
Yin Hua
Wei Wang
ALM
62
25
0
22 May 2023
Fast Inference from Transformers via Speculative Decoding
Yaniv Leviathan
Matan Kalman
Yossi Matias
LRM
105
701
0
30 Nov 2022
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Tim Dettmers
M. Lewis
Younes Belkada
Luke Zettlemoyer
MQ
78
649
0
15 Aug 2022
Compression of Generative Pre-trained Language Models via Quantization
Chaofan Tao
Lu Hou
Wei Zhang
Lifeng Shang
Xin Jiang
Qun Liu
Ping Luo
Ngai Wong
MQ
60
104
0
21 Mar 2022
Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space
Arnav Chavan
Zhiqiang Shen
Zhuang Liu
Zechun Liu
Kwang-Ting Cheng
Eric P. Xing
ViT
81
70
0
03 Jan 2022
Learned Step Size Quantization
S. K. Esser
J. McKinstry
Deepika Bablani
R. Appuswamy
D. Modha
MQ
71
802
0
21 Feb 2019
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
Jonathan Frankle
Michael Carbin
225
3,461
0
09 Mar 2018
Learning Efficient Convolutional Networks through Network Slimming
Zhuang Liu
Jianguo Li
Zhiqiang Shen
Gao Huang
Shoumeng Yan
Changshui Zhang
122
2,419
0
22 Aug 2017
Pruning Filters for Efficient ConvNets
Hao Li
Asim Kadav
Igor Durdanovic
H. Samet
H. Graf
3DPC
188
3,693
0
31 Aug 2016
Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition
V. Lebedev
Yaroslav Ganin
M. Rakhuba
Ivan Oseledets
Victor Lempitsky
61
884
0
19 Dec 2014
Speeding up Convolutional Neural Networks with Low Rank Expansions
Max Jaderberg
Andrea Vedaldi
Andrew Zisserman
128
1,463
0
15 May 2014
Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation
Emily L. Denton
Wojciech Zaremba
Joan Bruna
Yann LeCun
Rob Fergus
FAtt
177
1,689
0
02 Apr 2014