ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.01799
  4. Cited By
Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward

Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward

2 February 2024
Arnav Chavan
Raghav Magazine
Shubham Kushwaha
M. Debbah
Deepak Gupta
ArXivPDFHTML

Papers citing "Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward"

17 / 17 papers shown
Title
From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap
From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap
Gopi Krishnan Rajbahadur
G. Oliva
Dayi Lin
Ahmed E. Hassan
96
1
0
28 Jan 2025
Grammar-based Game Description Generation using Large Language Models
Grammar-based Game Description Generation using Large Language Models
Tsunehiko Tanaka
Edgar Simo-Serra
92
2
0
24 Jul 2024
Fluctuation-based Adaptive Structured Pruning for Large Language Models
Fluctuation-based Adaptive Structured Pruning for Large Language Models
Yongqi An
Xu Zhao
Tao Yu
Ming Tang
Jinqiao Wang
91
52
0
19 Dec 2023
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language
  Models
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models
Wenqi Shao
Mengzhao Chen
Zhaoyang Zhang
Peng Xu
Lirui Zhao
Zhiqiang Li
Kaipeng Zhang
Peng Gao
Yu Qiao
Ping Luo
MQ
76
193
0
25 Aug 2023
LoSparse: Structured Compression of Large Language Models based on
  Low-Rank and Sparse Approximation
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
Yixiao Li
Yifan Yu
Qingru Zhang
Chen Liang
Pengcheng He
Weizhu Chen
Tuo Zhao
98
73
0
20 Jun 2023
Lion: Adversarial Distillation of Proprietary Large Language Models
Lion: Adversarial Distillation of Proprietary Large Language Models
Yuxin Jiang
Chunkit Chan
Yin Hua
Wei Wang
ALM
62
25
0
22 May 2023
Fast Inference from Transformers via Speculative Decoding
Fast Inference from Transformers via Speculative Decoding
Yaniv Leviathan
Matan Kalman
Yossi Matias
LRM
105
701
0
30 Nov 2022
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Tim Dettmers
M. Lewis
Younes Belkada
Luke Zettlemoyer
MQ
78
649
0
15 Aug 2022
Compression of Generative Pre-trained Language Models via Quantization
Compression of Generative Pre-trained Language Models via Quantization
Chaofan Tao
Lu Hou
Wei Zhang
Lifeng Shang
Xin Jiang
Qun Liu
Ping Luo
Ngai Wong
MQ
60
104
0
21 Mar 2022
Vision Transformer Slimming: Multi-Dimension Searching in Continuous
  Optimization Space
Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space
Arnav Chavan
Zhiqiang Shen
Zhuang Liu
Zechun Liu
Kwang-Ting Cheng
Eric P. Xing
ViT
81
70
0
03 Jan 2022
Learned Step Size Quantization
Learned Step Size Quantization
S. K. Esser
J. McKinstry
Deepika Bablani
R. Appuswamy
D. Modha
MQ
71
802
0
21 Feb 2019
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
Jonathan Frankle
Michael Carbin
225
3,461
0
09 Mar 2018
Learning Efficient Convolutional Networks through Network Slimming
Learning Efficient Convolutional Networks through Network Slimming
Zhuang Liu
Jianguo Li
Zhiqiang Shen
Gao Huang
Shoumeng Yan
Changshui Zhang
122
2,419
0
22 Aug 2017
Pruning Filters for Efficient ConvNets
Pruning Filters for Efficient ConvNets
Hao Li
Asim Kadav
Igor Durdanovic
H. Samet
H. Graf
3DPC
188
3,693
0
31 Aug 2016
Speeding-up Convolutional Neural Networks Using Fine-tuned
  CP-Decomposition
Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition
V. Lebedev
Yaroslav Ganin
M. Rakhuba
Ivan Oseledets
Victor Lempitsky
61
884
0
19 Dec 2014
Speeding up Convolutional Neural Networks with Low Rank Expansions
Speeding up Convolutional Neural Networks with Low Rank Expansions
Max Jaderberg
Andrea Vedaldi
Andrew Zisserman
128
1,463
0
15 May 2014
Exploiting Linear Structure Within Convolutional Networks for Efficient
  Evaluation
Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation
Emily L. Denton
Wojciech Zaremba
Joan Bruna
Yann LeCun
Rob Fergus
FAtt
177
1,689
0
02 Apr 2014
1