2407.18921
Cited By
Mobile Edge Intelligence for Large Language Models: A Contemporary Survey
9 July 2024
Guanqiao Qu
Qiyuan Chen
Wei Wei
Zheng Lin
Xianhao Chen
Kaibin Huang
Papers citing
"Mobile Edge Intelligence for Large Language Models: A Contemporary Survey"
50 / 211 papers shown
Title
FedAC: An Adaptive Clustered Federated Learning Framework for Heterogeneous Data
Yuxin Zhang
Haoyu Chen
Zheng Lin
Zhe Chen
Jin Zhao
FedML
95
21
0
25 Mar 2024
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
Zeyu Han
Chao Gao
Jinyang Liu
Jeff Zhang
Sai Qian Zhang
300
403
0
21 Mar 2024
AdaptSFL: Adaptive Split Federated Learning in Resource-constrained Edge Networks
Zheng Lin
Guanqiao Qu
Wei Wei
Xianhao Chen
Kin K. Leung
126
51
0
19 Mar 2024
Teach LLMs to Phish: Stealing Private Information from Language Models
Ashwinee Panda
Christopher A. Choquette-Choo
Zhengming Zhang
Yaoqing Yang
Prateek Mittal
PILM
110
26
0
01 Mar 2024
LLM Inference Unveiled: Survey and Roofline Model Insights
Zhihang Yuan
Yuzhang Shang
Yang Zhou
Zhen Dong
Zhe Zhou
...
Yong Jae Lee
Yan Yan
Beidi Chen
Guangyu Sun
Kurt Keutzer
233
91
0
26 Feb 2024
ESFL: Efficient Split Federated Learning over Resource-Constrained Heterogeneous Wireless Devices
Guangyu Zhu
Yiqin Deng
Xianhao Chen
Haixia Zhang
Yuguang Fang
Tan F. Wong
FedML
61
10
0
24 Feb 2024
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
Ziheng Jiang
Yanghua Peng
Yinmin Zhong
Qi Huang
Yangrui Chen
...
Zhe Li
X. Jia
Jia-jun Ye
Xin Jin
Xin Liu
LRM
126
122
0
23 Feb 2024
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Zechun Liu
Changsheng Zhao
Forrest N. Iandola
Chen Lai
Yuandong Tian
...
Ernie Chang
Yangyang Shi
Raghuraman Krishnamoorthi
Liangzhen Lai
Vikas Chandra
ALM
137
102
0
22 Feb 2024
A Survey on Knowledge Distillation of Large Language Models
Xiaohan Xu
Ming Li
Chongyang Tao
Tao Shen
Reynold Cheng
Jinyang Li
Can Xu
Dacheng Tao
Tianyi Zhou
KELM
VLM
173
133
0
20 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomas Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
246
425
0
09 Feb 2024
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
Zirui Liu
Jiayi Yuan
Hongye Jin
Shaochen Zhong
Zhaozhuo Xu
Vladimir Braverman
Beidi Chen
Xia Hu
MQ
111
204
0
05 Feb 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Yichao Fu
Peter Bailis
Ion Stoica
Hao Zhang
202
164
0
03 Feb 2024
A Survey on Generative AI and LLM for Video Generation, Understanding, and Streaming
Pengyuan Zhou
Lin Wang
Zhi Liu
Yanbin Hao
Pan Hui
Sasu Tarkoma
J. Kangasharju
VGen
112
30
0
30 Jan 2024
Large Multi-Modal Models (LMMs) as Universal Foundation Models for AI-Native Wireless Systems
Shengzhe Xu
Christo Kurisummoottil Thomas
Omar Hashash
Nikhil Muralidhar
Walid Saad
Naren Ramakrishnan
92
26
0
30 Jan 2024
ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models
Yao Fu
Leyang Xue
Yeqi Huang
Andrei-Octavian Brabete
Dmitrii Ustiugov
Yuvraj Patel
Luo Mai
72
6
0
25 Jan 2024
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
Tianle Cai
Yuhong Li
Zhengyang Geng
Hongwu Peng
Jason D. Lee
Deming Chen
Tri Dao
189
314
0
19 Jan 2024
A Survey on Hardware Accelerators for Large Language Models
C. Kachris
74
15
0
18 Jan 2024
When Large Language Model Agents Meet 6G Networks: Perception, Grounding, and Alignment
Minrui Xu
Dusit Niyato
Jiawen Kang
Zehui Xiong
Shiwen Mao
Zhu Han
Dong In Kim
K. B. Letaief
LLMAG
103
45
0
15 Jan 2024
Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security
Yuanchun Li
Hao Wen
Weijun Wang
Xiangyu Li
Yizhen Yuan
...
Zhijun Li
Peng Li
Yang Liu
Yaqiong Zhang
Yunxin Liu
LLMAG
102
190
0
10 Jan 2024
FFSplit: Split Feed-Forward Network For Optimizing Accuracy-Efficiency Trade-off in Language Model Inference
Zirui Liu
Qingquan Song
Q. Xiao
Sathiya Keerthi Selvaraj
Rahul Mazumder
Aman Gupta
Xia Hu
81
4
0
08 Jan 2024
Training and Serving System of Foundation Models: A Comprehensive Survey
Jiahang Zhou
Yanyu Chen
Zicong Hong
Wuhui Chen
Yue Yu
Tao Zhang
Hui Wang
Chuan-fu Zhang
Zibin Zheng
ALM
94
11
0
05 Jan 2024
Understanding LLMs: A Comprehensive Overview from Training to Inference
Yi-Hsueh Liu
Haoyang He
Tianle Han
Xu-Yao Zhang
Mengyuan Liu
...
Xintao Hu
Tuo Zhang
Ning Qiang
Tianming Liu
Bao Ge
SyDa
157
79
0
04 Jan 2024
Collaborative Perception for Connected and Autonomous Driving: Challenges, Possible Solutions and Opportunities
Senkang Hu
Zhengru Fang
Yiqin Deng
Xianhao Chen
Yuguang Fang
119
29
0
03 Jan 2024
Cloud-Device Collaborative Learning for Multimodal Large Language Models
Guanqun Wang
Jiaming Liu
Chenxuan Li
Junpeng Ma
Yuan Zhang
...
Kevin Zhang
Maurice Chong
Ray Zhang
Yijiang Liu
Shanghang Zhang
109
8
0
26 Dec 2023
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
Xupeng Miao
Gabriele Oliaro
Zhihao Zhang
Xinhao Cheng
Hongyi Jin
Tianqi Chen
Zhihao Jia
138
86
0
23 Dec 2023
SPT: Fine-Tuning Transformer-based Language Models Efficiently with Sparsification
Yuntao Gui
Xiao Yan
Peiqi Yin
Han Yang
James Cheng
89
2
0
16 Dec 2023
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
Yixin Song
Zeyu Mi
Haotong Xie
Haibo Chen
BDL
178
135
0
16 Dec 2023
DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving
Wenhai Wang
Jiangwei Xie
ChuanYang Hu
Haoming Zou
Jianan Fan
...
Lewei Lu
Xizhou Zhu
Xiaogang Wang
Yu Qiao
Jifeng Dai
92
146
0
14 Dec 2023
Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning
Xijie Huang
Li Lyna Zhang
Kwang-Ting Cheng
Fan Yang
Mao Yang
LRM
ReLM
94
13
0
14 Dec 2023
Distributed Inference and Fine-tuning of Large Language Models Over The Internet
Alexander Borzunov
Max Ryabinin
Artem Chumachenko
Dmitry Baranchuk
Tim Dettmers
Younes Belkada
Pavel Samygin
Colin Raffel
MoE
ALM
65
42
0
13 Dec 2023
SparQ Attention: Bandwidth-Efficient LLM Inference
Luka Ribar
Ivan Chelombiev
Luke Hudlass-Galley
Charlie Blake
Carlo Luschi
Douglas Orr
156
54
0
08 Dec 2023
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism
Yanxi Chen
Xuchen Pan
Yaliang Li
Bolin Ding
Jingren Zhou
LRM
101
33
0
08 Dec 2023
Efficient Large Language Models: A Survey
Zhongwei Wan
Xin Wang
Che Liu
Samiul Alam
Yu Zheng
...
Shen Yan
Yi Zhu
Quanlu Zhang
Mosharaf Chowdhury
Mi Zhang
LM&MA
81
137
0
06 Dec 2023
Green Edge AI: A Contemporary Survey
Yuyi Mao
X. Yu
Kaibin Huang
Ying-Jun Angela Zhang
Jun Zhang
127
21
0
01 Dec 2023
Multimodal Large Language Models: A Survey
Jiayang Wu
Wensheng Gan
Zefeng Chen
Shicheng Wan
Philip S. Yu
95
195
0
22 Nov 2023
A Survey on Multimodal Large Language Models for Autonomous Driving
Can Cui
Yunsheng Ma
Xu Cao
Wenqian Ye
Yang Zhou
...
Xinrui Yan
Shuqi Mei
Jianguo Cao
Ziran Wang
Chao Zheng
169
290
0
21 Nov 2023
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Peng Jin
Ryuichi Takanobu
Caiwan Zhang
Xiaochun Cao
Li-ming Yuan
MLLM
140
249
0
14 Nov 2023
On the Opportunities of Green Computing: A Survey
You Zhou
Xiujing Lin
Xiang Zhang
Maolin Wang
Gangwei Jiang
...
Chao Mou
Shuai Han
Wuxia Jin
Guannan Zhang
Xiaodong Zeng
79
10
0
01 Nov 2023
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Zichang Liu
Jue Wang
Tri Dao
Tianyi Zhou
Binhang Yuan
...
Anshumali Shrivastava
Ce Zhang
Yuandong Tian
Christopher Ré
Beidi Chen
BDL
123
221
0
26 Oct 2023
Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization
Tianshi Che
Ji Liu
Yang Zhou
Jiaxiang Ren
Jiwen Zhou
Victor S. Sheng
H. Dai
Dejing Dou
96
56
0
23 Oct 2023
Λ-Split: A Privacy-Preserving Split Computing Framework for Cloud-Powered Generative AI
Shoki Ohta
Takayuki Nishio
146
5
0
23 Oct 2023
FATE-LLM: A Industrial Grade Federated Learning Framework for Large Language Models
Tao Fan
Yan Kang
Guoqiang Ma
Weijing Chen
Wenbin Wei
Lixin Fan
Qiang Yang
97
65
0
16 Oct 2023
QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models
Jing Liu
Ruihao Gong
Xiuying Wei
Zhiwei Dong
Jianfei Cai
Bohan Zhuang
MQ
88
53
0
12 Oct 2023
Compressing Context to Enhance Inference Efficiency of Large Language Models
Yucheng Li
Bo Dong
Chenghua Lin
Frank Guerin
63
73
0
09 Oct 2023
LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models
Huiqiang Jiang
Qianhui Wu
Chin-Yew Lin
Yuqing Yang
Lili Qiu
109
118
0
09 Oct 2023
Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly
Herbert Woisetschläger
Alexander Erben
Shiqiang Wang
R. Mayer
Hans-Arno Jacobsen
FedML
91
20
0
04 Oct 2023
BTR: Binary Token Representations for Efficient Retrieval Augmented Language Models
Qingqing Cao
Sewon Min
Yizhong Wang
Hannaneh Hajishirzi
MQ
RALM
81
4
0
02 Oct 2023
Efficient Streaming Language Models with Attention Sinks
Guangxuan Xiao
Yuandong Tian
Beidi Chen
Song Han
Mike Lewis
AI4TS
RALM
165
791
0
29 Sep 2023
Pushing Large Language Models to the 6G Edge: Vision, Challenges, and Opportunities
Zheng Lin
Guanqiao Qu
Qiyuan Chen
Xianhao Chen
Zhe Chen
Kaibin Huang
105
98
0
28 Sep 2023
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Wenhua Cheng
Weiwei Zhang
Haihao Shen
Yiyang Cai
Xin He
Kaokao Lv
Yi Liu
MQ
160
25
0
11 Sep 2023