arXiv: 2405.14636

PerLLM: Personalized Inference Scheduling with Edge-Cloud Collaboration for Diverse LLM Services

23 May 2024
Zheming Yang
Yuanhao Yang
Chang Zhao
Qi Guo
Wenkai He
Wen Ji

Papers citing "PerLLM: Personalized Inference Scheduling with Edge-Cloud Collaboration for Diverse LLM Services"

8 / 8 papers shown
Task-Oriented Semantic Communication in Large Multimodal Models-based Vehicle Networks
Baoxia Du
H. Du
Dusit Niyato
Ruidong Li
58
0
0
05 May 2025
Taming the Titans: A Survey of Efficient LLM Inference Serving
Ranran Zhen
J. Li
Yixin Ji
Zhengyuan Yang
Tong Liu
Qingrong Xia
Xinyu Duan
Zehao Wang
Baoxing Huai
Hao Fei
LLMAG
77
0
0
28 Apr 2025
DeServe: Towards Affordable Offline LLM Inference via Decentralization
Linyu Wu
Xiaoyuan Liu
Tianneng Shi
Zhe Ye
D. Song
OffRL
42
0
0
28 Jan 2025
SPA: Towards A Computational Friendly Cloud-Base and On-Devices Collaboration Seq2seq Personalized Generation
Yanming Liu
Xinyue Peng
Jiannan Cao
Le Dai
Xingzu Liu
Mingbang Wang
Weihao Liu
SyDa
44
2
0
11 Mar 2024
A Survey on Effective Invocation Methods of Massive LLM Services
Can Wang
Bolin Zhang
Dianbo Sui
Zhiying Tu
Xiaoyu Liu
Jiabao Kang
34
6
0
05 Feb 2024
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
Jingfeng Yang
Hongye Jin
Ruixiang Tang
Xiaotian Han
Qizhang Feng
Haoming Jiang
Bing Yin
Xia Hu
LM&MA
137
626
0
26 Apr 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Ying Sheng
Lianmin Zheng
Binhang Yuan
Zhuohan Li
Max Ryabinin
...
Joseph E. Gonzalez
Percy Liang
Christopher Ré
Ion Stoica
Ce Zhang
149
369
0
13 Mar 2023
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
289
1,524
0
27 Feb 2021