
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead
arXiv:2407.00066
17 June 2024
Rickard Brüel-Gabrielsson
Jiacheng Zhu
Onkar Bhardwaj
Leshem Choshen
Kristjan Greenewald
Mikhail Yurochkin
Justin Solomon

Papers citing "Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead"

2 of 2 citing papers shown.
RaSA: Rank-Sharing Low-Rank Adaptation
Zhiwei He, Zhaopeng Tu, Xing Wang, Xingyu Chen, Zekun Wang, Jiahao Xu, Tian Liang, Wenxiang Jiao, Z. Zhang, Rui Wang
ALM · 16 Mar 2025
Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention
Emily Xiao, Chin-Jou Li, Yilin Zhang, Graham Neubig, Amanda Bertsch
BDL · 11 Mar 2025