Efficient LLM Training and Serving with Heterogeneous Context Sharding among Attention Heads

25 July 2024

Xia Song

Papers citing "Efficient LLM Training and Serving with Heterogeneous Context Sharding among Attention Heads"

Title
SnapKV: LLM Knows What You are Looking for Before Generation
Yuhong Li, Yingbing Huang, Bowen Yang, Bharat Venkitesh, Acyr Locatelli, Hanchen Ye, Tianle Cai, Patrick Lewis, Deming Chen
VLM | 79 160 0 | 22 Apr 2024

In-context Learning and Induction Heads
Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova DasSarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah
252 474 0 | 24 Sep 2022