Align Attention Heads Before Merging Them: An Effective Way for Converting MHA to GQA

31 December 2024
Qingyun Jin
Xiaohui Song
Feng Zhou
Zengchang Qin

Papers citing "Align Attention Heads Before Merging Them: An Effective Way for Converting MHA to GQA"

2 / 2 papers shown
On Pruning State-Space LLMs
Tamer Ghattas
Michael Hassid
Roy Schwartz
26 Feb 2025
SVDq: 1.25-bit and 410x Key Cache Compression for LLM Attention
Hong Yankun
Li Xing
Zhen Hui-Ling
Yu Xianzhi
Liu Wulong
Yuan Mingxuan
24 Feb 2025