Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs

Papers citing "Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs"

13 / 13 papers shown
Title
MoEfication: Transformer Feed-forward Layers are Mixtures of Experts
Zhengyan Zhang, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
05 Oct 2021