
Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs
Papers citing "Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs"
13 / 13 papers shown
Title |
---|
MoEfication: Transformer Feed-forward Layers are Mixtures of Experts. Zhengyan Zhang, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou |