Mamba-PTQ: Outlier Channels in Recurrent Large Language Models
Alessandro Pierro, Steven Abreu
arXiv: 2407.12397 · 17 July 2024
Tags: MQ, Mamba
Papers citing "Mamba-PTQ: Outlier Channels in Recurrent Large Language Models" (8 of 8 shown):
FlattenQuant: Breaking Through the Inference Compute-bound for Large Language Models with Per-tensor Quantization
Yi Zhang, Fei Yang, Shuang Peng, Fangyu Wang, Aimin Pan
MQ · 28 Feb 2024
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng, Eric Alcaide, Quentin G. Anthony, Alon Albalak, Samuel Arcadinho, ..., Qihang Zhao, P. Zhou, Qinghua Zhou, Jian Zhu, Rui-Jie Zhu
22 May 2023
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Tim Dettmers, M. Lewis, Younes Belkada, Luke Zettlemoyer
MQ · 15 Aug 2022
Do Long-Range Language Models Actually Use Long-Range Context?
Simeng Sun, Kalpesh Krishna, Andrew Mattarella-Micke, Mohit Iyyer
RALM · 19 Sep 2021
Hardware Aware Training for Efficient Keyword Spotting on General Purpose and Specialized Hardware
Peter Blouw, G. Malik, Benjamin Morcos, Aaron R. Voelker, C. Eliasmith
9 Sep 2020
HiPPO: Recurrent Memory with Optimal Polynomial Projections
Albert Gu, Tri Dao, Stefano Ermon, Atri Rudra, Christopher Ré
17 Aug 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM · 20 Apr 2018
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
Benoit Jacob, S. Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew G. Howard, Hartwig Adam, Dmitry Kalenichenko
MQ · 15 Dec 2017