Mamba-PTQ: Outlier Channels in Recurrent Large Language Models

17 July 2024
Alessandro Pierro
Steven Abreu
MQ, Mamba

Papers citing "Mamba-PTQ: Outlier Channels in Recurrent Large Language Models"

8 / 8 papers shown
FlattenQuant: Breaking Through the Inference Compute-bound for Large Language Models with Per-tensor Quantization
Yi Zhang
Fei Yang
Shuang Peng
Fangyu Wang
Aimin Pan
MQ
69
2
0
28 Feb 2024
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
235
607
0
22 May 2023
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Tim Dettmers
M. Lewis
Younes Belkada
Luke Zettlemoyer
MQ
103
662
0
15 Aug 2022
Do Long-Range Language Models Actually Use Long-Range Context?
Simeng Sun
Kalpesh Krishna
Andrew Mattarella-Micke
Mohit Iyyer
RALM
72
84
0
19 Sep 2021
Hardware Aware Training for Efficient Keyword Spotting on General Purpose and Specialized Hardware
Peter Blouw
G. Malik
Benjamin Morcos
Aaron R. Voelker
C. Eliasmith
55
21
0
09 Sep 2020
HiPPO: Recurrent Memory with Optimal Polynomial Projections
Albert Gu
Tri Dao
Stefano Ermon
Atri Rudra
Christopher Ré
123
533
0
17 Aug 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.1K
7,196
0
20 Apr 2018
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
MQ
164
3,141
0
15 Dec 2017