INT-FlashAttention: Enabling Flash Attention for INT8 Quantization

25 September 2024 · arXiv:2409.16997
Shimao Chen, Zirui Liu, Zhiying Wu, Ce Zheng, Peizhuang Cong, Zihan Jiang, Yuhan Wu, Lei Su, Tong Yang
Tags: MQ, VLM

Papers citing "INT-FlashAttention: Enabling Flash Attention for INT8 Quantization"

1 / 1 papers shown
QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache
Rishabh Tiwari, Haocheng Xi, Aditya Tomar, Coleman Hooper, Sehoon Kim, Maxwell Horton, Mahyar Najibi, Michael W. Mahoney, Kurt Keutzer, Amir Gholami
Tags: MQ
05 Feb 2025