2409.16997
Cited By
INT-FlashAttention: Enabling Flash Attention for INT8 Quantization
25 September 2024
Shimao Chen
Zirui Liu
Zhiying Wu
Ce Zheng
Peizhuang Cong
Zihan Jiang
Yuhan Wu
Lei Su
Tong Yang
MQ
VLM
Papers citing
"INT-FlashAttention: Enabling Flash Attention for INT8 Quantization"
1 / 1 papers shown
Title
QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache
Rishabh Tiwari
Haocheng Xi
Aditya Tomar
Coleman Hooper
Sehoon Kim
Maxwell Horton
Mahyar Najibi
Michael W. Mahoney
Kurt Keutzer
Amir Gholami
MQ
61
1
0
05 Feb 2025