arXiv:2401.14110
Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators
25 January 2024
Yaniv Blumenfeld, Itay Hubara, Daniel Soudry
Papers citing "Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators" (5 of 5 papers shown)
1. Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs
   Qizhe Wu, Huawen Liang, Yuchen Gui, Zhichen Zeng, Z. He, Letian Zhao, Zhaoxi Zeng, W. Yuan, Wei Wu, Xi Jin
   08 Mar 2025

2. Accumulator-Aware Post-Training Quantization
   Ian Colbert, Fabian Grob, Giuseppe Franco, Jinjie Zhang, Rayan Saab
   MQ · 25 Sep 2024

3. A2Q+: Improving Accumulator-Aware Weight Quantization
   Ian Colbert, Alessandro Pappalardo, Jakoba Petri-Koenig, Yaman Umuroglu
   MQ · 19 Jan 2024

4. Overcoming Oscillations in Quantization-Aware Training
   Markus Nagel, Marios Fournarakis, Yelysei Bondarenko, Tijmen Blankevoort
   MQ · 21 Mar 2022

5. Pruning and Quantization for Deep Neural Network Acceleration: A Survey
   Tailin Liang, C. Glossner, Lei Wang, Shaobo Shi, Xiaotong Zhang
   MQ · 24 Jan 2021