arXiv:2206.15463
QUIDAM: A Framework for Quantization-Aware DNN Accelerator and Model Co-Exploration
30 June 2022
A. Inci
Siri Garudanagiri Virupaksha
Aman Jain
Ting-Wu Chin
Venkata Vivek Thallam
Ruizhou Ding
Diana Marculescu
MQ
Papers citing "QUIDAM: A Framework for Quantization-Aware DNN Accelerator and Model Co-Exploration"
32 / 32 papers shown
Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization
Minsu Kim
Seongmin Hong
RyeoWook Ko
S. Choi
Hunjong Lee
Junsoo Kim
Joo-Young Kim
Jongse Park
113
0
0
24 Mar 2025
Efficient Deep Learning Using Non-Volatile Memory Technology
A. Inci
Mehmet Meric Isgenc
Diana Marculescu
98
3
0
27 Jun 2022
QADAM: Quantization-Aware DNN Accelerator Modeling for Pareto-Optimality
A. Inci
Siri Garudanagiri Virupaksha
Aman Jain
Venkata Vivek Thallam
Ruizhou Ding
Diana Marculescu
MQ
60
2
0
20 May 2022
QAPPA: Quantization-Aware Power, Performance, and Area Modeling of DNN Accelerators
A. Inci
Siri Garudanagiri Virupaksha
Aman Jain
Venkata Vivek Thallam
Ruizhou Ding
Diana Marculescu
MQ
54
5
0
17 May 2022
Rethinking Co-design of Neural Architectures and Hardware Accelerators
Yanqi Zhou
Xuanyi Dong
Berkin Akin
Mingxing Tan
Daiyi Peng
Tianjian Meng
Amir Yazdanbakhsh
Da Huang
Ravi Narayanaswami
James Laudon
141
26
0
17 Feb 2021
DeepNVM++: Cross-Layer Modeling and Optimization Framework of Non-Volatile Memories for Deep Learning
A. Inci
Mehmet Meric Isgenc
Diana Marculescu
70
20
0
08 Dec 2020
The Architectural Implications of Distributed Reinforcement Learning on CPU-GPU Systems
A. Inci
Evgeny Bolotin
Yaosheng Fu
Gal Dalal
Shie Mannor
D. Nellans
Diana Marculescu
AI4CE
48
13
0
08 Dec 2020
Accelerator-aware Neural Network Design using AutoML
Suyog Gupta
Berkin Akin
84
66
0
05 Mar 2020
Co-Exploration of Neural Architectures and Heterogeneous ASIC Accelerator Designs Targeting Multiple Tasks
Lei Yang
Zheyu Yan
Meng Li
Hyoukjun Kwon
Liangzhen Lai
T. Krishna
Vikas Chandra
Weiwen Jiang
Yiyu Shi
80
116
0
10 Feb 2020
Gemmini: Enabling Systematic Deep-Learning Architecture Evaluation via Full-Stack Integration
Hasan Genç
Seah Kim
Alon Amid
Ameer Haj-Ali
Vighnesh Iyer
...
Ion Stoica
Jonathan Ragan-Kelley
Krste Asanović
B. Nikolić
Y. Shao
85
230
0
22 Nov 2019
EfficientDet: Scalable and Efficient Object Detection
Mingxing Tan
Ruoming Pang
Quoc V. Le
120
5,076
0
20 Nov 2019
Additive Powers-of-Two Quantization: An Efficient Non-uniform Discretization for Neural Networks
Yuhang Li
Xin Dong
Wei Wang
MQ
66
259
0
28 Sep 2019
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Mingxing Tan
Quoc V. Le
3DV
MedIm
172
18,224
0
28 May 2019
Towards Efficient Model Compression via Learned Global Ranking
Ting-Wu Chin
Ruizhou Ding
Cha Zhang
Diana Marculescu
79
171
0
28 Apr 2019
Single Path One-Shot Neural Architecture Search with Uniform Sampling
Zichao Guo
Xiangyu Zhang
Haoyuan Mu
Wen Heng
Zechun Liu
Yichen Wei
Jian Sun
106
941
0
31 Mar 2019
Learned Step Size Quantization
S. K. Esser
J. McKinstry
Deepika Bablani
R. Appuswamy
D. Modha
MQ
75
810
0
21 Feb 2019
Random Search and Reproducibility for Neural Architecture Search
Liam Li
Ameet Talwalkar
OOD
94
726
0
20 Feb 2019
HAQ: Hardware-Aware Automated Quantization with Mixed Precision
Kuan Wang
Zhijian Liu
Yujun Lin
Ji Lin
Song Han
MQ
129
884
0
21 Nov 2018
SCALE-Sim: Systolic CNN Accelerator Simulator
A. Samajdar
Yuhao Zhu
P. Whatmough
Matthew Mattina
Tushar Krishna
104
137
0
16 Oct 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,324
0
11 Oct 2018
Hardware-Aware Machine Learning: Modeling and Optimization
Diana Marculescu
Dimitrios Stamoulis
E. Cai
57
45
0
14 Sep 2018
Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks
Yang He
Guoliang Kang
Xuanyi Dong
Yanwei Fu
Yi Yang
AAML
VLM
74
965
0
21 Aug 2018
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
MQ
167
3,148
0
15 Dec 2017
LightNN: Filling the Gap between Conventional Deep Neural Networks and Binarized Networks
Ruizhou Ding
Z. Liu
Rongye Shi
Diana Marculescu
R. D. Blanton
MQ
56
37
0
02 Dec 2017
NeuralPower: Predict and Deploy Energy-Efficient Convolutional Neural Networks
E. Cai
Da-Cheng Juan
Dimitrios Stamoulis
Diana Marculescu
51
132
0
15 Oct 2017
SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks
A. Parashar
Minsoo Rhu
Anurag Mukkara
A. Puglielli
Rangharajan Venkatesan
Brucek Khailany
J. Emer
S. Keckler
W. Dally
81
1,131
0
23 May 2017
In-Datacenter Performance Analysis of a Tensor Processing Unit
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
...
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon
239
4,644
0
16 Apr 2017
DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
Shuchang Zhou
Yuxin Wu
Zekun Ni
Xinyu Zhou
He Wen
Yuheng Zou
MQ
135
2,090
0
20 Jun 2016
EIE: Efficient Inference Engine on Compressed Deep Neural Network
Song Han
Xingyu Liu
Huizi Mao
Jing Pu
A. Pedram
M. Horowitz
W. Dally
132
2,461
0
04 Feb 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.3K
194,641
0
10 Dec 2015
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han
Huizi Mao
W. Dally
3DGS
263
8,864
0
01 Oct 2015
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
1.7K
100,575
0
04 Sep 2014