Model compression via distillation and quantization


15 February 2018
A. Polino
Razvan Pascanu
Dan Alistarh
    MQ

Papers citing "Model compression via distillation and quantization"

50 / 171 papers shown
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques
Sanjay Surendranath Girija
Shashank Kapoor
Lakshit Arora
Dipen Pradhan
Aman Raj
Ankit Shetgaonkar
57
0
0
05 May 2025
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li
Cristiano Saltori
Fabio Poiesi
N. Sebe
270
0
0
20 Mar 2025
Towards Understanding Distilled Reasoning Models: A Representational Approach
David D. Baek
Max Tegmark
LRM
80
3
0
05 Mar 2025
Mixture of Attentions For Speculative Decoding
Matthieu Zimmer
Milan Gritta
Gerasimos Lampouras
Haitham Bou Ammar
Jun Wang
76
4
0
04 Oct 2024
InfantCryNet: A Data-driven Framework for Intelligent Analysis of Infant Cries
Mengze Hong
Chen Jason Zhang
Lingxiao Yang
Yuanfeng Song
Di Jiang
44
2
0
29 Sep 2024
Online-Score-Aided Federated Learning: Taming the Resource Constraints in Wireless Networks
Md Ferdous Pervej
Minseok Choi
A. Molisch
38
0
0
12 Aug 2024
Training Foundation Models as Data Compression: On Information, Model Weights and Copyright Law
Giorgio Franceschelli
Claudia Cevenini
Mirco Musolesi
51
0
0
18 Jul 2024
Relational Representation Distillation
Nikolaos Giakoumoglou
Tania Stathaki
45
0
0
16 Jul 2024
HDKD: Hybrid Data-Efficient Knowledge Distillation Network for Medical Image Classification
Omar S. El-Assiouti
Ghada Hamed
Dina Khattab
H. M. Ebied
52
1
0
10 Jul 2024
Timestep-Aware Correction for Quantized Diffusion Models
Yuzhe Yao
Feng Tian
Jun Chen
Haonan Lin
Guang Dai
Yong Liu
Jingdong Wang
DiffM
MQ
50
5
0
04 Jul 2024
LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing
Hongxiang Zhang
Yuyang Rong
Yifeng He
Hao Chen
38
7
0
11 Jun 2024
Adaptive quantization with mixed-precision based on low-cost proxy
Jing Chen
Qiao Yang
Senmao Tian
Shunli Zhang
MQ
35
1
0
27 Feb 2024
Fast Vocabulary Transfer for Language Model Compression
Leonidas Gee
Andrea Zugarini
Leonardo Rigutini
Paolo Torroni
37
27
0
15 Feb 2024
SwapNet: Efficient Swapping for DNN Inference on Edge AI Devices Beyond the Memory Budget
Kun Wang
Jiani Cao
Zimu Zhou
Zhenjiang Li
35
6
0
30 Jan 2024
RL-MPCA: A Reinforcement Learning Based Multi-Phase Computation Allocation Approach for Recommender Systems
Jiahong Zhou
Shunhui Mao
Guoliang Yang
Bo Tang
Qianlong Xie
Lebin Lin
Xingxing Wang
Dong Wang
37
8
0
27 Dec 2023
Pursing the Sparse Limitation of Spiking Deep Learning Structures
Hao-Ran Cheng
Jiahang Cao
Erjia Xiao
Mengshu Sun
Le Yang
Jize Zhang
Xue Lin
B. Kailkhura
Kaidi Xu
Renjing Xu
29
1
0
18 Nov 2023
RepQ: Generalizing Quantization-Aware Training for Re-Parametrized Architectures
Anastasiia Prutianova
Alexey Zaytsev
Chung-Kuei Lee
Fengyu Sun
Ivan Koryakovskiy
MQ
28
0
0
09 Nov 2023
The Road to On-board Change Detection: A Lightweight Patch-Level Change Detection Network via Exploring the Potential of Pruning and Pooling
Lihui Xue
Zhihao Wang
Xueqian Wang
Gang Li
50
1
0
16 Oct 2023
Soft Quantization using Entropic Regularization
Rajmadan Lakshmanan
Alois Pichler
MQ
13
5
0
08 Sep 2023
eDKM: An Efficient and Accurate Train-time Weight Clustering for Large Language Models
Minsik Cho
Keivan Alizadeh Vahid
Qichen Fu
Saurabh N. Adya
C. C. D. Mundo
Mohammad Rastegari
Devang Naik
Peter Zatloukal
MQ
31
6
0
02 Sep 2023
Quantized Feature Distillation for Network Quantization
Kevin Zhu
Yin He
Jianxin Wu
MQ
31
9
0
20 Jul 2023
Self-Distilled Quantization: Achieving High Compression Rates in Transformer-Based Language Models
James O'Neill
Sourav Dutta
VLM
MQ
47
1
0
12 Jul 2023
InfLoR-SNN: Reducing Information Loss for Spiking Neural Networks
Yu-Zhu Guo
Y. Chen
Liwen Zhang
Xiaode Liu
Xinyi Tong
Yuanyuan Ou
Xuhui Huang
Zhe Ma
AAML
46
3
0
10 Jul 2023
A Review on Explainable Artificial Intelligence for Healthcare: Why, How, and When?
M. Rubaiyat Hossain Mondal
Prajoy Podder
31
57
0
10 Apr 2023
Performance-aware Approximation of Global Channel Pruning for Multitask CNNs
Hancheng Ye
Bo Zhang
Tao Chen
Jiayuan Fan
Bin Wang
42
18
0
21 Mar 2023
Students Parrot Their Teachers: Membership Inference on Model Distillation
Matthew Jagielski
Milad Nasr
Christopher A. Choquette-Choo
Katherine Lee
Nicholas Carlini
FedML
46
21
0
06 Mar 2023
LightTS: Lightweight Time Series Classification with Adaptive Ensemble Distillation -- Extended Version
David Campos
Miao Zhang
B. Yang
Tung Kieu
Chenjuan Guo
Christian S. Jensen
AI4TS
47
47
0
24 Feb 2023
Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision
Eugene Kharitonov
Damien Vincent
Zalan Borsos
Raphaël Marinier
Sertan Girgin
Olivier Pietquin
Matthew Sharifi
Marco Tagliasacchi
Neil Zeghidour
24
193
0
07 Feb 2023
RedBit: An End-to-End Flexible Framework for Evaluating the Accuracy of Quantized CNNs
A. M. Ribeiro-dos-Santos
João Dinis Ferreira
O. Mutlu
G. Falcão
MQ
23
1
0
15 Jan 2023
Pruning Compact ConvNets for Efficient Inference
Sayan Ghosh
Karthik Prasad
Xiaoliang Dai
Peizhao Zhang
Bichen Wu
Graham Cormode
Peter Vajda
VLM
34
4
0
11 Jan 2023
Systems for Parallel and Distributed Large-Model Deep Learning Training
Kabir Nagrecha
GNN
VLM
MoE
36
7
0
06 Jan 2023
PD-Quant: Post-Training Quantization based on Prediction Difference Metric
Jiawei Liu
Lin Niu
Zhihang Yuan
Dawei Yang
Xinggang Wang
Wenyu Liu
MQ
103
70
0
14 Dec 2022
CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification
Lirui Xiao
Huanrui Yang
Zhen Dong
Kurt Keutzer
Li Du
Shanghang Zhang
MQ
29
10
0
06 Dec 2022
QFT: Post-training quantization via fast joint finetuning of all degrees of freedom
Alexander Finkelstein
Ella Fuchs
Idan Tal
Mark Grobman
Niv Vosco
Eldad Meller
MQ
41
6
0
05 Dec 2022
NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers
Yijiang Liu
Huanrui Yang
Zhen Dong
Kurt Keutzer
Li Du
Shanghang Zhang
MQ
33
47
0
29 Nov 2022
AskewSGD : An Annealed interval-constrained Optimisation method to train Quantized Neural Networks
Louis Leconte
S. Schechtman
Eric Moulines
34
4
0
07 Nov 2022
Collaborative Multi-Teacher Knowledge Distillation for Learning Low Bit-width Deep Neural Networks
Cuong Pham
Tuan Hoang
Thanh-Toan Do
FedML
MQ
45
14
0
27 Oct 2022
Fast and Low-Memory Deep Neural Networks Using Binary Matrix Factorization
Alireza Bordbar
M. Kahaei
MQ
33
0
0
24 Oct 2022
Towards Global Neural Network Abstractions with Locally-Exact Reconstruction
Edoardo Manino
I. Bessa
Lucas C. Cordeiro
26
1
0
21 Oct 2022
HQNAS: Auto CNN deployment framework for joint quantization and architecture search
Hongjiang Chen
Yang Wang
Leibo Liu
Shaojun Wei
Shouyi Yin
MQ
16
2
0
16 Oct 2022
Meta-Ensemble Parameter Learning
Zhengcong Fei
Shuman Tian
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
OOD
46
2
0
05 Oct 2022
Improving the Performance of DNN-based Software Services using Automated Layer Caching
M. Abedi
Yanni Iouannou
Pooyan Jamshidi
Hadi Hemmati
28
0
0
18 Sep 2022
Mixed-Precision Neural Networks: A Survey
M. Rakka
M. Fouda
Pramod P. Khargonekar
Fadi J. Kurdahi
MQ
30
11
0
11 Aug 2022
Distributed Training for Deep Learning Models On An Edge Computing Network Using Shielded Reinforcement Learning
Tanmoy Sen
Haiying Shen
OffRL
15
5
0
01 Jun 2022
Dataset Distillation using Neural Feature Regression
Yongchao Zhou
E. Nezhadarya
Jimmy Ba
DD
FedML
58
153
0
01 Jun 2022
Target Aware Network Architecture Search and Compression for Efficient Knowledge Transfer
S. H. Shabbeer Basha
Debapriya Tula
Sravan Kumar Vinakota
S. Dubey
31
3
0
12 May 2022
Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures
Yongji Wu
Matthew Lentz
Danyang Zhuo
Yao Lu
34
22
0
10 May 2022
Compact Model Training by Low-Rank Projection with Energy Transfer
K. Guo
Zhenquan Lin
Xiaofen Xing
Fang Liu
Xiangmin Xu
40
2
0
12 Apr 2022
EfficientFi: Towards Large-Scale Lightweight WiFi Sensing via CSI Compression
Jianfei Yang
Xinyan Chen
Han Zou
Dazhuo Wang
Q. Xu
Lihua Xie
19
78
0
08 Apr 2022
Bimodal Distributed Binarized Neural Networks
T. Rozen
Moshe Kimhi
Brian Chmiel
A. Mendelson
Chaim Baskin
MQ
72
4
0
05 Apr 2022