
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

15 December 2017
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
    MQ
arXiv:1712.05877 (abs) · PDF · HTML
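
For context, the cited paper's central idea can be stated in one equation: each real-valued tensor is represented by an integer tensor plus a scale and a zero-point, r ≈ S · (q − Z), so that inference can be carried out with integer-only arithmetic. The short Python sketch below illustrates that affine quantize/dequantize mapping; the function names, the unsigned 8-bit range, and the min/max calibration are illustrative assumptions, not code taken from the paper.

import numpy as np

def quantize(r, scale, zero_point, qmin=0, qmax=255):
    # Affine quantization: q = clamp(round(r / scale) + zero_point, qmin, qmax)
    q = np.round(r / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(np.uint8)

def dequantize(q, scale, zero_point):
    # Approximate reconstruction: r ≈ scale * (q - zero_point)
    return scale * (q.astype(np.int32) - zero_point)

# Illustrative calibration from an observed value range that must contain 0.
r = np.array([-1.0, -0.25, 0.0, 0.5, 1.5], dtype=np.float32)
rmin, rmax = min(float(r.min()), 0.0), max(float(r.max()), 0.0)
scale = (rmax - rmin) / 255.0
zero_point = int(round(-rmin / scale))
q = quantize(r, scale, zero_point)
print(q, dequantize(q, scale, zero_point))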

Papers citing "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"

Showing 50 of 1,298 citing papers.
RCNet: ΔΣ IADCs as Recurrent AutoEncoders
Arnaud Verdant
William Guicquero
Jérôme Chossat
15
0
0
20 Jun 2025
Efficient and Privacy-Preserving Soft Prompt Transfer for LLMs
Xun Wang
Jing Xu
Franziska Boenisch
Michael Backes
Christopher A. Choquette-Choo
Adam Dziedzic
AAML
36
0
0
19 Jun 2025
Quantizing Small-Scale State-Space Models for Edge AI
Leo Zhao
Tristan Torchet
Melika Payvand
Laura Kriener
Filippo Moro
MQ
23
0
0
14 Jun 2025
Compression Aware Certified Training
Changming Xu
Gagandeep Singh
23
0
0
13 Jun 2025
The Effect of Stochasticity in Score-Based Diffusion Sampling: a KL Divergence Analysis
Bernardo P. Schaeffer
Ricardo M. S. Rosa
Glauco Valle
DiffM
15
0
0
13 Jun 2025
Starting Positions Matter: A Study on Better Weight Initialization for Neural Network Quantization
S. Yun
A. Wong
MQ
104
0
0
12 Jun 2025
Robust Noise Attenuation via Adaptive Pooling of Transformer Outputs
Greyson Brothers
ViT
27
0
0
10 Jun 2025
Real-Time Execution of Action Chunking Flow Policies
Kevin Black
Manuel Y. Galliker
Sergey Levine
OffRL
25
0
0
09 Jun 2025
EdgeProfiler: A Fast Profiling Framework for Lightweight LLMs on Edge Using Analytical Model
Alyssa Pinnock
Shakya Jayakody
Kawsher A Roxy
Md Rubel Ahmed
36
0
0
06 Jun 2025
FPTQuant: Function-Preserving Transforms for LLM Quantization
Boris van Breugel
Yelysei Bondarenko
Paul N. Whatmough
Markus Nagel
MQ
97
0
0
05 Jun 2025
EfficientQuant: An Efficient Post-Training Quantization for CNN-Transformer Hybrid Models on Edge Devices
Shaibal Saha
Lanyu Xu
MQ
13
0
0
05 Jun 2025
FPGA-Enabled Machine Learning Applications in Earth Observation: A Systematic Review
Cédric Léonard
Dirk Stober
Martin Schulz
103
0
0
04 Jun 2025
BitTTS: Highly Compact Text-to-Speech Using 1.58-bit Quantization and Weight Indexing
Masaya Kawamura
Takuya Hasumi
Yuma Shirahata
Ryuichi Yamamoto
MQ
49
0
0
04 Jun 2025
Memory-Efficient FastText: A Comprehensive Approach Using Double-Array Trie Structures and Mark-Compact Memory Management
Yimin Du
VLM
61
0
0
02 Jun 2025
Assigning Distinct Roles to Quantized and Low-Rank Matrices Toward Optimal Weight Decomposition
Yoonjun Cho
Soeun Kim
Dongjae Jeon
Kyelim Lee
Beomsoo Lee
Albert No
MQ
36
0
0
02 Jun 2025
QuantFace: Low-Bit Post-Training Quantization for One-Step Diffusion Face Restoration
Jiatong Li
Libo Zhu
Haotong Qin
Jingkai Wang
Linghe Kong
Guihai Chen
Yulun Zhang
Xiaokang Yang
DiffM, MQ
50
0
0
01 Jun 2025
Chameleon: A MatMul-Free Temporal Convolutional Network Accelerator for End-to-End Few-Shot and Continual Learning from Sequential Data
Douwe den Blanken
Charlotte Frenkel
39
0
0
30 May 2025
LittleBit: Ultra Low-Bit Quantization via Latent Factorization
Banseok Lee
Dongkyu Kim
Youngcheon You
Youngmin Kim
MQ
23
0
0
30 May 2025
Edge Computing for Physics-Driven AI in Computational MRI: A Feasibility Study
Yasar Utku Alçalar
Yu Cao
Mehmet Akçakaya
27
0
0
30 May 2025
LPASS: Linear Probes as Stepping Stones for vulnerability detection using compressed LLMs
Luis Ibanez-Lissen
Lorena Gonzalez-Manzano
José Maria De Fuentes
Nicolas Anciaux
30
0
0
30 May 2025
Compressing Sine-Activated Low-Rank Adapters through Post-Training Quantization
Cameron Gordon
Yiping Ji
Hemanth Saratchandran
Paul Albert
Simon Lucey
MQ
63
0
0
28 May 2025
Progressive Data Dropout: An Embarrassingly Simple Approach to Faster Training
S. Srinivasan
Xinyue Hao
Shihao Hou
Yang Lu
Laura Sevilla-Lara
Anurag Arnab
Shreyank N Gowda
66
0
0
28 May 2025
Pioneering 4-Bit FP Quantization for Diffusion Models: Mixup-Sign Quantization and Timestep-Aware Fine-Tuning
Maosen Zhao
Pengtao Chen
Chong Yu
Yan Wen
Xudong Tan
Tao Chen
MQ
44
1
0
27 May 2025
QwT-v2: Practical, Effective and Efficient Post-Training Quantization
Ningyuan Tang
Minghao Fu
Hao Yu
Jianxin Wu
MQ
89
0
0
27 May 2025
CA3D: Convolutional-Attentional 3D Nets for Efficient Video Activity Recognition on the Edge
Gabriele Lagani
Fabrizio Falchi
Claudio Gennaro
Giuseppe Amato
25
0
0
26 May 2025
DVD-Quant: Data-free Video Diffusion Transformers Quantization
Zhiteng Li
Hanxuan Li
Junyi Wu
Kai Liu
Linghe Kong
Guihai Chen
Yulun Zhang
Xiaokang Yang
MQ, VGen
72
0
0
24 May 2025
ALPS: Attention Localization and Pruning Strategy for Efficient Alignment of Large Language Models
Hao Chen
Haoze Li
Zhiqing Xiao
Lirong Gao
Qi Zhang
Xiaomeng Hu
Ningtao Wang
Xing Fu
Junbo Zhao
206
0
0
24 May 2025
CIM-NET: A Video Denoising Deep Neural Network Model Optimized for Computing-in-Memory Architectures
Shan Gao
Zhiqiang Wu
Yawen Niu
Xiaotao Li
Qingqing Xu
21
0
0
23 May 2025
Extending Dataset Pruning to Object Detection: A Variance-based Approach
Ryota Yagi
VLM
58
0
0
22 May 2025
Saliency-Aware Quantized Imitation Learning for Efficient Robotic Control
Seongmin Park
Hyungmin Kim
Sangwoo Kim
Wonseok Jeon
Juyoung Yang
Byeongwook Jeon
Yoonseon Oh
Jungwook Choi
192
0
0
21 May 2025
InTreeger: An End-to-End Framework for Integer-Only Decision Tree Inference
Duncan Bart
Bruno Endres Forlin
Ana-Lucia Varbanescu
Marco Ottavi
Kuan-Hsun Chen
MQ
44
0
0
21 May 2025
Guarded Query Routing for Large Language Models
Richard Šléher
William Brach
Tibor Sloboda
Kristián Košťál
Lukas Galke
RALM
72
0
0
20 May 2025
Rank-K: Test-Time Reasoning for Listwise Reranking
Eugene Yang
Andrew Yates
Kathryn Ricci
Orion Weller
Vivek Chari
Benjamin Van Durme
Dawn J Lawrie
LRM
69
2
0
20 May 2025
Quaff: Quantized Parameter-Efficient Fine-Tuning under Outlier Spatial Stability Hypothesis
Hong Huang
Dapeng Wu
112
0
0
20 May 2025
Automatic mixed precision for optimizing gained time with constrained loss mean-squared-error based on model partition to sequential sub-graphs
Shmulik Markovich-Golan
Daniel Ohayon
Itay Niv
Yair Hanani
MQ
136
0
0
19 May 2025
Bridging Quantized Artificial Neural Networks and Neuromorphic Hardware
Zhenhui Chen
Haoran Xu
De Ma
Xiaofei Jin
Xinyu Li
Ziyang Kang
Gang Pan
127
0
0
18 May 2025
PMQ-VE: Progressive Multi-Frame Quantization for Video Enhancement
ZhanFeng Feng
Long Peng
Xin Di
Yong Guo
Wenbo Li
Yulun Zhang
Renjing Pei
Yang Wang
Yang Cao
Zheng-Jun Zha
MQ
146
0
0
18 May 2025
Accurate KV Cache Quantization with Outlier Tokens Tracing
Yi Su
Yuechi Zhou
Quantong Qiu
Jilong Li
Qingrong Xia
Ping Li
Xinyu Duan
Zhefeng Wang
Min Zhang
MQ
79
1
0
16 May 2025
Addition is almost all you need: Compressing neural networks with double binary factorization
Vladimír Boža
Vladimír Macko
MQ
144
0
0
16 May 2025
Efficient Mixed Precision Quantization in Graph Neural Networks
Samir Moustafa
Nils M. Kriege
Wilfried Gansterer
GNN, MQ
73
0
0
14 May 2025
Resource-Efficient Language Models: Quantization for Fast and Accessible Inference
Tollef Emil Jørgensen
MQ
97
0
0
13 May 2025
Private LoRA Fine-tuning of Open-Source LLMs with Homomorphic Encryption
Jordan Fréry
Roman Bredehoft
Jakub Klemsa
Arthur Meyre
Andrei Stoian
51
0
0
12 May 2025
Turning LLM Activations Quantization-Friendly
Patrik Czakó
Gábor Kertész
Sándor Szénási
MQ
22
0
0
11 May 2025
Sigma-Delta Neural Network Conversion on Loihi 2
Matthew Brehove
Sadia Anjum Tumpa
Espoir Kyubwa
Naresh Menon
Vijaykrishnan Narayanan
45
0
0
09 May 2025
Mix-QSAM: Mixed-Precision Quantization of the Segment Anything Model
Navin Ranjan
Andreas E. Savakis
MQ, VLM
145
0
0
08 May 2025
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques
Sanjay Surendranath Girija
Shashank Kapoor
Lakshit Arora
Dipen Pradhan
Aman Raj
Ankit Shetgaonkar
160
0
0
05 May 2025
Radio: Rate-Distortion Optimization for Large Language Model Compression
Sean I. Young
MQ
65
0
0
05 May 2025
Enhancing AI Face Realism: Cost-Efficient Quality Improvement in Distilled Diffusion Models with a Fully Synthetic Dataset
Jakub Wąsala
Bartłomiej Wrzalski
Kornelia Noculak
Yuliia Tarasenko
Oliwer Krupa
Jan Kocoń
Grzegorz Chodak
114
0
0
04 May 2025
RWKVQuant: Quantizing the RWKV Family with Proxy Guided Hybrid of Scalar and Vector Quantization
Chen Xu
Yuxuan Yue
Zukang Xu
Xing Hu
Jiangyong Yu
Zhixuan Chen
Sifan Zhou
Zhihang Yuan
Dawei Yang
MQ
68
0
0
02 May 2025
Pack-PTQ: Advancing Post-training Quantization of Neural Networks by Pack-wise Reconstruction
Changjun Li
Runqing Jiang
Zhuo Song
Pengpeng Yu
Ye Zhang
Yulan Guo
MQ
154
0
0
01 May 2025