ResearchTrend.AI
Scalable Methods for 8-bit Training of Neural Networks

25 May 2018
Ron Banner, Itay Hubara, Elad Hoffer, Daniel Soudry

Papers citing "Scalable Methods for 8-bit Training of Neural Networks"

50 / 168 papers shown
Silenzio: Secure Non-Interactive Outsourced MLP Training
Jonas Sander, T. Eisenbarth
24 Apr 2025
Binarized Mamba-Transformer for Lightweight Quad Bayer HybridEVS Demosaicing
Shiyang Zhou, Haijin Zeng, Yunfan Lu, Tong Shao, Ke Tang, Yongyong Chen, Jie Liu, Jingyong Su
20 Mar 2025
Accurate INT8 Training Through Dynamic Block-Level Fallback
Pengle Zhang, Jia Wei, Jintao Zhang, Jun-Jie Zhu, Jianfei Chen
13 Mar 2025
GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning
Sifan Zhou, Shuo Wang, Zhihang Yuan, Mingjia Shi, Yuzhang Shang, Dawei Yang
18 Feb 2025
QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models
Jiajun Zhou, Yifan Yang, Kai Zhen, Zhengwu Liu, Yequan Zhao, Ershad Banijamali, Athanasios Mouchtaris, Ngai Wong, Zheng Zhang
17 Feb 2025
Membership Inference Risks in Quantized Models: A Theoretical and Empirical Study
Eric Aubinais, Philippe Formont, Pablo Piantanida, Elisabeth Gassiat
10 Feb 2025
Optimizing Large Language Model Training Using FP4 Quantization
Ruizhe Wang, Yeyun Gong, Xiao Liu, Guoshuai Zhao, Ziyue Yang, Baining Guo, Zhengjun Zha, Peng Cheng
28 Jan 2025
HyperCam: Low-Power Onboard Computer Vision for IoT Cameras
Chae Young Lee, Maxwell Fite, Tejus Rao, Sara Achour, Zerina Kapetanovic
17 Jan 2025
Rethinking Post-Training Quantization: Introducing a Statistical Pre-Calibration Approach
Alireza Ghaffari, Sharareh Younesian, Boxing Chen, Vahid Partovi Nia, M. Asgharian
17 Jan 2025
Fast and Slow Gradient Approximation for Binary Neural Network Optimization
Xinquan Chen, Junqi Gao, Biqing Qi, Dong Li, Yiang Luo, Fangyuan Li, Pengfei Li
16 Dec 2024
Towards Accurate and Efficient Sub-8-Bit Integer Training
Wenjin Guo, Donglai Liu, Weiying Xie, Yunsong Li, Xuefei Ning, Zihan Meng, Shulin Zeng, Jie Lei, Zhenman Fang, Yu Wang
17 Nov 2024
Lossless KV Cache Compression to 2%
Zhen Yang, Jizong Han, Kan Wu, Ruobing Xie, An Wang, Xingchen Sun, Zhanhui Kang
20 Oct 2024
Error Diffusion: Post Training Quantization with Block-Scaled Number Formats for Neural Networks
Alireza Khodamoradi, K. Denolf, Eric Dellinger
15 Oct 2024
Differentiable Weightless Neural Networks
Alan T. L. Bacellar, Zachary Susskind, Mauricio Breternitz Jr., E. John, L. John, P. Lima, F. M. G. França
14 Oct 2024
Compressing VAE-Based Out-of-Distribution Detectors for Embedded Deployment
Aditya Bansal, Michael Yuhas, Arvind Easwaran
02 Sep 2024
1-Bit FQT: Pushing the Limit of Fully Quantized Training to 1-bit
Chang Gao, Jianfei Chen, Kang Zhao, Jiaqi Wang, Liping Jing
26 Aug 2024
Robust Iterative Value Conversion: Deep Reinforcement Learning for Neurochip-driven Edge Robots
Y. Kadokawa, Tomohito Kodera, Yoshihisa Tsurumine, Shinya Nishimura, Takamitsu Matsubara
23 Aug 2024
Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners
Yifei Gao, Jie Ou, Lei Wang, Fanhua Shang, Jaji Wu
22 Jul 2024
NITRO-D: Native Integer-only Training of Deep Convolutional Neural Networks
Alberto Pirillo, Luca Colombo, Manuel Roveri
16 Jul 2024
VcLLM: Video Codecs are Secretly Tensor Codecs
Ceyu Xu, Yongji Wu, Xinyu Yang, Beidi Chen, Matthew Lentz, Danyang Zhuo, Lisa Wu Wills
29 Jun 2024
Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other
Yifei Gao, Jie Ou, Lei Wang, Yuting Xiao, Zhiyuan Xiang, Ruiting Dai, Jun Cheng
24 Jun 2024
Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation
Yuchen Yang, Yingdong Shi, Cheems Wang, Xiantong Zhen, Yuxuan Shi, Jun Xu
24 Jun 2024
NaviSplit: Dynamic Multi-Branch Split DNNs for Efficient Distributed Autonomous Navigation
Timothy K Johnsen, Ian Harshbarger, Zixia Xia, Marco Levorato
18 Jun 2024
Center-Sensitive Kernel Optimization for Efficient On-Device Incremental Learning
Dingwen Zhang, Yan Li, De Cheng, N. Wang, J. Han
13 Jun 2024
LoQT: Low Rank Adapters for Quantized Training
Sebastian Loeschcke, M. Toftrup, M. Kastoryano, Serge Belongie, Vésteinn Snæbjarnarson
26 May 2024
AdpQ: A Zero-shot Calibration Free Adaptive Post Training Quantization Method for LLMs
Alireza Ghaffari, Sharareh Younesian, Vahid Partovi Nia, Boxing Chen, M. Asgharian
22 May 2024
Acceleration Algorithms in GNNs: A Survey
Lu Ma, Zeang Sheng, Xunkai Li, Xin Gao, Zhezheng Hao, Ling Yang, Wentao Zhang, Bin Cui
07 May 2024
Collage: Light-Weight Low-Precision Strategy for LLM Training
Tao Yu, Gaurav Gupta, Karthick Gopalswamy, Amith R. Mamidala, Hao Zhou, Jeffrey Huynh, Youngsuk Park, Ron Diamant, Anoop Deoras, Jun Huan
06 May 2024
Communication-Efficient Large-Scale Distributed Deep Learning: A Comprehensive Survey
Feng Liang, Zhen Zhang, Haifeng Lu, Victor C. M. Leung, Yanyi Guo, Xiping Hu
09 Apr 2024
Lightweight Deep Learning for Resource-Constrained Environments: A Survey
Hou-I Liu, Marco Galindo, Hongxia Xie, Lai-Kuan Wong, Hong-Han Shuai, Yung-Hui Li, Wen-Huang Cheng
08 Apr 2024
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
Haocheng Xi, Yuxiang Chen, Kang Zhao, Kaijun Zheng, Jianfei Chen, Jun Zhu
19 Mar 2024
Better Schedules for Low Precision Training of Deep Neural Networks
Cameron R. Wolfe, Anastasios Kyrillidis
04 Mar 2024
Trainable Fixed-Point Quantization for Deep Learning Acceleration on FPGAs
Dingyi Dai, Yichi Zhang, Jiahao Zhang, Zhanqiu Hu, Yaohui Cai, Qi Sun, Zhiru Zhang
31 Jan 2024
Effect of Weight Quantization on Learning Models by Typical Case Analysis
Shuhei Kashiwamura, Ayaka Sakata, Masaaki Imaizumi
30 Jan 2024
Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators
Yaniv Blumenfeld, Itay Hubara, Daniel Soudry
25 Jan 2024
Enabling On-device Continual Learning with Binary Neural Networks
Lorenzo Vorabbi, Davide Maltoni, Guido Borghi, Stefano Santi
18 Jan 2024
Knowledge Translation: A New Pathway for Model Compression
Wujie Sun, Defang Chen, Jiawei Chen, Yan Feng, Chun-Yen Chen, Can Wang
11 Jan 2024
FP8-BERT: Post-Training Quantization for Transformer
Jianwei Li, Tianchi Zhang, Ian En-Hsu Yen, Dongkuan Xu
10 Dec 2023
Low-Precision Mixed-Computation Models for Inference on Edge
Seyedarmin Azizi, M. Nazemi, M. Kamal, Massoud Pedram
03 Dec 2023
Improving the Robustness of Quantized Deep Neural Networks to White-Box Attacks using Stochastic Quantization and Information-Theoretic Ensemble Training
Saurabh Farkya, Aswin Raghavan, Avi Ziskind
30 Nov 2023
Mirage: An RNS-Based Photonic Accelerator for DNN Training
Cansu Demirkıran, Guowei Yang, D. Bunandar, Ajay Joshi
29 Nov 2023
PolyThrottle: Energy-efficient Neural Network Inference on Edge Devices
Minghao Yan, Hongyi Wang, Shivaram Venkataraman
30 Oct 2023
Efficient Low-rank Backpropagation for Vision Transformer Adaptation
Yuedong Yang, Hung-Yueh Chiang, Guihong Li, Diana Marculescu, R. Marculescu
26 Sep 2023
EPTQ: Enhanced Post-Training Quantization via Label-Free Hessian
Ofir Gordon, H. Habi, Arnon Netzer
20 Sep 2023
On-Device Learning with Binary Neural Networks
Lorenzo Vorabbi, Davide Maltoni, Stefano Santi
29 Aug 2023
Low-bit Quantization for Deep Graph Neural Networks with Smoothness-aware Message Propagation
Shuang Wang, B. Eravcı, Rustam Guliyev, Hakan Ferhatosmanoglu
29 Aug 2023
ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats
Xiaoxia Wu, Z. Yao, Yuxiong He
19 Jul 2023
Self-Distilled Quantization: Achieving High Compression Rates in Transformer-Based Language Models
James O'Neill, Sourav Dutta
12 Jul 2023
QBitOpt: Fast and Accurate Bitwidth Reallocation during Training
Jorn W. T. Peters, Marios Fournarakis, Markus Nagel, M. V. Baalen, Tijmen Blankevoort
10 Jul 2023
Adaptive Sharpness-Aware Pruning for Robust Sparse Networks
Anna Bair, Hongxu Yin, Maying Shen, Pavlo Molchanov, J. Álvarez
25 Jun 2023