ResearchTrend.AI
Scalify: scale propagation for efficient low-precision LLM training

24 July 2024
Paul Balança
Sam Hosegood
Carlo Luschi
Andrew Fitzgibbon

Papers citing "Scalify: scale propagation for efficient low-precision LLM training"

10 / 10 papers shown
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
Xiaoxia Wu
Haojun Xia
Stephen Youn
Zhen Zheng
Shiyang Chen
...
Reza Yazdani Aminabadi
Yuxiong He
Olatunji Ruwase
Leon Song
Zhewei Yao
111
10
0
14 Dec 2023
FP8-LM: Training FP8 Large Language Models
Houwen Peng
Kan Wu
Yixuan Wei
Guoshuai Zhao
Yuxiang Yang
...
Zheng Zhang
Shuguang Liu
Joe Chau
Han Hu
Peng Cheng
MQ
91
43
0
27 Oct 2023
Microscaling Data Formats for Deep Learning
B. Rouhani
Ritchie Zhao
Ankit More
Mathew Hall
Alireza Khodamoradi
...
Maxim Naumov
Colin Verilli
Ralph Wittig
Doug Burger
Eric S. Chung
MQ
84
63
0
16 Oct 2023
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Elias Frantar
Saleh Ashkboos
Torsten Hoefler
Dan Alistarh
MQ
129
989
0
31 Oct 2022
FP8 Formats for Deep Learning
Paulius Micikevicius
Dusan Stosic
N. Burgess
Marius Cornea
Pradeep Dubey
...
Naveen Mellempudi
S. Oberman
Mohammad Shoeybi
Michael Siu
Hao Wu
BDL, VLM, MQ
129
137
0
12 Sep 2022
FP8 Quantization: The Power of the Exponent
Andrey Kuzmin
M. V. Baalen
Yuwei Ren
Markus Nagel
Jorn W. T. Peters
Tijmen Blankevoort
MQ
64
85
0
19 Aug 2022
8-bit Optimizers via Block-wise Quantization
Tim Dettmers
M. Lewis
Sam Shleifer
Luke Zettlemoyer
MQ
117
298
0
06 Oct 2021
A Study of BFLOAT16 for Deep Learning Training
Dhiraj D. Kalamkar
Dheevatsa Mudigere
Naveen Mellempudi
Dipankar Das
K. Banerjee
...
Sudarshan Srinivasan
Abhisek Kundu
M. Smelyanskiy
Bharat Kaul
Pradeep Dubey
MQ
83
346
0
29 May 2019
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
328
2,876
0
26 Sep 2016
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
Martín Abadi
Ashish Agarwal
P. Barham
E. Brevdo
Zhiwen Chen
...
Pete Warden
Martin Wattenberg
Martin Wicke
Yuan Yu
Xiaoqiang Zheng
276
11,151
0
14 Mar 2016