ResearchTrend.AI
8-bit Optimizers via Block-wise Quantization

6 October 2021
Tim Dettmers
M. Lewis
Sam Shleifer
Luke Zettlemoyer
    MQ

Papers citing "8-bit Optimizers via Block-wise Quantization"

50 / 203 papers shown
Parameter-Efficient Fine-Tuning via Selective Discrete Cosine Transform
Yixian Shen
Qi Bi
Jia-Hong Huang
Hongyi Zhu
Anuj Pathania
31
1
0
09 Oct 2024
Margin Matching Preference Optimization: Enhanced Model Alignment with Granular Feedback
Kyuyoung Kim
Ah Jeong Seo
Hao Liu
Jinwoo Shin
Kimin Lee
22
2
0
04 Oct 2024
SOAP: Improving and Stabilizing Shampoo using Adam
Nikhil Vyas
Depen Morwani
Rosie Zhao
Itai Shapira
David Brandfonbrener
Lucas Janson
Sham Kakade
72
23
0
17 Sep 2024
Stable Language Model Pre-training by Reducing Embedding Variability
Woojin Chung
Jiwoo Hong
Na Min An
James Thorne
Se-Young Yun
30
2
0
12 Sep 2024
SpeechTaxi: On Multilingual Semantic Speech Classification
Lennart Keller
Goran Glavaš
26
0
0
10 Sep 2024
DriveScape: Towards High-Resolution Controllable Multi-View Driving Video Generation
Wei Yu Wu
Xi Guo
Weixuan Tang
Tingxuan Huang
Chiyu Wang
Dongyue Chen
C. Ding
VGen
32
6
0
09 Sep 2024
Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques
Davide Clode da Silva
Marina Musse Bernardes
Nathalia Giacomini Ceretta
Gabriel Vaz de Souza
Gabriel Fonseca Silva
Rafael Heitor Bordini
S. Musse
MedIm
LM&MA
31
0
0
06 Sep 2024
Masked Mixers for Language Generation and Retrieval
Benjamin L. Badger
47
0
0
02 Sep 2024
Predicting the Target Word of Game-playing Conversations using a Low-Rank Dialect Adapter for Decoder Models
Dipankar Srirag
Aditya Joshi
Jacob Eisenstein
46
1
0
31 Aug 2024
Memory-Efficient LLM Training with Online Subspace Descent
Kaizhao Liang
Bo Liu
Lizhang Chen
Qiang Liu
29
7
0
23 Aug 2024
Demystifying the Communication Characteristics for Distributed Transformer Models
Quentin G. Anthony
Benjamin Michalowicz
Jacob Hatef
Lang Xu
Mustafa Abduljabbar
Hari Subramoni
D. Panda
AI4CE
36
2
0
19 Aug 2024
MGH Radiology Llama: A Llama 3 70B Model for Radiology
Yucheng Shi
Peng Shu
Zhengliang Liu
Zihao Wu
Quanzheng Li
Xiang Li
LM&MA
22
0
0
13 Aug 2024
Large Language Models for Anomaly Detection in Computational Workflows: from Supervised Fine-Tuning to In-Context Learning
Hongwei Jin
George Papadimitriou
Krishnan Raghavan
Pawel Zuk
Prasanna Balaprakash
Cong Wang
A. Mandal
Ewa Deelman
38
1
0
24 Jul 2024
Scalify: scale propagation for efficient low-precision LLM training
Paul Balança
Sam Hosegood
Carlo Luschi
Andrew Fitzgibbon
26
2
0
24 Jul 2024
Exploring Quantization for Efficient Pre-Training of Transformer Language Models
Kamran Chitsaz
Quentin Fournier
Gonçalo Mordido
Sarath Chandar
MQ
49
3
0
16 Jul 2024
Boosting Zero-Shot Crosslingual Performance using LLM-Based Augmentations with Effective Data Selection
Barah Fazili
Ashish Agrawal
P. Jyothi
37
1
0
15 Jul 2024
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients
Zhenyu (Allen) Zhang
Ajay Jaiswal
L. Yin
Shiwei Liu
Jiawei Zhao
Yuandong Tian
Zhangyang Wang
VLM
33
16
0
11 Jul 2024
LLMBox: A Comprehensive Library for Large Language Models
Tianyi Tang
Yiwen Hu
Bingqian Li
Wenyang Luo
Zijing Qin
...
Chunxuan Xia
Junyi Li
Kun Zhou
Wayne Xin Zhao
Ji-Rong Wen
31
1
0
08 Jul 2024
Automated Text Scoring in the Age of Generative AI for the GPU-poor
C. Ormerod
Alexander Kwako
46
2
0
02 Jul 2024
Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients
Aashiq Muhamed
Oscar Li
David Woodruff
Mona Diab
Virginia Smith
55
7
0
25 Jun 2024
Adam-mini: Use Fewer Learning Rates To Gain More
Yushun Zhang
Congliang Chen
Ziniu Li
Tian Ding
Chenwei Wu
Yinyu Ye
Zhi-Quan Luo
Ruoyu Sun
36
37
0
24 Jun 2024
Self-Distillation for Model Stacking Unlocks Cross-Lingual NLU in 200+ Languages
Fabian David Schmidt
Philipp Borchert
Ivan Vulić
Goran Glavaš
42
5
0
18 Jun 2024
GNOME: Generating Negotiations through Open-Domain Mapping of Exchanges
Darshan Deshpande
Shambhavi Sinha
Anirudh Ravi Kumar
Debaditya Pal
Jonathan May
AI4CE
53
0
0
16 Jun 2024
H-Fac: Memory-Efficient Optimization with Factorized Hamiltonian Descent
Son Nguyen
Lizhang Chen
Bo Liu
Qiang Liu
30
3
0
14 Jun 2024
RVT-2: Learning Precise Manipulation from Few Demonstrations
Ankit Goyal
Valts Blukis
Jie Xu
Yijie Guo
Yu-Wei Chao
Dieter Fox
35
38
0
12 Jun 2024
Scaling up masked audio encoder learning for general audio classification
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
Bin Wang
50
3
0
11 Jun 2024
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
Jiwoo Hong
Sayak Paul
Noah Lee
Kashif Rasul
James Thorne
Jongheon Jeong
43
13
0
10 Jun 2024
Navigating Efficiency in MobileViT through Gaussian Process on Global Architecture Factors
Ke Meng
Kai Chen
37
0
0
07 Jun 2024
Error-preserving Automatic Speech Recognition of Young English Learners' Language
Janick Michot
Manuela Hurlimann
Jan Deriu
Luzia Sauer
Katsiaryna Mlynchyk
Mark Cieliebak
26
2
0
05 Jun 2024
SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining
Andi Han
Jiaxiang Li
Wei Huang
Mingyi Hong
Akiko Takeda
Pratik Jawanpuria
Bamdev Mishra
41
10
0
04 Jun 2024
Diffusion-based Image Generation for In-distribution Data Augmentation in Surface Defect Detection
Luigi Capogrosso
Federico Girella
Francesco Taioli
Michele Chiara
Muhammad Aqeel
Franco Fummi
Francesco Setti
Marco Cristani
44
4
0
01 Jun 2024
4-bit Shampoo for Memory-Efficient Network Training
Sike Wang
Jia Li
Pan Zhou
Hua Huang
MQ
41
5
0
28 May 2024
VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections
Roy Miles
Pradyumna Reddy
Ismail Elezi
Jiankang Deng
VLM
37
3
0
28 May 2024
LoQT: Low Rank Adapters for Quantized Training
Sebastian Loeschcke
M. Toftrup
M. Kastoryano
Serge J. Belongie
Vésteinn Snæbjarnarson
MQ
36
3
0
26 May 2024
InstructPatentGPT: Training patent language models to follow instructions with human feedback
Jieh-Sheng Lee
ALM
41
6
0
25 May 2024
MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence
Ionut-Vlad Modoranu
M. Safaryan
Grigory Malinovsky
Eldar Kurtic
Thomas Robert
Peter Richtárik
Dan Alistarh
MQ
37
12
0
24 May 2024
Self-supervised vision-langage alignment of deep learning representations for bone X-rays analysis
A. Englebert
Anne-Sophie Collin
O. Cornu
Christophe De Vleeschouwer
34
1
0
14 May 2024
PLeak: Prompt Leaking Attacks against Large Language Model Applications
Bo Hui
Haolin Yuan
Neil Gong
Philippe Burlina
Yinzhi Cao
LLMAG
AAML
SILM
31
34
0
10 May 2024
Pruning as a Domain-specific LLM Extractor
Nan Zhang
Yanchi Liu
Xujiang Zhao
Wei Cheng
Runxue Bao
Rui Zhang
Prasenjit Mitra
Haifeng Chen
26
9
0
10 May 2024
Evaluating Dialect Robustness of Language Models via Conversation Understanding
Dipankar Srirag
Aditya Joshi
37
1
0
09 May 2024
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
Jordan Dotzel
Yuzong Chen
Bahaa Kotb
Sushma Prasad
Gang Wu
Sheng Li
Mohamed S. Abdelfattah
Zhiru Zhang
31
8
0
06 May 2024
Large Language Models for Next Point-of-Interest Recommendation
Peibo Li
Maarten de Rijke
Hao Xue
Shuang Ao
Yang Song
Flora D. Salim
66
16
0
19 Apr 2024
Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models
Tanmay Gautam
Youngsuk Park
Hao Zhou
Parameswaran Raman
Wooseok Ha
43
11
0
11 Apr 2024
Automated Federated Pipeline for Parameter-Efficient Fine-Tuning of Large Language Models
Zihan Fang
Zheng Lin
Zhe Chen
Xianhao Chen
Yue Gao
Yuguang Fang
54
35
0
09 Apr 2024
Minimize Quantization Output Error with Bias Compensation
Cheng Gong
Haoshuai Zheng
Mengting Hu
Zheng Lin
Deng-Ping Fan
Yuzhi Zhang
Tao Li
MQ
38
2
0
02 Apr 2024
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Yaowei Zheng
Richong Zhang
Junhao Zhang
Yanhan Ye
Zheyan Luo
Zhangchi Feng
Yongqiang Ma
37
383
0
20 Mar 2024
Characteristic AI Agents via Large Language Models
Xi Wang
Hongliang Dai
Shen Gao
Piji Li
43
3
0
19 Mar 2024
EffiPerception: an Efficient Framework for Various Perception Tasks
Xinhao Xiang
Simon Dräger
Jiawei Zhang
VLM
37
0
0
18 Mar 2024
Adversarial Fine-tuning of Compressed Neural Networks for Joint Improvement of Robustness and Efficiency
Hallgrimur Thorsteinsson
Valdemar J Henriksen
Tong Chen
Raghavendra Selvan
AAML
40
1
0
14 Mar 2024
Stealing Part of a Production Language Model
Nicholas Carlini
Daniel Paleka
Krishnamurthy Dvijotham
Thomas Steinke
Jonathan Hayase
...
Arthur Conmy
Itay Yona
Eric Wallace
David Rolnick
Florian Tramèr
MLAU
AAML
27
71
0
11 Mar 2024