2110.02861
8-bit Optimizers via Block-wise Quantization
6 October 2021
Tim Dettmers
M. Lewis
Sam Shleifer
Luke Zettlemoyer
MQ
Papers citing
"8-bit Optimizers via Block-wise Quantization"
50 / 203 papers shown
Parameter-Efficient Fine-Tuning via Selective Discrete Cosine Transform
Yixian Shen
Qi Bi
Jia-Hong Huang
Hongyi Zhu
Anuj Pathania
31
1
0
09 Oct 2024
Margin Matching Preference Optimization: Enhanced Model Alignment with Granular Feedback
Kyuyoung Kim
Ah Jeong Seo
Hao Liu
Jinwoo Shin
Kimin Lee
22
2
0
04 Oct 2024
SOAP: Improving and Stabilizing Shampoo using Adam
Nikhil Vyas
Depen Morwani
Rosie Zhao
Itai Shapira
David Brandfonbrener
Lucas Janson
Sham Kakade
72
23
0
17 Sep 2024
Stable Language Model Pre-training by Reducing Embedding Variability
Woojin Chung
Jiwoo Hong
Na Min An
James Thorne
Se-Young Yun
30
2
0
12 Sep 2024
SpeechTaxi: On Multilingual Semantic Speech Classification
Lennart Keller
Goran Glavaš
26
0
0
10 Sep 2024
DriveScape: Towards High-Resolution Controllable Multi-View Driving Video Generation
Wei Yu Wu
Xi Guo
Weixuan Tang
Tingxuan Huang
Chiyu Wang
Dongyue Chen
C. Ding
VGen
32
6
0
09 Sep 2024
Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques
Davide Clode da Silva
Marina Musse Bernardes
Nathalia Giacomini Ceretta
Gabriel Vaz de Souza
Gabriel Fonseca Silva
Rafael Heitor Bordini
S. Musse
MedIm
LM&MA
31
0
0
06 Sep 2024
Masked Mixers for Language Generation and Retrieval
Benjamin L. Badger
47
0
0
02 Sep 2024
Predicting the Target Word of Game-playing Conversations using a Low-Rank Dialect Adapter for Decoder Models
Dipankar Srirag
Aditya Joshi
Jacob Eisenstein
46
1
0
31 Aug 2024
Memory-Efficient LLM Training with Online Subspace Descent
Kaizhao Liang
Bo Liu
Lizhang Chen
Qiang Liu
29
7
0
23 Aug 2024
Demystifying the Communication Characteristics for Distributed Transformer Models
Quentin G. Anthony
Benjamin Michalowicz
Jacob Hatef
Lang Xu
Mustafa Abduljabbar
Hari Subramoni
D. Panda
AI4CE
36
2
0
19 Aug 2024
MGH Radiology Llama: A Llama 3 70B Model for Radiology
Yucheng Shi
Peng Shu
Zhengliang Liu
Zihao Wu
Quanzheng Li
Xiang Li
LM&MA
22
0
0
13 Aug 2024
Large Language Models for Anomaly Detection in Computational Workflows: from Supervised Fine-Tuning to In-Context Learning
Hongwei Jin
George Papadimitriou
Krishnan Raghavan
Pawel Zuk
Prasanna Balaprakash
Cong Wang
A. Mandal
Ewa Deelman
38
1
0
24 Jul 2024
Scalify: scale propagation for efficient low-precision LLM training
Paul Balança
Sam Hosegood
Carlo Luschi
Andrew Fitzgibbon
26
2
0
24 Jul 2024
Exploring Quantization for Efficient Pre-Training of Transformer Language Models
Kamran Chitsaz
Quentin Fournier
Gonçalo Mordido
Sarath Chandar
MQ
49
3
0
16 Jul 2024
Boosting Zero-Shot Crosslingual Performance using LLM-Based Augmentations with Effective Data Selection
Barah Fazili
Ashish Agrawal
P. Jyothi
37
1
0
15 Jul 2024
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients
Zhenyu (Allen) Zhang
Ajay Jaiswal
L. Yin
Shiwei Liu
Jiawei Zhao
Yuandong Tian
Zhangyang Wang
VLM
33
16
0
11 Jul 2024
LLMBox: A Comprehensive Library for Large Language Models
Tianyi Tang
Yiwen Hu
Bingqian Li
Wenyang Luo
Zijing Qin
...
Chunxuan Xia
Junyi Li
Kun Zhou
Wayne Xin Zhao
Ji-Rong Wen
31
1
0
08 Jul 2024
Automated Text Scoring in the Age of Generative AI for the GPU-poor
C. Ormerod
Alexander Kwako
46
2
0
02 Jul 2024
Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients
Aashiq Muhamed
Oscar Li
David Woodruff
Mona Diab
Virginia Smith
55
7
0
25 Jun 2024
Adam-mini: Use Fewer Learning Rates To Gain More
Yushun Zhang
Congliang Chen
Ziniu Li
Tian Ding
Chenwei Wu
Yinyu Ye
Zhi-Quan Luo
Ruoyu Sun
36
37
0
24 Jun 2024
Self-Distillation for Model Stacking Unlocks Cross-Lingual NLU in 200+ Languages
Fabian David Schmidt
Philipp Borchert
Ivan Vulić
Goran Glavaš
42
5
0
18 Jun 2024
GNOME: Generating Negotiations through Open-Domain Mapping of Exchanges
Darshan Deshpande
Shambhavi Sinha
Anirudh Ravi Kumar
Debaditya Pal
Jonathan May
AI4CE
53
0
0
16 Jun 2024
H-Fac: Memory-Efficient Optimization with Factorized Hamiltonian Descent
Son Nguyen
Lizhang Chen
Bo Liu
Qiang Liu
30
3
0
14 Jun 2024
RVT-2: Learning Precise Manipulation from Few Demonstrations
Ankit Goyal
Valts Blukis
Jie Xu
Yijie Guo
Yu-Wei Chao
Dieter Fox
35
38
0
12 Jun 2024
Scaling up masked audio encoder learning for general audio classification
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
Bin Wang
50
3
0
11 Jun 2024
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
Jiwoo Hong
Sayak Paul
Noah Lee
Kashif Rasul
James Thorne
Jongheon Jeong
43
13
0
10 Jun 2024
Navigating Efficiency in MobileViT through Gaussian Process on Global Architecture Factors
Ke Meng
Kai Chen
37
0
0
07 Jun 2024
Error-preserving Automatic Speech Recognition of Young English Learners' Language
Janick Michot
Manuela Hurlimann
Jan Deriu
Luzia Sauer
Katsiaryna Mlynchyk
Mark Cieliebak
26
2
0
05 Jun 2024
SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining
Andi Han
Jiaxiang Li
Wei Huang
Mingyi Hong
Akiko Takeda
Pratik Jawanpuria
Bamdev Mishra
41
10
0
04 Jun 2024
Diffusion-based Image Generation for In-distribution Data Augmentation in Surface Defect Detection
Luigi Capogrosso
Federico Girella
Francesco Taioli
Michele Chiara
Muhammad Aqeel
Franco Fummi
Francesco Setti
Marco Cristani
44
4
0
01 Jun 2024
4-bit Shampoo for Memory-Efficient Network Training
Sike Wang
Jia Li
Pan Zhou
Hua Huang
MQ
41
5
0
28 May 2024
VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections
Roy Miles
Pradyumna Reddy
Ismail Elezi
Jiankang Deng
VLM
37
3
0
28 May 2024
LoQT: Low Rank Adapters for Quantized Training
Sebastian Loeschcke
M. Toftrup
M. Kastoryano
Serge J. Belongie
Vésteinn Snæbjarnarson
MQ
36
3
0
26 May 2024
InstructPatentGPT: Training patent language models to follow instructions with human feedback
Jieh-Sheng Lee
ALM
41
6
0
25 May 2024
MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence
Ionut-Vlad Modoranu
M. Safaryan
Grigory Malinovsky
Eldar Kurtic
Thomas Robert
Peter Richtárik
Dan Alistarh
MQ
37
12
0
24 May 2024
Self-supervised vision-langage alignment of deep learning representations for bone X-rays analysis
A. Englebert
Anne-Sophie Collin
O. Cornu
Christophe De Vleeschouwer
34
1
0
14 May 2024
PLeak: Prompt Leaking Attacks against Large Language Model Applications
Bo Hui
Haolin Yuan
Neil Gong
Philippe Burlina
Yinzhi Cao
LLMAG
AAML
SILM
31
34
0
10 May 2024
Pruning as a Domain-specific LLM Extractor
Nan Zhang
Yanchi Liu
Xujiang Zhao
Wei Cheng
Runxue Bao
Rui Zhang
Prasenjit Mitra
Haifeng Chen
26
9
0
10 May 2024
Evaluating Dialect Robustness of Language Models via Conversation Understanding
Dipankar Srirag
Aditya Joshi
37
1
0
09 May 2024
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
Jordan Dotzel
Yuzong Chen
Bahaa Kotb
Sushma Prasad
Gang Wu
Sheng Li
Mohamed S. Abdelfattah
Zhiru Zhang
31
8
0
06 May 2024
Large Language Models for Next Point-of-Interest Recommendation
Peibo Li
Maarten de Rijke
Hao Xue
Shuang Ao
Yang Song
Flora D. Salim
66
16
0
19 Apr 2024
Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models
Tanmay Gautam
Youngsuk Park
Hao Zhou
Parameswaran Raman
Wooseok Ha
43
11
0
11 Apr 2024
Automated Federated Pipeline for Parameter-Efficient Fine-Tuning of Large Language Models
Zihan Fang
Zheng Lin
Zhe Chen
Xianhao Chen
Yue Gao
Yuguang Fang
54
35
0
09 Apr 2024
Minimize Quantization Output Error with Bias Compensation
Cheng Gong
Haoshuai Zheng
Mengting Hu
Zheng Lin
Deng-Ping Fan
Yuzhi Zhang
Tao Li
MQ
38
2
0
02 Apr 2024
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Yaowei Zheng
Richong Zhang
Junhao Zhang
Yanhan Ye
Zheyan Luo
Zhangchi Feng
Yongqiang Ma
37
383
0
20 Mar 2024
Characteristic AI Agents via Large Language Models
Xi Wang
Hongliang Dai
Shen Gao
Piji Li
43
3
0
19 Mar 2024
EffiPerception: an Efficient Framework for Various Perception Tasks
Xinhao Xiang
Simon Dräger
Jiawei Zhang
VLM
37
0
0
18 Mar 2024
Adversarial Fine-tuning of Compressed Neural Networks for Joint Improvement of Robustness and Efficiency
Hallgrimur Thorsteinsson
Valdemar J Henriksen
Tong Chen
Raghavendra Selvan
AAML
40
1
0
14 Mar 2024
Stealing Part of a Production Language Model
Nicholas Carlini
Daniel Paleka
Krishnamurthy Dvijotham
Thomas Steinke
Jonathan Hayase
...
Arthur Conmy
Itay Yona
Eric Wallace
David Rolnick
Florian Tramèr
MLAU
AAML
27
71
0
11 Mar 2024