Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2501.07139
Cited By
FlexQuant: Elastic Quantization Framework for Locally Hosted LLM on Edge Devices
13 January 2025
Yuji Chai
Mujin Kwen
David Brooks
Gu-Yeon Wei
MQ
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"FlexQuant: Elastic Quantization Framework for Locally Hosted LLM on Edge Devices"
12 / 12 papers shown
Title
Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs for Energy Efficiency, Output Accuracy, and Inference Latency
E. J. Husom
Arda Goknil
Merve Astekin
Lwin Khin Shar
Andre Kåsen
S. Sen
Benedikt Andreas Mithassel
Ahmet Soylu
MQ
77
2
0
04 Apr 2025
Debt Collection Negotiations with Large Language Models: An Evaluation System and Optimizing Decision Making with Multi-Agent
Xiaofeng Wang
Zizhuo Zhang
Jinguang Zheng
Yiming Ai
Rui Wang
76
2
0
25 Feb 2025
LowRA: Accurate and Efficient LoRA Fine-Tuning of LLMs under 2 Bits
Zikai Zhou
Qizheng Zhang
Hermann Kumbong
Kunle Olukotun
MQ
578
0
0
12 Feb 2025
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
Yeonhong Park
Jake Hyun
SangLyul Cho
Bonggeun Sim
Jae W. Lee
MQ
78
19
0
16 Feb 2024
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL
MQ
108
578
0
01 Jun 2023
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Elias Frantar
Saleh Ashkboos
Torsten Hoefler
Dan Alistarh
MQ
152
1,008
0
31 Oct 2022
ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
Z. Yao
Reza Yazdani Aminabadi
Minjia Zhang
Xiaoxia Wu
Conglong Li
Yuxiong He
VLM
MQ
142
479
0
04 Jun 2022
A White Paper on Neural Network Quantization
Markus Nagel
Marios Fournarakis
Rana Ali Amjad
Yelysei Bondarenko
M. V. Baalen
Tijmen Blankevoort
MQ
95
546
0
15 Jun 2021
Up or Down? Adaptive Rounding for Post-Training Quantization
Markus Nagel
Rana Ali Amjad
M. V. Baalen
Christos Louizos
Tijmen Blankevoort
MQ
95
588
0
22 Apr 2020
PIQA: Reasoning about Physical Commonsense in Natural Language
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD
LRM
192
1,847
0
26 Nov 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
506
20,376
0
23 Oct 2019
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
349
2,900
0
26 Sep 2016
1