2403.20041
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs
29 March 2024
Luchang Li
Sheng Qian
Jie Lu
Lunxi Yuan
Rui Wang
Qin Xie
Papers citing
"Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs"
12 / 12 papers shown
Small Language Models: Survey, Measurements, and Insights
Zhenyan Lu
Xiang Li
Dongqi Cai
Rongjie Yi
Fangming Liu
Xiwen Zhang
Nicholas D. Lane
Mengwei Xu
ObjD
LRM
140
58
0
24 Sep 2024
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Yuhui Li
Fangyun Wei
Chao Zhang
Hongyang R. Zhang
144
165
0
26 Jan 2024
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
Tianle Cai
Yuhong Li
Zhengyang Geng
Hongwu Peng
Jason D. Lee
Deming Chen
Tri Dao
174
314
0
19 Jan 2024
DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving
Yinmin Zhong
Shengyu Liu
Junda Chen
Jianbo Hu
Yibo Zhu
Xuanzhe Liu
Xin Jin
Hao Zhang
92
205
0
18 Jan 2024
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL
MQ
110
578
0
01 Jun 2023
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
235
609
0
22 May 2023
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
417
2,396
0
09 Nov 2022
DISC: A Dynamic Shape Compiler for Machine Learning Workloads
Kai Zhu
Wenyi Zhao
Zhen Zheng
Tianyou Guo
Pengzhan Zhao
...
Junjie Bai
Jun Yang
Xiaoyong Liu
Lansong Diao
Wei Lin
57
28
0
09 Mar 2021
MNN: A Universal and Efficient Inference Engine
Xiaotang Jiang
Huan Wang
Yiliu Chen
Ziqi Wu
Lichuan Wang
...
Zongyang Cui
Yuezhi Cai
Tianhang Yu
Chengfei Lv
Zhihua Wu
92
157
0
27 Feb 2020
Deep Learning on Mobile Devices - A Review
Yunbin Deng
58
121
0
21 Mar 2019
Deep Learning Towards Mobile Applications
Ji Wang
Bokai Cao
Philip S. Yu
Lichao Sun
Weidong Bao
Xiaomin Zhu
HAI
79
99
0
10 Sep 2018
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
1.2K
20,918
0
17 Apr 2017