Easy and Efficient Transformer: Scalable Inference Solution For large NLP model (arXiv:2104.12470)
26 April 2021
GongZheng Li, Yadong Xi, Jingzhen Ding, Duan Wang, Bai Liu, Changjie Fan, Xiaoxi Mao, Zeng Zhao
Papers citing "Easy and Efficient Transformer: Scalable Inference Solution For large NLP model" (6 of 6 papers shown)
iServe: An Intent-based Serving System for LLMs [VLM]
Dimitrios Liakopoulos, Tianrui Hu, Prasoon Sinha, N. Yadwadkar
08 Jan 2025

Transformer Uncertainty Estimation with Hierarchical Stochastic Attention
Jiahuan Pei, Cheng-Yu Wang, Gyuri Szarvas
27 Dec 2021

Zero-Shot Text-to-Image Generation [VLM]
Aditya A. Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever
24 Feb 2021

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
23 Jan 2020

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism [MoE]
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
17 Sep 2019

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT [MQ]
Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Z. Yao, A. Gholami, Michael W. Mahoney, Kurt Keutzer
12 Sep 2019