Magic Pyramid: Accelerating Inference with Early Exiting and Token Pruning
arXiv: 2111.00230 · 30 October 2021
Xuanli He, I. Keivanloo, Yi Xu, Xiang He, Belinda Zeng, Santosh Rajagopalan, Trishul Chilimbi
Papers citing "Magic Pyramid: Accelerating Inference with Early Exiting and Token Pruning" (5 of 5 shown)
1. Hyper-multi-step: The Truth Behind Difficult Long-context Tasks
   Yijiong Yu, Ma Xiufa, Fang Jianwei, Zhi-liang Xu, Su Guangyao, ..., Zhixiao Qi, Wei Wang, Wei Liu, Ran Chen, Ji Pei
   LRM, RALM · 06 Oct 2024
2. LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
   Qichen Fu, Minsik Cho, Thomas Merth, Sachin Mehta, Mohammad Rastegari, Mahyar Najibi
   19 Jul 2024
3. Efficiently Controlling Multiple Risks with Pareto Testing
   Bracha Laufer-Goldshtein, Adam Fisch, Regina Barzilay, Tommi Jaakkola
   14 Oct 2022
4. Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
   Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Z. Yao, A. Gholami, Michael W. Mahoney, Kurt Keutzer
   MQ · 12 Sep 2019
5. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
   Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
   ELM · 20 Apr 2018