FIRP: Faster LLM inference via future intermediate representation prediction
arXiv:2410.20488, 27 October 2024
Pengfei Wu, Jiahao Liu, Zhuocheng Gong, Qifan Wang, Jinpeng Li, Jingang Wang, Xunliang Cai, Dongyan Zhao
Papers citing "FIRP: Faster LLM inference via future intermediate representation prediction" (6 papers)
Accelerating LLM Inference with Staged Speculative Decoding
Benjamin Spector, Christopher Ré (08 Aug 2023)
Fast Inference from Transformers via Speculative Decoding
Yaniv Leviathan, Matan Kalman, Yossi Matias (30 Nov 2022)
Training Verifiers to Solve Math Word Problems
K. Cobbe, V. Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, ..., Jerry Tworek, Jacob Hilton, Reiichiro Nakano, Christopher Hesse, John Schulman (27 Oct 2021)
Datasets: A Community Library for Natural Language Processing
Quentin Lhoest, Albert Villanova del Moral, Yacine Jernite, A. Thakur, Patrick von Platen, ..., Thibault Goehringer, Victor Mustar, François Lagunas, Alexander M. Rush, Thomas Wolf (07 Sep 2021)
Fast Transformer Decoding: One Write-Head is All You Need
Noam M. Shazeer (06 Nov 2019)
Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization
Shashi Narayan, Shay B. Cohen, Mirella Lapata (27 Aug 2018)