Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.08845
Cited By
Bifurcated Attention: Accelerating Massively Parallel Decoding with Shared Prefixes in LLMs
13 March 2024
Ben Athiwaratkun
Sujan Kumar Gonugondla
Sanjay Krishna Gouda
Haifeng Qian
Hantian Ding
Qing Sun
Jun Wang
Jiacheng Guo
Liangfu Chen
Parminder Bhatia
Ramesh Nallapati
Sudipta Sengupta
Bing Xiang
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Bifurcated Attention: Accelerating Massively Parallel Decoding with Shared Prefixes in LLMs"
14 / 14 papers shown
Title
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Bradley Brown
Jordan Juravsky
Ryan Ehrlich
Ronald Clark
Quoc V. Le
Christopher Ré
Azalia Mirhoseini
ALM
LRM
81
220
0
03 Jan 2025
DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference
Jinwei Yao
Kaiqi Chen
Kexun Zhang
Jiaxuan You
Binhang Yuan
Zeke Wang
Tao Lin
35
2
0
30 Mar 2024
Hydragen: High-Throughput LLM Inference with Shared Prefixes
Jordan Juravsky
Bradley Brown
Ryan Ehrlich
Daniel Y. Fu
Christopher Ré
Azalia Mirhoseini
58
36
0
07 Feb 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Yichao Fu
Peter Bailis
Ion Stoica
Hao Zhang
127
141
0
03 Feb 2024
CodeGen2: Lessons for Training LLMs on Programming and Natural Languages
Erik Nijkamp
A. Ghobadzadeh
Caiming Xiong
Silvio Savarese
Yingbo Zhou
152
164
0
03 May 2023
Multi-lingual Evaluation of Code Generation Models
Ben Athiwaratkun
Sanjay Krishna Gouda
Zijian Wang
Xiaopeng Li
Yuchen Tian
...
Baishakhi Ray
Parminder Bhatia
Sudipta Sengupta
Dan Roth
Bing Xiang
ELM
114
160
0
26 Oct 2022
Co-Writing Screenplays and Theatre Scripts with Language Models: An Evaluation by Industry Professionals
Piotr Wojciech Mirowski
Kory W. Mathewson
Jaylen Pittman
Richard Evans
HAI
62
250
0
29 Sep 2022
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
Hung Le
Yue Wang
Akhilesh Deepak Gotmare
Silvio Savarese
S. Hoi
SyDa
ALM
129
240
0
05 Jul 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
314
3,248
0
21 Mar 2022
Teaching language models to support answers with verified quotes
Jacob Menick
Maja Trebacz
Vladimir Mikulik
John Aslanides
Francis Song
...
Mia Glaese
Susannah Young
Lucy Campbell-Gillingham
G. Irving
Nat McAleese
ELM
RALM
240
257
0
21 Mar 2022
Facebook AI WMT21 News Translation Task Submission
C. Tran
Shruti Bhosale
James Cross
Philipp Koehn
Sergey Edunov
Angela Fan
VLM
134
81
0
06 Aug 2021
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
280
2,015
0
28 Jul 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
240
4,469
0
23 Jan 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,821
0
17 Sep 2019
1