ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence

Yijiong Yu
26 March 2025 · arXiv:2503.20533

Papers citing "Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence"

7 papers shown
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention
Gleb Rodionov, Roman Garipov, Alina Shutova, George Yakushev, Erik Schultheis, Vage Egiazarian, Anton Sinitsin, Denis Kuznedelev, Dan Alistarh
08 Apr 2025
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
Tianle Cai, Yuhong Li, Zhengyang Geng, Hongwu Peng, Jason D. Lee, De-huai Chen, Tri Dao
19 Jan 2024
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
Yushi Bai, Xin Lv, Jiajie Zhang, Hong Lyu, Jiankai Tang, ..., Aohan Zeng, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li
28 Aug 2023
Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation
Xuefei Ning, Zinan Lin, Zixuan Zhou, Zifu Wang, Huazhong Yang, Yu Wang
28 Jul 2023
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
Tri Dao
17 Jul 2023
Accelerating Transformer Inference for Translation via Parallel Decoding
Andrea Santilli, Silvio Severino, Emilian Postolache, Valentino Maiorca, Michele Mancusi, R. Marin, Emanuele Rodolà
17 May 2023
Fast Inference from Transformers via Speculative Decoding
Yaniv Leviathan, Matan Kalman, Yossi Matias
30 Nov 2022