ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2405.15208
  4. Cited By
Decoding at the Speed of Thought: Harnessing Parallel Decoding of
  Lexical Units for LLMs

Decoding at the Speed of Thought: Harnessing Parallel Decoding of Lexical Units for LLMs

24 May 2024
Chenxi Sun
Hongzhi Zhang
Zijia Lin
Jingyuan Zhang
Fuzheng Zhang
Zhongyuan Wang
Bin Chen
Chengru Song
Di Zhang
Kun Gai
Deyi Xiong
ArXiv (abs)PDFHTMLGithub (3★)

Papers citing "Decoding at the Speed of Thought: Harnessing Parallel Decoding of Lexical Units for LLMs"

11 / 11 papers shown
Title
Fast Inference from Transformers via Speculative Decoding
Fast Inference from Transformers via Speculative Decoding
Yaniv Leviathan
Matan Kalman
Yossi Matias
LRM
147
733
0
30 Nov 2022
A Survey on Non-Autoregressive Generation for Neural Machine Translation
  and Beyond
A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond
Yisheng Xiao
Lijun Wu
Junliang Guo
Juntao Li
Hao Fei
Tao Qin
Tie-Yan Liu
3DVMedImAI4CE
80
87
0
20 Apr 2022
Sparse is Enough in Scaling Transformers
Sparse is Enough in Scaling Transformers
Sebastian Jaszczur
Aakanksha Chowdhery
Afroz Mohiuddin
Lukasz Kaiser
Wojciech Gajewski
Henryk Michalewski
Jonni Kanerva
MoE
61
102
0
24 Nov 2021
Primer: Searching for Efficient Transformers for Language Modeling
Primer: Searching for Efficient Transformers for Language Modeling
David R. So
Wojciech Mañke
Hanxiao Liu
Zihang Dai
Noam M. Shazeer
Quoc V. Le
VLM
253
156
0
17 Sep 2021
Evaluating Large Language Models Trained on Code
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELMALM
236
5,647
0
07 Jul 2021
Consistent Accelerated Inference via Confident Adaptive Transformers
Consistent Accelerated Inference via Confident Adaptive Transformers
Tal Schuster
Adam Fisch
Tommi Jaakkola
Regina Barzilay
AI4TS
244
72
0
18 Apr 2021
Scaling Laws for Transfer
Scaling Laws for Transfer
Danny Hernandez
Jared Kaplan
T. Henighan
Sam McCandlish
90
250
0
02 Feb 2021
Glancing Transformer for Non-Autoregressive Neural Machine Translation
Glancing Transformer for Non-Autoregressive Neural Machine Translation
Lihua Qian
Hao Zhou
Yu Bao
Mingxuan Wang
Lin Qiu
Weinan Zhang
Yong Yu
Lei Li
95
158
0
18 Aug 2020
Scaling Laws for Neural Language Models
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
611
4,905
0
23 Jan 2020
Fast Decoding in Sequence Models using Discrete Latent Variables
Fast Decoding in Sequence Models using Discrete Latent Variables
Łukasz Kaiser
Aurko Roy
Ashish Vaswani
Niki Parmar
Samy Bengio
Jakob Uszkoreit
Noam M. Shazeer
70
232
0
09 Mar 2018
Non-Autoregressive Neural Machine Translation
Non-Autoregressive Neural Machine Translation
Jiatao Gu
James Bradbury
Caiming Xiong
Victor O.K. Li
R. Socher
107
797
0
07 Nov 2017
1