ResearchTrend.AI
Inference Optimizations for Large Language Models: Effects, Challenges, and Practical Considerations

6 August 2024
Leo Donisch
Sigurd Schacht
Carsten Lanquillon

Papers citing "Inference Optimizations for Large Language Models: Effects, Challenges, and Practical Considerations"

6 / 6 papers shown
Title
Gumiho: A Hybrid Architecture to Prioritize Early Tokens in Speculative Decoding
Jiajun Li
Yixing Xu
Haiduo Huang
Xuanwu Yin
D. Li
Edith C.-H. Ngai
E. Barsoum
58
0
0
13 Mar 2025
Streaming Looking Ahead with Token-level Self-reward
H. Zhang
Ruixin Hong
Dong Yu
41
1
0
24 Feb 2025
ZeroQuant-V2: Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation
Z. Yao
Xiaoxia Wu
Cheng-rong Li
Stephen Youn
Yuxiong He
MQ
63
57
0
15 Mar 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
313
11,953
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
367
8,495
0
28 Jan 2022
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
233
576
0
12 Sep 2019