Efficient Large Language Model Inference with Neural Block Linearization

27 May 2025

Papers citing "Efficient Large Language Model Inference with Neural Block Linearization"

3 / 3 papers shown

Title
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test Yuhui Li Fangyun Wei Chao Zhang Hongyang R. Zhang 234 18 0 03 Mar 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning DeepSeek-AI Daya Guo Dejian Yang Haowei Zhang Junxiao Song ... Shiyu Wang S. Yu Shunfeng Zhou Shuting Pan S.S. Li ReLM VLM OffRL AI4TS LRM 384 2,022 0 22 Jan 2025
The Unreasonable Ineffectiveness of the Deeper Layers Andrey Gromov Kushal Tirumala Hassan Shapourian Paolo Glorioso Daniel A. Roberts 146 106 0 26 Mar 2024