Title |
---|
![]() LLM Inference Unveiled: Survey and Roofline Model Insights Zhihang Yuan Yuzhang Shang Yang Zhou Zhen Dong Zhe Zhou ...Yong Jae Lee Yan Yan Beidi Chen Guangyu Sun Kurt Keutzer |
![]() Ouroboros: Generating Longer Drafts Phrase by Phrase for Faster
Speculative Decoding Weilin Zhao Yuxiang Huang Xu Han Wang Xu Chaojun Xiao Xinrong Zhang Yewei Fang Kaihuo Zhang Zhiyuan Liu Maosong Sun |
![]() Mixtral of Experts Albert Q. Jiang Alexandre Sablayrolles Antoine Roux A. Mensch Blanche Savary ...Théophile Gervet Thibaut Lavril Thomas Wang Timothée Lacroix William El Sayed |
![]() Llama 2: Open Foundation and Fine-Tuned Chat Models Hugo Touvron Louis Martin Kevin R. Stone Peter Albert Amjad Almahairi ...Sharan Narang Aurelien Rodriguez Robert Stojnic Sergey Edunov Thomas Scialom |