
An Efficient Inference Framework for Early-exit Large Language Models
Papers citing "An Efficient Inference Framework for Early-exit Large Language Models"
17 / 17 papers shown
Title |
---|
Llama 2: Open Foundation and Fine-Tuned Chat Models Hugo Touvron Louis Martin Kevin R. Stone Peter Albert Amjad Almahairi ...Sharan Narang Aurelien Rodriguez Robert Stojnic Sergey Edunov Thomas Scialom |
A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation Tianxiang Sun Xiangyang Liu Wei-wei Zhu Zhichao Geng Lingling Wu Yilong He Yuan Ni Guotong Xie Xuanjing Huang Xipeng Qiu |