A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation
3 March 2022
Tianxiang Sun, Xiangyang Liu, Wei-wei Zhu, Zhichao Geng, Lingling Wu, Yilong He, Yuan Ni, Guotong Xie, Xuanjing Huang, Xipeng Qiu

Papers citing "A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation"

33 / 33 papers shown
PARA: Parameter-Efficient Fine-tuning with Prompt Aware Representation Adjustment
Zequan Liu, Yi Zhao, Ming Tan, Wei Zhu, Aaron Xuxiang Tian (03 Feb 2025)
BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts
Divya J. Bajpai, M. Hanawal (02 Feb 2025)
GREEN-CODE: Learning to Optimize Energy Efficiency in LLM-based Code Generation
Shashikant Ilager, Lukas Florian Briem, Ivona Brandić (19 Jan 2025)
COSEE: Consistency-Oriented Signal-Based Early Exiting via Calibrated Sample Weighting Mechanism
Jianing He, Qi Zhang, Hongyun Zhang, Xuanjing Huang, Usman Naseem, Duoqian Miao (17 Dec 2024)
MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning
Jingfan Zhang, Yi Zhao, Dan Chen, Xing Tian, Huanran Zheng, Wei Zhu [MoE] (23 Oct 2024)
FiRST: Finetuning Router-Selective Transformers for Input-Adaptive Latency Reduction
Akriti Jain, Saransh Sharma, Koyel Mukherjee, Soumyabrata Pal (16 Oct 2024)
LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy
Rongzhi Zhang, Kuang Wang, Liyuan Liu, Shuohang Wang, Hao Cheng, Chao Zhang, Yelong Shen [MQ] (04 Oct 2024)
An Efficient Inference Framework for Early-exit Large Language Models
Ruijie Miao, Yihan Yan, Xinshuo Yao, Tong Yang (25 Jul 2024)
IAPT: Instruction-Aware Prompt Tuning for Large Language Models
Wei-wei Zhu, Aaron Xuxiang Tian, Congrui Yin, Yuan Ni, Xiaoling Wang, Guotong Xie (28 May 2024)
CEEBERT: Cross-Domain Inference in Early Exit BERT
Divya J. Bajpai, M. Hanawal [LRM] (23 May 2024)
A Comprehensive Survey of Accelerated Generation Techniques in Large Language Models
Mahsa Khoshnoodi, Vinija Jain, Mingye Gao, Malavika Srikanth, Aman Chadha [OffRL] (15 May 2024)
A Survey on Efficient Inference for Large Language Models
Zixuan Zhou, Xuefei Ning, Ke Hong, Tianyu Fu, Jiaming Xu, ..., Shengen Yan, Guohao Dai, Xiao-Ping Zhang, Yuhan Dong, Yu-Xiang Wang (22 Apr 2024)
FFN-SkipLLM: A Hidden Gem for Autoregressive Decoding with Adaptive Feed Forward Skipping
Ajay Jaiswal, Bodun Hu, Lu Yin, Yeonju Ro, Shiwei Liu, Tianlong Chen, Aditya Akella (05 Apr 2024)
ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models
Zequan Liu, Jiawen Lyn, Wei-wei Zhu, Xing Tian, Yvette Graham [OffRL] (24 Mar 2024)
Hierarchical Skip Decoding for Efficient Autoregressive Text Generation
Yunqi Zhu, Xuebing Yang, Yuanyuan Wu, Wensheng Zhang (22 Mar 2024)
DE³-BERT: Distance-Enhanced Early Exiting for BERT based on Prototypical Networks
Jianing He, Qi Zhang, Weiping Ding, Duoqian Miao, Jun Zhao, Liang Hu, LongBing Cao (03 Feb 2024)
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Hongyi Jin, Tianqi Chen, Zhihao Jia (23 Dec 2023)
ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference
Ziqian Zeng, Yihuai Hong, Hongliang Dai, Huiping Zhuang, Cen Chen (19 Dec 2023)
SHARCS: Efficient Transformers through Routing with Dynamic Width Sub-networks
Mohammadreza Salehi, Sachin Mehta, Aditya Kusupati, Ali Farhadi, Hannaneh Hajishirzi (18 Oct 2023)
Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs
Suyu Ge, Yunan Zhang, Liyuan Liu, Minjia Zhang, Jiawei Han, Jianfeng Gao (03 Oct 2023)
LGViT: Dynamic Early Exiting for Accelerating Vision Transformer
Guanyu Xu, Jiawei Hao, Li Shen, Han Hu, Yong Luo, Hui Lin, J. Shen (01 Aug 2023)
Predictive Pipelined Decoding: A Compute-Latency Trade-off for Exact LLM Decoding
Seongjun Yang, Gibbeum Lee, Jaewoong Cho, Dimitris Papailiopoulos, Kangwook Lee (12 Jul 2023)
SkipDecode: Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference
Luciano Del Corro, Allison Del Giorno, Sahaj Agarwal, Ting Yu, Ahmed Hassan Awadallah, Subhabrata Mukherjee (05 Jul 2023)
The Benefits of Bad Advice: Autocontrastive Decoding across Model Layers
Ariel Gera, Roni Friedman, Ofir Arviv, Chulaka Gunasekara, Benjamin Sznajder, Noam Slonim, Eyal Shnarch (02 May 2023)
Candidate Soups: Fusing Candidate Results Improves Translation Quality for Non-Autoregressive Translation
Huanran Zheng, Wei-wei Zhu, Pengfei Wang, Xiaoling Wang (27 Jan 2023)
Confident Adaptive Language Modeling
Tal Schuster, Adam Fisch, Jai Gupta, Mostafa Dehghani, Dara Bahri, Vinh Q. Tran, Yi Tay, Donald Metzler (14 Jul 2022)
Continuous Detection, Rapidly React: Unseen Rumors Detection based on Continual Prompt-Tuning
Yuhui Zuo, Wei Zhu, Guoyong Cai [CLL, VLM] (16 Mar 2022)
Paradigm Shift in Natural Language Processing
Tianxiang Sun, Xiangyang Liu, Xipeng Qiu, Xuanjing Huang (26 Sep 2021)
CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation
Yunfan Shao, Zhichao Geng, Yitao Liu, Junqi Dai, Hang Yan, Fei Yang, Li Zhe, Hujun Bao, Xipeng Qiu [MedIm] (13 Sep 2021)
Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, Xuanjing Huang [LM&MA, VLM] (18 Mar 2020)
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou (07 Feb 2020)
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Z. Yao, A. Gholami, Michael W. Mahoney, Kurt Keutzer [MQ] (12 Sep 2019)
Teaching Machines to Read and Comprehend
Karl Moritz Hermann, Tomás Kociský, Edward Grefenstette, L. Espeholt, W. Kay, Mustafa Suleyman, Phil Blunsom (10 Jun 2015)