ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision Transformer Acceleration with a Linear Taylor Attention

9 November 2022
Jyotikrishna Dass, Shang Wu, Huihong Shi, Chaojian Li, Zhifan Ye, Zhongfeng Wang, Yingyan Lin
arXiv (abs) · PDF · HTML
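For context on the cited paper: the "linear Taylor attention" in the title refers to approximating the softmax exponential by its first-order Taylor expansion, exp(s) ≈ 1 + s, which turns attention into a low-rank form computable in time linear in the token count. Below is a minimal NumPy sketch of that low-rank part only; the function and variable names are illustrative, not taken from the paper, and per the title ViTALiTy further unifies this with a sparse approximation (not shown) and a dedicated accelerator.

    import numpy as np

    def taylor_linear_attention(Q, K, V):
        """Sketch of first-order Taylor ("linear") attention.

        Replaces exp(s) in softmax attention with its first-order Taylor
        expansion 1 + s. The key-value summary K^T V is then shared by all
        queries, so cost drops from O(N^2 * d) to O(N * d^2).
        Q, K: (N, d) arrays; V: (N, d_v) array.
        Note: the weights 1 + q.k are not guaranteed positive in general;
        actual designs constrain or train around this, which is elided here.
        """
        N, d = Q.shape
        Qs = Q / np.sqrt(d)               # usual 1/sqrt(d) scaling
        kv = K.T @ V                      # (d, d_v) summary, computed once
        k_sum = K.sum(axis=0)             # (d,)
        v_sum = V.sum(axis=0)             # (d_v,)
        numer = v_sum[None, :] + Qs @ kv  # sum_j (1 + q.k_j) * v_j
        denom = N + Qs @ k_sum            # sum_j (1 + q.k_j)
        return numer / denom[:, None]

    # Toy usage: 196 tokens (a 14x14 ViT patch grid), head dimension 64.
    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(196, 64)) for _ in range(3))
    out = taylor_linear_attention(Q, K, V)  # shape (196, 64)

The same Taylor-softmax idea recurs in the citation list below, e.g. TaylorShift (05 Mar 2024), which moves self-attention between quadratic and linear complexity.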

Papers citing "ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision Transformer Acceleration with a Linear Taylor Attention"

22 papers shown:

1. Bishop: Sparsified Bundling Spiking Transformers on Heterogeneous Cores with Error-Constrained Pruning (18 May 2025)
   Boxun Xu, Yuxuan Yin, Vikram Iyer, Peng Li
   Tags: MoE

2. AccLLM: Accelerating Long-Context LLM Inference Via Algorithm-Hardware Co-Design (07 Apr 2025)
   Yanbiao Liang, Huihong Shi, Haikuo Shao, Zhongfeng Wang

3. T-REX: A 68-567 μs/token, 0.41-3.95 μJ/token Transformer Accelerator with Reduced External Memory Access and Enhanced Hardware Utilization in 16nm FinFET (01 Mar 2025)
   Seunghyun Moon, Mao Li, Gregory K. Chen, Phil Knag, R. Krishnamurthy, Mingoo Seok

4. PAPI: Exploiting Dynamic Parallelism in Large Language Model Decoding with a Processing-In-Memory-Enabled Computing System (21 Feb 2025)
   Yintao He, Haiyu Mao, Christina Giannoula, Mohammad Sadrosadati, Juan Gómez Luna, Huawei Li, Xiaowei Li, Ying Wang, O. Mutlu

5. Chameleon: Adaptive Caching and Scheduling for Many-Adapter LLM Inference Environments (24 Nov 2024)
   Nikoleta Iliakopoulou, Jovan Stojkovic, Chloe Alverti, Tianyin Xu, Hubertus Franke, Josep Torrellas

6. Test-time adaptation for image compression with distribution regularization (16 Oct 2024)
   Kecheng Chen, Pingping Zhang, Tiexin Qin, Shiqi Wang, Hong Yan, Haoliang Li

7. M$^2$-ViT: Accelerating Hybrid Vision Transformers with Two-Level Mixed Quantization (10 Oct 2024)
   Yanbiao Liang, Huihong Shi, Zhongfeng Wang
   Tags: MQ

8. KnowFormer: Revisiting Transformers for Knowledge Graph Reasoning (19 Sep 2024)
   Junnan Liu, Qianren Mao, Weifeng Jiang, Jianxin Li

9. P$^2$-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer (30 May 2024)
   Huihong Shi, Xin Cheng, Wendong Mao, Zhongfeng Wang
   Tags: MQ

10. Why Larger Language Models Do In-context Learning Differently? (30 May 2024)
    Zhenmei Shi, Junyi Wei, Zhuoyan Xu, Yingyu Liang

11. The CAP Principle for LLM Serving: A Survey of Long-Context Large Language Model Serving (18 May 2024)
    Pai Zeng, Zhenyu Ning, Jieru Zhao, Weihao Cui, Mengwei Xu, Liwei Guo, Xusheng Chen, Yizhou Shan
    Tags: LLMAG

12. Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer (06 May 2024)
    Huihong Shi, Haikuo Shao, Wendong Mao, Zhongfeng Wang
    Tags: ViT, MQ

13. Test-Time Model Adaptation with Only Forward Passes (02 Apr 2024)
    Shuaicheng Niu, Chunyan Miao, Guohao Chen, Pengcheng Wu, Peilin Zhao
    Tags: TTA

14. An FPGA-Based Reconfigurable Accelerator for Convolution-Transformer Hybrid EfficientViT (29 Mar 2024)
    Haikuo Shao, Huihong Shi, Wendong Mao, Zhongfeng Wang

15. ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching (26 Mar 2024)
    Youpeng Zhao, Di Wu, Jun Wang

16. TaylorShift: Shifting the Complexity of Self-Attention from Squared to Linear (and Back) using Taylor-Softmax (05 Mar 2024)
    Tobias Christian Nauen, Sebastián M. Palacio, Andreas Dengel

17. HyperAttention: Long-context Attention in Near-Linear Time (09 Oct 2023)
    Insu Han, Rajesh Jayaram, Amin Karbasi, Vahab Mirrokni, David P. Woodruff, A. Zandieh

18. Compressing Vision Transformers for Low-Resource Visual Learning (05 Sep 2023)
    Eric Youn, J. SaiMitheran, Sanjana Prabhu, Siyuan Chen
    Tags: ViT

19. SqueezeLLM: Dense-and-Sparse Quantization (13 Jun 2023)
    Sehoon Kim, Coleman Hooper, A. Gholami, Zhen Dong, Xiuyu Li, Sheng Shen, Michael W. Mahoney, Kurt Keutzer
    Tags: MQ

20. KDEformer: Accelerating Transformers via Kernel Density Estimation (05 Feb 2023)
    A. Zandieh, Insu Han, Majid Daliri, Amin Karbasi

21. Deep representation learning: Fundamentals, Perspectives, Applications, and Open Challenges (27 Nov 2022)
    K. T. Baghaei, Amirreza Payandeh, Pooya Fayyazsanavi, Shahram Rahimi, Zhiqian Chen, Somayeh Bakhtiari Ramezani
    Tags: FaML, AI4TS

22. Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention at Vision Transformer Inference (18 Nov 2022)
    Haoran You, Yunyang Xiong, Xiaoliang Dai, Bichen Wu, Peizhao Zhang, Haoqi Fan, Peter Vajda, Yingyan Lin