arXiv:2005.14187
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
28 May 2020
Hanrui Wang
Zhanghao Wu
Zhijian Liu
Han Cai
Ligeng Zhu
Chuang Gan
Song Han
Papers citing "HAT: Hardware-Aware Transformers for Efficient Natural Language Processing"
50 / 67 papers shown
Title
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
45
0
0
11 Feb 2025
Merino: Entropy-driven Design for Generative Language Models on IoT Devices
Youpeng Zhao
Ming Lin
Huadong Tang
Qiang Wu
Jun Wang
83
0
0
28 Jan 2025
Efficiently Distilling LLMs for Edge Applications
Achintya Kundu
Fabian Lim
Aaron Chew
L. Wynter
Penny Chong
Rhui Dih Lee
42
6
0
01 Apr 2024
Multi-objective Differentiable Neural Architecture Search
R. Sukthanker
Arber Zela
B. Staffler
Samuel Dooley
Josif Grabocka
Frank Hutter
47
1
0
28 Feb 2024
TransAxx: Efficient Transformers with Approximate Computing
Dimitrios Danopoulos
Georgios Zervakis
Dimitrios Soudris
Jörg Henkel
ViT
42
2
0
12 Feb 2024
DistDNAS: Search Efficient Feature Interactions within 2 Hours
Tunhou Zhang
W. Wen
Igor Fedorov
Xi Liu
Buyun Zhang
...
Wen-Yen Chen
Yiping Han
Feng Yan
Hai Helen Li
Yiran Chen
21
1
0
01 Nov 2023
Evolutionary Neural Architecture Search for Transformer in Knowledge Tracing
Shangshang Yang
Xiaoshan Yu
Ye Tian
Xueming Yan
Haiping Ma
Xingyi Zhang
ViT
KELM
AI4Ed
21
2
0
02 Oct 2023
InstaTune: Instantaneous Neural Architecture Search During Fine-Tuning
S. N. Sridhar
Souvik Kundu
Sairam Sundaresan
Maciej Szankin
Anthony Sarah
25
3
0
29 Aug 2023
Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
Tobias Christian Nauen
Sebastián M. Palacio
Federico Raue
Andreas Dengel
42
3
0
18 Aug 2023
Open-TransMind: A New Baseline and Benchmark for 1st Foundation Model Challenge of Intelligent Transportation
Yifeng Shi
Feng Lv
Xinliang Wang
Chunlong Xia
Shaojie Li
Shu-Zhen Yang
Teng Xi
Gang Zhang
VLM
43
13
0
12 Apr 2023
SwiftTron: An Efficient Hardware Accelerator for Quantized Transformers
Alberto Marchisio
David Durà
Maurizio Capra
Maurizio Martina
Guido Masera
Muhammad Shafique
36
19
0
08 Apr 2023
System-status-aware Adaptive Network for Online Streaming Video Understanding
Lin Geng Foo
Jia Gong
Zhipeng Fan
Xiaozhong Liu
AI4TS
32
15
0
28 Mar 2023
EdgeTran: Co-designing Transformers for Efficient Inference on Mobile Edge Platforms
Shikhar Tuli
N. Jha
36
3
0
24 Mar 2023
DetOFA: Efficient Training of Once-for-All Networks for Object Detection Using Path Filter
Yuiko Sakuma
Masato Ishii
T. Narihira
36
2
0
23 Mar 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
42
47
0
21 Mar 2023
Gradient-Free Structured Pruning with Unlabeled Data
Azade Nova
H. Dai
Dale Schuurmans
SyDa
40
20
0
07 Mar 2023
AccelTran: A Sparsity-Aware Accelerator for Dynamic Inference with Transformers
Shikhar Tuli
N. Jha
33
31
0
28 Feb 2023
Full Stack Optimization of Transformer Inference: a Survey
Sehoon Kim
Coleman Hooper
Thanakul Wattanawong
Minwoo Kang
Ruohan Yan
...
Qijing Huang
Kurt Keutzer
Michael W. Mahoney
Y. Shao
A. Gholami
MQ
36
101
0
27 Feb 2023
The Framework Tax: Disparities Between Inference Efficiency in NLP Research and Deployment
Jared Fernandez
Jacob Kahn
Clara Na
Yonatan Bisk
Emma Strubell
FedML
33
10
0
13 Feb 2023
6-DoF Robotic Grasping with Transformer
Zhenjie Zhao
Han Yu
Hang Wu
Xuebo Zhang
ViT
36
0
0
29 Jan 2023
Convolution-enhanced Evolving Attention Networks
Yujing Wang
Yaming Yang
Zhuowan Li
Jiangang Bai
Mingliang Zhang
Xiangtai Li
Jiahao Yu
Ce Zhang
Gao Huang
Yu Tong
ViT
27
6
0
16 Dec 2022
Vision Transformer Computation and Resilience for Dynamic Inference
Kavya Sreedhar
Jason Clemons
Rangharajan Venkatesan
S. Keckler
M. Horowitz
26
2
0
06 Dec 2022
HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers
Peiyan Dong
Mengshu Sun
Alec Lu
Yanyue Xie
Li-Yu Daisy Liu
...
Xin Meng
Zechao Li
Xue Lin
Zhenman Fang
Yanzhi Wang
ViT
34
59
0
15 Nov 2022
Efficiently Scaling Transformer Inference
Reiner Pope
Sholto Douglas
Aakanksha Chowdhery
Jacob Devlin
James Bradbury
Anselm Levskaya
Jonathan Heek
Kefan Xiao
Shivani Agrawal
J. Dean
34
295
0
09 Nov 2022
QuaLA-MiniLM: a Quantized Length Adaptive MiniLM
Shira Guskin
Moshe Wasserblat
Chang Wang
Haihao Shen
MQ
16
2
0
31 Oct 2022
NASA: Neural Architecture Search and Acceleration for Hardware Inspired Hybrid Networks
Huihong Shi
Haoran You
Yang Katie Zhao
Zhongfeng Wang
Yingyan Lin
64
7
0
24 Oct 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
30
109
0
31 Aug 2022
Efficient Sparsely Activated Transformers
Salar Latifi
Saurav Muralidharan
M. Garland
MoE
21
2
0
31 Aug 2022
Neural Architecture Search on Efficient Transformers and Beyond
Zexiang Liu
Dong Li
Kaiyue Lu
Zhen Qin
Weixuan Sun
Jiacheng Xu
Yiran Zhong
35
19
0
28 Jul 2022
UFO: Unified Feature Optimization
Teng Xi
Yifan Sun
Deli Yu
Bi Li
Nan Peng
...
Haocheng Feng
Junyu Han
Jingtuo Liu
Errui Ding
Jingdong Wang
32
10
0
21 Jul 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
34
32
0
19 Jun 2022
EfficientFormer: Vision Transformers at MobileNet Speed
Yanyu Li
Geng Yuan
Yang Wen
Eric Hu
Georgios Evangelidis
Sergey Tulyakov
Yanzhi Wang
Jian Ren
ViT
23
347
0
02 Jun 2022
A Hardware-Aware Framework for Accelerating Neural Architecture Search Across Modalities
Daniel Cummings
Anthony Sarah
S. N. Sridhar
Maciej Szankin
J. P. Muñoz
Sairam Sundaresan
30
8
0
19 May 2022
PVNAS: 3D Neural Architecture Search with Point-Voxel Convolution
Zhijian Liu
Haotian Tang
Shengyu Zhao
Kevin Shao
Song Han
3DPC
21
40
0
25 Apr 2022
SplitNets: Designing Neural Architectures for Efficient Distributed Computing on Head-Mounted Systems
Xin Dong
B. D. Salvo
Meng Li
Chiao Liu
Zhongnan Qu
H. T. Kung
Ziyun Li
3DGS
26
20
0
10 Apr 2022
Training-free Transformer Architecture Search
Qinqin Zhou
Kekai Sheng
Xiawu Zheng
Ke Li
Xing Sun
Yonghong Tian
Jie Chen
Rongrong Ji
ViT
40
46
0
23 Mar 2022
Accelerating Neural Architecture Exploration Across Modalities Using Genetic Algorithms
Daniel Cummings
S. N. Sridhar
Anthony Sarah
Maciej Szankin
AI4CE
25
0
0
25 Feb 2022
EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation
Tao Ge
Si-Qing Chen
Furu Wei
MoE
32
21
0
16 Feb 2022
Fast Monte-Carlo Approximation of the Attention Mechanism
Hyunjun Kim
Jeonggil Ko
17
2
0
30 Jan 2022
Representing Long-Range Context for Graph Neural Networks with Global Attention
Zhanghao Wu
Paras Jain
Matthew A. Wright
Azalia Mirhoseini
Joseph E. Gonzalez
Ion Stoica
GNN
46
258
0
21 Jan 2022
Searching for TrioNet: Combining Convolution with Local and Global Self-Attention
Huaijin Pi
Huiyu Wang
Yingwei Li
Zizhang Li
Alan Yuille
ViT
27
3
0
15 Nov 2021
One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search
Bingqian Lu
Jianyi Yang
Weiwen Jiang
Yiyu Shi
Shaolei Ren
24
24
0
01 Nov 2021
Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization
Panjie Qi
E. Sha
Qingfeng Zhuge
Hongwu Peng
Shaoyi Huang
Zhenglun Kong
Yuhong Song
Bingbing Li
11
49
0
19 Oct 2021
Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
25
133
0
27 Sep 2021
The NiuTrans System for the WMT21 Efficiency Task
Chenglong Wang
Chi Hu
Yongyu Mu
Zhongxiang Yan
Siming Wu
...
Hang Cao
Bei Li
Ye Lin
Tong Xiao
Jingbo Zhu
29
2
0
16 Sep 2021
EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation
Chenhe Dong
Guangrun Wang
Hang Xu
Jiefeng Peng
Xiaozhe Ren
Xiaodan Liang
24
28
0
15 Sep 2021
Group Fisher Pruning for Practical Network Compression
Liyang Liu
Shilong Zhang
Zhanghui Kuang
Aojun Zhou
Jingliang Xue
Xinjiang Wang
Yimin Chen
Wenming Yang
Q. Liao
Wayne Zhang
25
146
0
02 Aug 2021
QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits
Hanrui Wang
Yongshan Ding
Jiaqi Gu
Zirui Li
Yujun Lin
David Z. Pan
Frederic T. Chong
Song Han
33
170
0
22 Jul 2021
AutoFormer: Searching Transformers for Visual Recognition
Minghao Chen
Houwen Peng
Jianlong Fu
Haibin Ling
ViT
36
259
0
01 Jul 2021
HELP: Hardware-Adaptive Efficient Latency Prediction for NAS via Meta-Learning
Hayeon Lee
Sewoong Lee
Song Chong
Sung Ju Hwang
21
26
0
16 Jun 2021