Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2504.16269
Cited By
v1
v2 (latest)
COBRA: Algorithm-Architecture Co-optimized Binary Transformer Accelerator for Edge Inference
22 April 2025
Ye Qiao
Zhiheng Cheng
Yian Wang
Yifan Zhang
Yunzhe Deng
Sitao Huang
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"COBRA: Algorithm-Architecture Co-optimized Binary Transformer Accelerator for Edge Inference"
19 / 19 papers shown
Title
Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment
Yuhao Ji
Chao Fang
Shaobo Ma
Haikuo Shao
Zhongfeng Wang
MQ
72
1
0
16 Jul 2024
BETA: Binarized Energy-Efficient Transformer Accelerator at the Edge
Yuhao Ji
Chao Fang
Zhongfeng Wang
57
3
0
22 Jan 2024
Support for Stock Trend Prediction Using Transformers and Sentiment Analysis
Harsimrat Kaeley
Ye Qiao
N. Bagherzadeh
AIFin
AI4TS
35
11
0
18 May 2023
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
Ishika Singh
Valts Blukis
Arsalan Mousavian
Ankit Goyal
Danfei Xu
Jonathan Tremblay
Dieter Fox
Jesse Thomason
Animesh Garg
LM&Ro
LLMAG
177
657
0
22 Sep 2022
A Two-Stage Efficient 3-D CNN Framework for EEG Based Emotion Recognition
Ye Qiao
Mohammed Alnemari
N. Bagherzadeh
MQ
29
7
0
26 Jul 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
258
2,285
0
27 May 2022
BiT: Robustly Binarized Multi-distilled Transformer
Zechun Liu
Barlas Oğuz
Aasish Pappu
Lin Xiao
Scott Yih
Meng Li
Raghuraman Krishnamoorthi
Yashar Mehdad
MQ
107
55
0
25 May 2022
A Fast Post-Training Pruning Framework for Transformers
Woosuk Kwon
Sehoon Kim
Michael W. Mahoney
Joseph Hassoun
Kurt Keutzer
A. Gholami
91
154
0
29 Mar 2022
BiBERT: Accurate Fully Binarized BERT
Haotong Qin
Yifu Ding
Mingyuan Zhang
Qing Yan
Aishan Liu
Qingqing Dang
Ziwei Liu
Xianglong Liu
MQ
60
95
0
12 Mar 2022
VAQF: Fully Automatic Software-Hardware Co-Design Framework for Low-Bit Vision Transformer
Mengshu Sun
Haoyu Ma
Guoliang Kang
Yi Ding
Tianlong Chen
Xiaolong Ma
Zhangyang Wang
Yanzhi Wang
ViT
88
46
0
17 Jan 2022
Hardware Acceleration of Fully Quantized BERT for Efficient Natural Language Processing
Zejian Liu
Gang Li
Jian Cheng
MQ
46
61
0
04 Mar 2021
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
219
227
0
31 Dec 2020
FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations
Yichi Zhang
Junhao Pan
Xinheng Liu
Hongzheng Chen
Deming Chen
Zhiru Zhang
MQ
97
93
0
22 Dec 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
684
41,563
0
22 Oct 2020
TernaryBERT: Distillation-aware Ultra-low Bit BERT
Wei Zhang
Lu Hou
Yichun Yin
Lifeng Shang
Xiao Chen
Xin Jiang
Qun Liu
MQ
93
211
0
27 Sep 2020
Q8BERT: Quantized 8Bit BERT
Ofir Zafrir
Guy Boudoukh
Peter Izsak
Moshe Wasserblat
MQ
93
506
0
14 Oct 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,324
0
11 Oct 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.1K
7,201
0
20 Apr 2018
XNORBIN: A 95 TOp/s/W Hardware Accelerator for Binary Convolutional Neural Networks
A. Bahou
G. Karunaratne
Renzo Andri
Lukas Cavigelli
Luca Benini
MQ
44
45
0
05 Mar 2018
1