ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

COBRA: Algorithm-Architecture Co-optimized Binary Transformer Accelerator for Edge Inference

22 April 2025
Ye Qiao, Zhiheng Cheng, Yian Wang, Yifan Zhang, Yunzhe Deng, Sitao Huang
ArXiv (abs) · PDF · HTML

Papers citing "COBRA: Algorithm-Architecture Co-optimized Binary Transformer Accelerator for Edge Inference"

19 / 19 papers shown

  • Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment (16 Jul 2024)
    Yuhao Ji, Chao Fang, Shaobo Ma, Haikuo Shao, Zhongfeng Wang · MQ · 72 · 1 · 0
  • BETA: Binarized Energy-Efficient Transformer Accelerator at the Edge (22 Jan 2024)
    Yuhao Ji, Chao Fang, Zhongfeng Wang · 57 · 3 · 0
  • Support for Stock Trend Prediction Using Transformers and Sentiment Analysis (18 May 2023)
    Harsimrat Kaeley, Ye Qiao, N. Bagherzadeh · AIFin, AI4TS · 35 · 11 · 0
  • ProgPrompt: Generating Situated Robot Task Plans using Large Language Models (22 Sep 2022)
    Ishika Singh, Valts Blukis, Arsalan Mousavian, Ankit Goyal, Danfei Xu, Jonathan Tremblay, Dieter Fox, Jesse Thomason, Animesh Garg · LM&Ro, LLMAG · 177 · 657 · 0
  • A Two-Stage Efficient 3-D CNN Framework for EEG Based Emotion Recognition (26 Jul 2022)
    Ye Qiao, Mohammed Alnemari, N. Bagherzadeh · MQ · 29 · 7 · 0
  • FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness (27 May 2022)
    Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré · VLM · 258 · 2,285 · 0
  • BiT: Robustly Binarized Multi-distilled Transformer (25 May 2022)
    Zechun Liu, Barlas Oğuz, Aasish Pappu, Lin Xiao, Scott Yih, Meng Li, Raghuraman Krishnamoorthi, Yashar Mehdad · MQ · 107 · 55 · 0
  • A Fast Post-Training Pruning Framework for Transformers (29 Mar 2022)
    Woosuk Kwon, Sehoon Kim, Michael W. Mahoney, Joseph Hassoun, Kurt Keutzer, A. Gholami · 91 · 154 · 0
  • BiBERT: Accurate Fully Binarized BERT (12 Mar 2022)
    Haotong Qin, Yifu Ding, Mingyuan Zhang, Qing Yan, Aishan Liu, Qingqing Dang, Ziwei Liu, Xianglong Liu · MQ · 60 · 95 · 0
  • VAQF: Fully Automatic Software-Hardware Co-Design Framework for Low-Bit Vision Transformer (17 Jan 2022)
    Mengshu Sun, Haoyu Ma, Guoliang Kang, Yi Ding, Tianlong Chen, Xiaolong Ma, Zhangyang Wang, Yanzhi Wang · ViT · 88 · 46 · 0
  • Hardware Acceleration of Fully Quantized BERT for Efficient Natural Language Processing (04 Mar 2021)
    Zejian Liu, Gang Li, Jian Cheng · MQ · 46 · 61 · 0
  • BinaryBERT: Pushing the Limit of BERT Quantization (31 Dec 2020)
    Haoli Bai, Wei Zhang, Lu Hou, Lifeng Shang, Jing Jin, Xin Jiang, Qun Liu, Michael Lyu, Irwin King · MQ · 219 · 227 · 0
  • FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations (22 Dec 2020)
    Yichi Zhang, Junhao Pan, Xinheng Liu, Hongzheng Chen, Deming Chen, Zhiru Zhang · MQ · 97 · 93 · 0
  • An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (22 Oct 2020)
    Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, ..., Matthias Minderer, G. Heigold, Sylvain Gelly, Jakob Uszkoreit, N. Houlsby · ViT · 684 · 41,563 · 0
  • TernaryBERT: Distillation-aware Ultra-low Bit BERT (27 Sep 2020)
    Wei Zhang, Lu Hou, Yichun Yin, Lifeng Shang, Xiao Chen, Xin Jiang, Qun Liu · MQ · 93 · 211 · 0
  • Q8BERT: Quantized 8Bit BERT (14 Oct 2019)
    Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat · MQ · 93 · 506 · 0
  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (11 Oct 2018)
    Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova · VLM, SSL, SSeg · 1.8K · 95,324 · 0
  • GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding (20 Apr 2018)
    Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman · ELM · 1.1K · 7,201 · 0
  • XNORBIN: A 95 TOp/s/W Hardware Accelerator for Binary Convolutional Neural Networks (05 Mar 2018)
    A. Bahou, G. Karunaratne, Renzo Andri, Lukas Cavigelli, Luca Benini · MQ · 44 · 45 · 0