Bottleneck Transformers for Visual Recognition

27 January 2021
A. Srinivas
Tsung-Yi Lin
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
    SLR

Papers citing "Bottleneck Transformers for Visual Recognition"

50 / 341 papers shown
LOOPE: Learnable Optimal Patch Order in Positional Embeddings for Vision Transformers
M. Chowdhury
Md Rifat Ur Rahman
Akil Ahmad Taki
25
0
0
19 Apr 2025
PolygoNet: Leveraging Simplified Polygonal Representation for Effective Image Classification
Salim Khazem
Jérémy Fix
Cédric Pradalier
36
0
0
01 Apr 2025
vGamba: Attentive State Space Bottleneck for efficient Long-range Dependencies in Visual Recognition
Yunusa Haruna
A. Lawan
Mamba
52
0
0
27 Mar 2025
Vim-F: Visual State Space Model Benefiting from Learning in the Frequency Domain
Juntao Zhang
Kun Bian
Peng Cheng
You Zhou
Jianning Liu
Wenbo An
Jun Zhou
Kun Shao
Mamba
47
2
0
08 Jan 2025
VMamba: Visual State Space Model
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
144
611
0
31 Dec 2024
A4-Unet: Deformable Multi-Scale Attention Network for Brain Tumor Segmentation
Ruoxin Wang
Tianyi Tang
Haiming Du
Yuxuan Cheng
Yu Wang
Lingjie Yang
Xiaohui Duan
Yunfang Yu
Yu Zhou
Donglong Chen
54
0
0
08 Dec 2024
3D Spine Shape Estimation from Single 2D DXA
Emmanuelle Bourigault
A. Jamaludin
Andrew Zisserman
61
0
0
02 Dec 2024
Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training
Man Yao
Xuerui Qiu
Tianxiang Hu
J. Hu
Yuhong Chou
Keyu Tian
Jianxing Liao
Luziwei Leng
Bo Xu
Guoqi Li
74
4
0
25 Nov 2024
SAG-ViT: A Scale-Aware, High-Fidelity Patching Approach with Graph Attention for Vision Transformers
Shravan Venkatraman
Jaskaran Singh Walia
J. Raheja
ViT
31
0
0
14 Nov 2024
BA-Net: Bridge Attention in Deep Neural Networks
Ronghui Zhang
Runzong Zou
Yue Zhao
Zirui Zhang
Junzhou Chen
Yue Cao
Chuan Hu
Houbing Song
31
0
0
10 Oct 2024
QuadMamba: Learning Quadtree-based Selective Scan for Visual State Space Model
Fei Xie
Weijia Zhang
Zhongdao Wang
Chao Ma
Mamba
24
3
0
09 Oct 2024
TBConvL-Net: A Hybrid Deep Learning Architecture for Robust Medical Image Segmentation
Shahzaib Iqbal
Tariq M. Khan
Syed S. Naqvi
Asim Naveed
Erik H. W. Meijering
MedIm
48
6
0
05 Sep 2024
Leveraging GNSS and Onboard Visual Data from Consumer Vehicles for Robust Road Network Estimation
Balázs Opra
Betty Le Dem
Jeffrey M. Walls
D. Lukarski
C. Stachniss
19
0
0
03 Aug 2024
UNQA: Unified No-Reference Quality Assessment for Audio, Image, Video, and Audio-Visual Content
Y. Cao
Xiongkuo Min
Yixuan Gao
Wei Sun
Weisi Lin
Guangtao Zhai
44
2
0
29 Jul 2024
Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets
Tianxiao Zhang
Wenju Xu
Bo Luo
Guanghui Wang
ViT
MDE
36
7
0
28 Jul 2024
Real Time American Sign Language Detection Using Yolo-v9
Amna Imran
Meghana Shashishekhara Hulikal
Hamza A. A. Gardi
ObjD
31
2
0
25 Jul 2024
Double-Shot 3D Shape Measurement with a Dual-Branch Network
Mingyang Lei
Jingfan Fan
Long Shao
Hong Song
Deqiang Xiao
Danni Ai
Tianyu Fu
Ying Gu
Jian Yang
3DPC
3DV
23
0
0
19 Jul 2024
GroupMamba: Efficient Group-Based Visual State Space Model
Abdelrahman M. Shaker
Syed Talal Wasim
Salman Khan
Juergen Gall
Fahad Shahbaz Khan
Mamba
51
0
0
18 Jul 2024
iiANET: Inception Inspired Attention Hybrid Network for efficient Long-Range Dependency
Haruna Yunusa
Qin Shiyin
Abdulrahman Hamman Adama Chukkol
Isah Bello
A. Lawan
39
4
0
10 Jul 2024
LOGCAN++: Adaptive Local-global class-aware network for semantic segmentation of remote sensing imagery
Xiaowen Ma
Rongrong Lian
Zhenkai Wu
Hongbo Guo
Mengting Ma
Sensen Wu
Zhenhong Du
Siyang Song
Wei Zhang
39
4
0
24 Jun 2024
Scale-Translation Equivariant Network for Oceanic Internal Solitary Wave Localization
Zhang Wan
Shuo Wang
Xudong Zhang
36
0
0
18 Jun 2024
MTS-Net: Dual-Enhanced Positional Multi-Head Self-Attention for 3D CT Diagnosis of May-Thurner Syndrome
Yixin Huang
Yiqi Jin
Ke Tao
Kaijian Xia
Jianfeng Gu
Lei Yu
Lan Du
Cunjian Chen
30
0
0
07 Jun 2024
Building Vision Models upon Heat Conduction
Zhaozhi Wang
Yue Liu
Yunfan Liu
Hongtian Yu
Yaowei Wang
Qixiang Ye
ViT
VLM
52
0
0
26 May 2024
Multi-View Attentive Contextualization for Multi-View 3D Object Detection
Xianpeng Liu
Ce Zheng
Ming Qian
Nan Xue
C. L. P. Chen
Zhebin Zhang
Chen Li
Tianfu Wu
31
2
0
20 May 2024
LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation
Wentao Jiang
Jing Zhang
Di Wang
Qiming Zhang
Zengmao Wang
Bo Du
29
5
0
16 May 2024
Cross-Temporal Spectrogram Autoencoder (CTSAE): Unsupervised Dimensionality Reduction for Clustering Gravitational Wave Glitches
Yi Li
Yunan Wu
Aggelos K. Katsaggelos
19
1
0
23 Apr 2024
Data-independent Module-aware Pruning for Hierarchical Vision Transformers
Yang He
Joey Tianyi Zhou
ViT
42
3
0
21 Apr 2024
Lightweight Deep Learning for Resource-Constrained Environments: A Survey
Hou-I Liu
Marco Galindo
Hongxia Xie
Lai-Kuan Wong
Hong-Han Shuai
Yung-Hui Li
Wen-Huang Cheng
50
48
0
08 Apr 2024
ViTamin: Designing Scalable Vision Models in the Vision-Language Era
Jieneng Chen
Qihang Yu
Xiaohui Shen
Alan L. Yuille
Liang-Chieh Chen
3DV
VLM
28
24
0
02 Apr 2024
Accurate Block Quantization in LLMs with Outliers
Nikita Trukhanov
I. Soloveychik
MQ
24
3
0
29 Mar 2024
Heracles: A Hybrid SSM-Transformer Model for High-Resolution Image and Time-Series Analysis
Badri N. Patro
Suhas Ranganath
Vinay P. Namboodiri
Vijay Srinivas Agneeswaran
43
2
0
26 Mar 2024
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Chenhongyi Yang
Zehui Chen
Miguel Espinosa
Linus Ericsson
Zhenyu Wang
Jiaming Liu
Elliot J. Crowley
Mamba
26
86
0
26 Mar 2024
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
Ting Yao
Yehao Li
Yingwei Pan
Tao Mei
ViT
23
15
0
18 Mar 2024
LSKNet: A Foundation Lightweight Backbone for Remote Sensing
Yuxuan Li
Xiang Li
Yimian Dai
Qibin Hou
Li Liu
Yongxiang Liu
Ming-Ming Cheng
Jian Yang
34
31
0
18 Mar 2024
depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers
Kaichao You
Runsheng Bai
Meng Cao
Jianmin Wang
Ion Stoica
Mingsheng Long
VLM
33
0
0
14 Mar 2024
Smartphone region-wise image indoor localization using deep learning for indoor tourist attraction
G. Higa
Rodrigo Stuqui Monzani
Jorge Fernando da Silva Cecatto
Maria Fernanda Balestieri Mariano de Souza
V. A. Weber
H. Pistori
E. Matsubara
HAI
26
2
0
12 Mar 2024
SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion Classification Using 3D Multi-Phase Imaging
Meng Lou
Hanning Ying
Xiaoqing Liu
Hong-Yu Zhou
Yuqing Zhang
Yizhou Yu
MedIm
39
8
0
27 Feb 2024
Exploring the Synergies of Hybrid CNNs and ViTs Architectures for Computer Vision: A survey
Haruna Yunusa
Shiyin Qin
Abdulrahman Hamman Adama Chukkol
Abdulganiyu Abdu Yusuf
Isah Bello
A. Lawan
ViT
30
13
0
05 Feb 2024
A Cost-Efficient FPGA Implementation of Tiny Transformer Model using Neural ODE
Ikumi Okubo
Keisuke Sugiura
Hiroki Matsutani
28
2
0
05 Jan 2024
BA-SAM: Scalable Bias-Mode Attention Mask for Segment Anything Model
Yiran Song
Qianyu Zhou
Xiangtai Li
Deng-Ping Fan
Xuequan Lu
Lizhuang Ma
VLM
30
14
0
04 Jan 2024
A Two-stream Hybrid CNN-Transformer Network for Skeleton-based Human Interaction Recognition
Ruoqi Yin
Jianqin Yin
ViT
32
4
0
31 Dec 2023
MetaSegNet: Metadata-collaborative Vision-Language Representation Learning for Semantic Segmentation of Remote Sensing Images
Libo Wang
Sijun Dong
Ying Chen
Xiaoliang Meng
Shenghui Fang
Ayman Habib
Songlin Fei
18
4
0
20 Dec 2023
Factorization Vision Transformer: Modeling Long Range Dependency with Local Window Cost
Haolin Qin
Daquan Zhou
Tingfa Xu
Ziyang Bian
Jianan Li
27
9
0
14 Dec 2023
Transferring Modality-Aware Pedestrian Attentive Learning for Visible-Infrared Person Re-identification
Yuwei Guo
Wenhao Zhang
Licheng Jiao
Shuang Wang
Shuo Wang
Fang Liu
43
0
0
12 Dec 2023
DYAD: A Descriptive Yet Abjuring Density efficient approximation to linear neural network layers
S. Chandy
Varun Gangal
Yi Yang
Gabriel Maggiotti
30
0
0
11 Dec 2023
Graph Information Bottleneck for Remote Sensing Segmentation
Yuntao Shou
Wei Ai
Tao Meng
SSL
22
25
0
05 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
28
0
0
01 Dec 2023
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
Xiaohan Ding
Yiyuan Zhang
Yixiao Ge
Sijie Zhao
Lin Song
Xiangyu Yue
Ying Shan
VLM
AI4TS
SSL
21
100
0
27 Nov 2023
SkelVIT: Consensus of Vision Transformers for a Lightweight Skeleton-Based Action Recognition System
Özge Öztimur Karadag
ViT
MedIm
18
0
0
14 Nov 2023
Dual-channel Prototype Network for few-shot Classification of Pathological Images
Hao Quan
Xinjia Li
Dayu Hu
Tianhang Nan
Xiaoyu Cui
19
0
0
14 Nov 2023