2101.11605
Bottleneck Transformers for Visual Recognition
27 January 2021
A. Srinivas
Nayeon Lee
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
Papers citing "Bottleneck Transformers for Visual Recognition"
50 / 339 papers shown
Title
Deep Active Learning in the Presence of Label Noise: A Survey
Moseli Motsóehli
Kyungim Baek
NoLa
VLM
81
5
0
22 Feb 2023
Device Tuning for Multi-Task Large Model
Penghao Jiang
Xuanchen Hou
Y. Zhou
40
0
0
21 Feb 2023
MedViT: A Robust Vision Transformer for Generalized Medical Image Classification
Omid Nejati Manzari
Hamid Ahmadabadi
Hossein Kashiani
S. B. Shokouhi
Ahmad Ayatollahi
ViT
MedIm
125
204
0
19 Feb 2023
Hyneter: Hybrid Network Transformer for Object Detection
Dong Chen
Duoqian Miao
Xuepeng Zhao
ViT
78
4
0
18 Feb 2023
Efficiency 360: Efficient Vision Transformers
Badri N. Patro
Vijay Srinivas Agneeswaran
163
6
0
16 Feb 2023
Invariant Slot Attention: Object Discovery with Slot-Centric Reference Frames
Ondrej Biza
Sjoerd van Steenkiste
Mehdi S. M. Sajjadi
Gamaleldin F. Elsayed
Aravindh Mahendran
Thomas Kipf
OCL
133
37
0
09 Feb 2023
DilateFormer: Multi-Scale Dilated Transformer for Visual Recognition
Jiayu Jiao
Yuyao Tang
Kun-Li Channing Lin
Yipeng Gao
Jinhua Ma
Yaowei Wang
Wei-Shi Zheng
MedIm
ViT
98
156
0
03 Feb 2023
Cluster-CAM: Cluster-Weighted Visual Interpretation of CNNs' Decision in Image Classification
Zhenpeng Feng
H. Ji
M. Daković
Xiyang Cui
Mingzhe Zhu
Ljubisa Stankovic
72
8
0
03 Feb 2023
Semantic Segmentation Enhanced Transformer Model for Human Attention Prediction
Shuo Zhang
ViT
68
0
0
26 Jan 2023
Part-guided Relational Transformers for Fine-grained Visual Recognition
Yifan Zhao
Jia Li
Xiaowu Chen
Yonghong Tian
ViT
90
37
0
28 Dec 2022
Multi-Scale Feature Fusion Transformer Network for End-to-End Single Channel Speech Separation
Yinhao Xu
Jian Zhou
L. Tao
H. Kwan
106
0
0
14 Dec 2022
CamoFormer: Masked Separable Attention for Camouflaged Object Detection
Bo Yin
Xuying Zhang
Qibin Hou
Bo Sun
Deng-Ping Fan
Luc Van Gool
104
59
0
10 Dec 2022
Cross-Domain Synthetic-to-Real In-the-Wild Depth and Normal Estimation for 3D Scene Understanding
Jay Bhanushali
Manivannan Muniyandi
Praneeth Chakravarthula
3DPC
ViT
73
2
0
09 Dec 2022
Dunhuang murals contour generation network based on convolution and self-attention fusion
Bao-Yu Liu
Fengjie He
Shiqiang Du
Kaiwu Zhang
Jianhua Wang
3DPC
93
6
0
02 Dec 2022
Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images
Meng Wang
Kai-An Yu
Chun-Mei Feng
K. Zou
Yanyu Xu
Qingquan Meng
Rick Siow Mong Goh
Yong Liu
Huazhu Fu
MedIm
93
3
0
01 Dec 2022
Degenerate Swin to Win: Plain Window-based Transformer without Sophisticated Operations
Tan Yu
Ping Li
ViT
81
5
0
25 Nov 2022
Learnable Spectral Wavelets on Dynamic Graphs to Capture Global Interactions
Anson Bastos
Abhishek Nadgeri
Kuldeep Singh
Toyotaro Suzumura
Manish Singh
80
8
0
22 Nov 2022
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Qibin Hou
Cheng Lu
Ming-Ming Cheng
Jiashi Feng
ViT
126
141
0
22 Nov 2022
Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training
Zhenglun Kong
Haoyu Ma
Geng Yuan
Mengshu Sun
Yanyue Xie
...
Tianlong Chen
Xiaolong Ma
Xiaohui Xie
Zhangyang Wang
Yanzhi Wang
ViT
114
24
0
19 Nov 2022
Vision Transformers in Medical Imaging: A Review
Emerald U. Henry
Onyeka Emebob
C. Omonhinmin
ViT
MedIm
93
36
0
18 Nov 2022
Parameter-Efficient Transformer with Hybrid Axial-Attention for Medical Image Segmentation
Yiyue Hu
Lei Zhang
Nan Mu
Leijun Liu
ViT
MedIm
44
1
0
17 Nov 2022
Fcaformer: Forward Cross Attention in Hybrid Vision Transformer
Haokui Zhang
Wenze Hu
Xiaoyu Wang
ViT
63
8
0
14 Nov 2022
ParCNetV2: Oversized Kernel with Enhanced Attention
Ruihan Xu
Haokui Zhang
Wenze Hu
Shiliang Zhang
Xiaoyu Wang
ViT
85
6
0
14 Nov 2022
Studying inductive biases in image classification task
N. Arizumi
64
1
0
31 Oct 2022
An Effective Deep Network for Head Pose Estimation without Keypoints
Chien Thai
Viet Tran
Minh Bui
Huong Ninh
Hai Yen Tran
3DH
CVBM
31
3
0
25 Oct 2022
LCPFormer: Towards Effective 3D Point Cloud Analysis via Local Context Propagation in Transformers
Zhuo Huang
Zhiyou Zhao
Banghuai Li
Jungong Han
3DPC
ViT
102
58
0
23 Oct 2022
Similarity of Neural Architectures using Adversarial Attack Transferability
Ian Ryu
Dongyoon Han
Byeongho Heo
Song Park
Sanghyuk Chun
Jong-Seok Lee
AAML
136
2
0
20 Oct 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Brian Bartoldson
B. Kailkhura
Davis W. Blalock
113
51
0
13 Oct 2022
FontTransformer: Few-shot High-resolution Chinese Glyph Image Synthesis via Stacked Transformers
Yitian Liu
Zheng Lian
107
14
0
12 Oct 2022
SaiT: Sparse Vision Transformers through Adaptive Token Pruning
Ling Li
D. Thorsley
Joseph Hassoun
ViT
41
19
0
11 Oct 2022
Block Format Error Bounds and Optimal Block Size Selection
I. Soloveychik
I. Lyubomirsky
Xin Eric Wang
S. Bhoja
MQ
63
4
0
11 Oct 2022
Fast-ParC: Capturing Position Aware Global Feature for ConvNets and ViTs
Taojiannan Yang
Haokui Zhang
Wenze Hu
Chen Chen
Xiaoyu Wang
ViT
78
0
0
08 Oct 2022
Time-Space Transformers for Video Panoptic Segmentation
Andra Petrovai
S. Nedevschi
ViT
56
3
0
07 Oct 2022
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Chenglin Yang
Siyuan Qiao
Qihang Yu
Xiaoding Yuan
Yukun Zhu
Alan Yuille
Hartwig Adam
Liang-Chieh Chen
ViT
MoE
120
66
0
04 Oct 2022
Exploring the Relationship between Architecture and Adversarially Robust Generalization
Aishan Liu
Shiyu Tang
Siyuan Liang
Ruihao Gong
Boxi Wu
Xianglong Liu
Dacheng Tao
AAML
93
19
0
28 Sep 2022
Dynamic Graph Message Passing Networks for Visual Recognition
Li Zhang
Mohan Chen
Anurag Arnab
Xiangyang Xue
Philip Torr
GNN
66
1
0
20 Sep 2022
Swin-transformer-yolov5 For Real-time Wine Grape Bunch Detection
Shenglian Lu
Xiaoyu Liu
Zixuan He
Wenbo Liu
Xin Zhang
Manoj Karkee
84
40
0
30 Aug 2022
MRL: Learning to Mix with Attention and Convolutions
Shlok Mohta
Hisahiro Suganuma
Yoshiki Tanaka
106
2
0
30 Aug 2022
gSwin: Gated MLP Vision Model with Hierarchical Structure of Shifted Window
Mocho Go
Hideyuki Tachibana
ViT
66
9
0
24 Aug 2022
FocusFormer: Focusing on What We Need via Architecture Sampler
Jing Liu
Jianfei Cai
Bohan Zhuang
65
8
0
23 Aug 2022
DPTNet: A Dual-Path Transformer Architecture for Scene Text Detection
Jingyu Lin
Jie Jiang
Y. Yan
Chunchao Guo
Hongfa Wang
Wei Liu
Hanzi Wang
ViT
62
3
0
21 Aug 2022
Improved Image Classification with Token Fusion
Keong-Hun Choi
Jin-Woo Kim
Yaolong Wang
J. Ha
ViT
46
0
0
19 Aug 2022
A Vision Transformer-Based Approach to Bearing Fault Classification via Vibration Signals
Abid Hasan Zim
Aeyan Ashraf
Aquib Iqbal
Asad U. Malik
Minoru Kuribayashi
35
11
0
15 Aug 2022
Recent Progress in Transformer-based Medical Image Analysis
Zhao-cheng Liu
Qiujie Lv
Ziduo Yang
Yifan Li
Chau Hung Lee
Leizhao Shen
MedIm
81
66
0
13 Aug 2022
Memorizing Complementation Network for Few-Shot Class-Incremental Learning
Zhong Ji
Zhi Hou
Xiyao Liu
Yanwei Pang
Xuelong Li
CLL
80
51
0
11 Aug 2022
Label-Efficient Domain Generalization via Collaborative Exploration and Generalization
Junkun Yuan
Xu Ma
Defang Chen
Kun Kuang
Leilei Gan
Lanfen Lin
82
25
0
07 Aug 2022
Understanding Adversarial Robustness of Vision Transformers via Cauchy Problem
Zheng Wang
Wenjie Ruan
ViT
82
8
0
01 Aug 2022
Convolutional Embedding Makes Hierarchical Vision Transformer Stronger
Cong Wang
Hongmin Xu
Xiong Zhang
Li Wang
Zhitong Zheng
Haifeng Liu
ViT
59
23
0
27 Jul 2022
Jigsaw-ViT: Learning Jigsaw Puzzles in Vision Transformer
Yingyi Chen
Xiaoke Shen
Yahui Liu
Qinghua Tao
Johan A. K. Suykens
AAML
ViT
85
24
0
25 Jul 2022
SSBNet: Improving Visual Recognition Efficiency by Adaptive Sampling
Ho Man Kwan
Shenghui Song
51
1
0
23 Jul 2022