ResearchTrend.AI

© 2025 ResearchTrend.AI, All rights reserved.

Bottleneck Transformers for Visual Recognition
arXiv:2101.11605 · 27 January 2021 · [SLR]
A. Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani

Papers citing "Bottleneck Transformers for Visual Recognition"

50 of 341 citing papers shown.
  1. InvPT: Inverted Pyramid Multi-task Transformer for Dense Scene Understanding
     Hanrong Ye, Dan Xu · 15 Mar 2022 · [ViT]
  2. Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs
     Xiaohan Ding, X. Zhang, Yi Zhou, Jungong Han, Guiguang Ding, Jian-jun Sun · 13 Mar 2022 · [VLM]
  3. StyleBabel: Artistic Style Tagging and Captioning
     Dan Ruta, Andrew Gilbert, Pranav Aggarwal, Naveen Marri, Ajinkya Kale, ..., Hailin Jin, Baldo Faieta, Alex Filipkowski, Zhe-nan Lin, John Collomosse · 10 Mar 2022
  4. ParC-Net: Position Aware Circular Convolution with Merits from ConvNets and Transformer
     Haokui Zhang, Wenze Hu, Xiaoyu Wang · 08 Mar 2022 · [ViT]
  5. GlideNet: Global, Local and Intrinsic based Dense Embedding NETwork for Multi-category Attributes Prediction
     Kareem M. Metwaly, Aerin Kim, E. Branson, V. Monga · 07 Mar 2022
  6. ViT-P: Rethinking Data-efficient Vision Transformers from Locality
     B. Chen, Ran A. Wang, Di Ming, Xin Feng · 04 Mar 2022 · [ViT]
  7. DenseUNets with feedback non-local attention for the segmentation of specular microscopy images of the corneal endothelium with guttae
     Juan P. Vigueras-Guillén, J. Rooij, B. V. Dooren, H. Lemij, E. Islamaj, L. Vliet, Koen A. Vermeer · 03 Mar 2022 · [MedIm]
  8. Multi-Tailed Vision Transformer for Efficient Inference
     Yunke Wang, Bo Du, Wenyuan Wang, Chang Xu · 03 Mar 2022 · [ViT]
  9. Aggregated Pyramid Vision Transformer: Split-transform-merge Strategy for Image Recognition without Convolutions
     Ruikang Ju, Ting-Yu Lin, Jen-Shiun Chiang, Jia-Hao Jian, Yu-Shian Lin, Liu-Rui-Yi Huang · 02 Mar 2022 · [ViT]
  10. Learning to Merge Tokens in Vision Transformers
      Cédric Renggli, André Susano Pinto, N. Houlsby, Basil Mustafa, J. Puigcerver, C. Riquelme · 24 Feb 2022 · [MoMe]
  11. Visual Attention Network
      Meng-Hao Guo, Chengrou Lu, Zheng-Ning Liu, Ming-Ming Cheng, Shiyong Hu · 20 Feb 2022 · [ViT, VLM]
  12. How Do Vision Transformers Work?
      Namuk Park, Songkuk Kim · 14 Feb 2022 · [ViT]
  13. BViT: Broad Attention based Vision Transformer
      Nannan Li, Yaran Chen, Weifan Li, Zixiang Ding, Dong Zhao · 13 Feb 2022 · [ViT]
  14. DynaMixer: A Vision MLP Architecture with Dynamic Mixing
      Ziyu Wang, Wenhao Jiang, Yiming Zhu, Li Yuan, Yibing Song, Wei Liu · 28 Jan 2022
  15. UniFormer: Unifying Convolution and Self-attention for Visual Recognition
      Kunchang Li, Yali Wang, Junhao Zhang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao · 24 Jan 2022 · [ViT]
  16. Representing Long-Range Context for Graph Neural Networks with Global Attention
      Zhanghao Wu, Paras Jain, Matthew A. Wright, Azalia Mirhoseini, Joseph E. Gonzalez, Ion Stoica · 21 Jan 2022 · [GNN]
  17. A Comprehensive Study of Vision Transformers on Dense Prediction Tasks
      Kishaan Jeeveswaran, Senthilkumar S. Kathiresan, Arnav Varma, Omar Magdy, Bahram Zonooz, Elahe Arani · 21 Jan 2022 · [ViT]
  18. UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning
      Kunchang Li, Yali Wang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao · 12 Jan 2022 · [ViT]
  19. A ConvNet for the 2020s
      Zhuang Liu, Hanzi Mao, Chaozheng Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie · 10 Jan 2022 · [ViT]
  20. Multi-Level Attention for Unsupervised Person Re-Identification
      Y. Zheng · 10 Jan 2022 · [3DPC]
  21. Learning Target-aware Representation for Visual Tracking via Informative Interactions
      Mingzhe Guo, Zhipeng Zhang, Heng Fan, L. Jing, Yilin Lyu, Bing Li, Weiming Hu · 07 Jan 2022
  22. Vision Transformer with Deformable Attention
      Zhuofan Xia, Xuran Pan, S. Song, Li Erran Li, Gao Huang · 03 Jan 2022 · [ViT]
  23. Accurate and Real-time 3D Pedestrian Detection Using an Efficient Attentive Pillar Network
      Duy-Tho Le, Hengcan Shi, H. Rezatofighi, Jianfei Cai · 31 Dec 2021 · [3DH, 3DPC]
  24. Learning Spatially-Adaptive Squeeze-Excitation Networks for Image Synthesis and Image Recognition
      Jianghao Shen, Tianfu Wu · 29 Dec 2021 · [ViT]
  25. SPViT: Enabling Faster Vision Transformers via Soft Token Pruning
      Zhenglun Kong, Peiyan Dong, Xiaolong Ma, Xin Meng, Mengshu Sun, ..., Geng Yuan, Bin Ren, Minghai Qin, H. Tang, Yanzhi Wang · 27 Dec 2021 · [ViT]
  26. MSHT: Multi-stage Hybrid Transformer for the ROSE Image Analysis of Pancreatic Cancer
      Tianyi Zhang, Yunlu Feng, Yu Zhao, Guangda Fan, Aiming Yang, ..., Fan Song, Chenbin Ma, Yangyang Sun, Youdan Feng, Guanglei Zhang · 27 Dec 2021 · [ViT, MedIm]
  27. Vision Transformer for Small-Size Datasets
      Seung Hoon Lee, Seunghyun Lee, B. Song · 27 Dec 2021 · [ViT]
  28. Lite Vision Transformer with Enhanced Self-Attention
      Chenglin Yang, Yilin Wang, Jianming Zhang, He Zhang, Zijun Wei, Zhe-nan Lin, Alan Yuille · 20 Dec 2021 · [ViT]
  29. EMDS-6: Environmental Microorganism Image Dataset Sixth Version for Image Denoising, Segmentation, Feature Extraction, Classification and Detection Methods Evaluation
      Penghui Zhao, Chen Li, M. Rahaman, Hao Xu, Pingli Ma, Hechen Yang, Hongzan Sun, Tao Jiang, N. Xu, M. Grzegorzek · 14 Dec 2021
  30. Searching the Search Space of Vision Transformer
      Minghao Chen, Kan Wu, Bolin Ni, Houwen Peng, Bei Liu, Jianlong Fu, Hongyang Chao, Haibin Ling · 29 Nov 2021 · [ViT]
  31. On the Integration of Self-Attention and Convolution
      Xuran Pan, Chunjiang Ge, Rui Lu, S. Song, Guanfu Chen, Zeyi Huang, Gao Huang · 29 Nov 2021 · [SSL]
  32. Learning A 3D-CNN and Transformer Prior for Hyperspectral Image Super-Resolution
      Qing Ma, Junjun Jiang, Xianming Liu, Jiayi Ma · 27 Nov 2021 · [ViT]
  33. SWAT: Spatial Structure Within and Among Tokens
      Kumara Kahatapitiya, Michael S. Ryoo · 26 Nov 2021
  34. BoxeR: Box-Attention for 2D and 3D Transformers
      Duy-Kien Nguyen, Jihong Ju, Olaf Booji, Martin R. Oswald, Cees G. M. Snoek · 25 Nov 2021 · [ViT]
  35. MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning
      David Junhao Zhang, Kunchang Li, Yali Wang, Yuxiang Chen, Shashwat Chandra, Yu Qiao, Luoqi Liu, Mike Zheng Shou · 24 Nov 2021 · [AI4TS]
  36. An Image Patch is a Wave: Phase-Aware Vision MLP
      Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Yanxi Li, Chao Xu, Yunhe Wang · 24 Nov 2021
  37. INTERN: A New Learning Paradigm Towards General Vision
      Jing Shao, Siyu Chen, Yangguang Li, Kun Wang, Zhen-fei Yin, ..., F. Yu, Junjie Yan, Dahua Lin, Xiaogang Wang, Yu Qiao · 16 Nov 2021
  38. CAR -- Cityscapes Attributes Recognition A Multi-category Attributes Dataset for Autonomous Vehicles
      Kareem M. Metwaly, Aerin Kim, E. Branson, V. Monga · 16 Nov 2021
  39. A Survey of Visual Transformers
      Yang Liu, Yao Zhang, Yixin Wang, Feng Hou, Jin Yuan, Jiang Tian, Yang Zhang, Zhongchao Shi, Jianping Fan, Zhiqiang He · 11 Nov 2021 · [3DGS, ViT]
  40. Sliced Recursive Transformer
      Zhiqiang Shen, Zechun Liu, Eric P. Xing · 09 Nov 2021 · [ViT]
  41. Gabor filter incorporated CNN for compression
      Akihiro Imamura, N. Arizumi · 29 Oct 2021 · [CVBM]
  42. Dispensed Transformer Network for Unsupervised Domain Adaptation
      Yunxiang Li, Jingxiong Li, Ruilong Dan, Shuai Wang, Kai Jin, ..., Qianni Zhang, Huiyu Zhou, Qun Jin, Li Wang, Yaqi Wang · 28 Oct 2021 · [OOD, MedIm]
  43. Enabling Large Batch Size Training for DNN Models Beyond the Memory Limit While Maintaining Performance
      Nathanaël Fijalkow, DoangJoo Synn, Jooyoung Park, Jong-Kook Kim · 24 Oct 2021
  44. SOFT: Softmax-free Transformer with Linear Complexity
      Jiachen Lu, Jinghan Yao, Junge Zhang, Martin Danelljan, Hang Xu, Weiguo Gao, Chunjing Xu, Thomas B. Schon, Li Zhang · 22 Oct 2021
  45. HRFormer: High-Resolution Transformer for Dense Prediction
      Yuhui Yuan, Rao Fu, Lang Huang, Weihong Lin, Chao Zhang, Xilin Chen, Jingdong Wang · 18 Oct 2021 · [ViT]
  46. Energon: Towards Efficient Acceleration of Transformers Using Dynamic Sparse Attention
      Zhe Zhou, Junling Liu, Zhenyu Gu, Guangyu Sun · 18 Oct 2021
  47. Revitalizing CNN Attentions via Transformers in Self-Supervised Visual Representation Learning
      Chongjian Ge, Youwei Liang, Yibing Song, Jianbo Jiao, Jue Wang, Ping Luo · 11 Oct 2021 · [ViT]
  48. UniNet: Unified Architecture Search with Convolution, Transformer, and MLP
      Jihao Liu, Hongsheng Li, Guanglu Song, Xin Huang, Yu Liu · 08 Oct 2021 · [ViT]
  49. Dynamically Decoding Source Domain Knowledge for Domain Generalization
      Cuicui Kang, Karthik Nandakumar · 06 Oct 2021 · [OOD, ViT]
  50. MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
      Sachin Mehta, Mohammad Rastegari · 05 Oct 2021 · [ViT]