ResearchTrend.AI
Visual Transformers: Token-based Image Representation and Processing for Computer Vision

5 June 2020
Bichen Wu
Chenfeng Xu
Xiaoliang Dai
Alvin Wan
Peizhao Zhang
Zhicheng Yan
M. Tomizuka
Joseph E. Gonzalez
Kurt Keutzer
Peter Vajda
    ViT
arXiv: 2006.03677

Papers citing "Visual Transformers: Token-based Image Representation and Processing for Computer Vision"

36 / 86 papers shown
Title
MEDUSA: Multi-scale Encoder-Decoder Self-Attention Deep Neural Network Architecture for Medical Image Analysis
Hossein Aboutalebi
Maya Pavlova
Hayden Gunraj
M. Shafiee
A. Sabri
Amer Alaref
Alexander Wong
22
17
0
12 Oct 2021
Pathologies in priors and inference for Bayesian transformers
Tristan Cinquin
Alexander Immer
Max Horn
Vincent Fortuin
UQCV
BDL
MedIm
34
9
0
08 Oct 2021
Token Pooling in Vision Transformers
D. Marin
Jen-Hao Rick Chang
Anurag Ranjan
Anish K. Prabhu
Mohammad Rastegari
Oncel Tuzel
ViT
76
66
0
08 Oct 2021
Subdimensional Expansion Using Attention-Based Learning For Multi-Agent Path Finding
Lakshay Virmani
Z. Ren
Sivakumar Rathinam
Howie Choset
26
3
0
29 Sep 2021
GroupFormer: Group Activity Recognition with Clustered Spatial-Temporal Transformer
Shuaicheng Li
Qianggang Cao
Lingbo Liu
Kunlin Yang
Shinan Liu
Jun Hou
Shuai Yi
ViT
34
103
0
28 Aug 2021
SwinIR: Image Restoration Using Swin Transformer
Jingyun Liang
Jie Cao
Guolei Sun
Kaixuan Zhang
Luc Van Gool
Radu Timofte
ViT
45
2,808
0
23 Aug 2021
Do Vision Transformers See Like Convolutional Neural Networks?
M. Raghu
Thomas Unterthiner
Simon Kornblith
Chiyuan Zhang
Alexey Dosovitskiy
ViT
52
924
0
19 Aug 2021
Causal Attention for Unbiased Visual Recognition
Tan Wang
Chan Zhou
Qianru Sun
Hanwang Zhang
OOD
CML
32
108
0
19 Aug 2021
TriTransNet: RGB-D Salient Object Detection with a Triplet Transformer Embedding Network
Zhengyi Liu
Yuan Wang
Zhengzheng Tu
Yun Xiao
Bin Tang
ViT
32
142
0
09 Aug 2021
Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer
Zhihe Lu
Sen He
Xiatian Zhu
Li Zhang
Yi-Zhe Song
Tao Xiang
ViT
171
173
0
06 Aug 2021
PPT Fusion: Pyramid Patch Transformer for a Case Study in Image Fusion
Yu Fu
Tianyang Xu
Xiaojun Wu
J. Kittler
ViT
24
37
0
29 Jul 2021
Visual Parser: Representing Part-whole Hierarchies with Transformers
Shuyang Sun
Xiaoyu Yue
S. Bai
Philip H. S. Torr
50
27
0
13 Jul 2021
Co-advise: Cross Inductive Bias Distillation
Sucheng Ren
Zhengqi Gao
Tianyu Hua
Zihui Xue
Yonglong Tian
Shengfeng He
Hang Zhao
49
53
0
23 Jun 2021
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?
Michael S. Ryoo
A. Piergiovanni
Anurag Arnab
Mostafa Dehghani
A. Angelova
ViT
32
127
0
21 Jun 2021
OadTR: Online Action Detection with Transformers
Xiang Wang
Shiwei Zhang
Zhiwu Qing
Yuanjie Shao
Zhe Zuo
Changxin Gao
Nong Sang
OffRL
ViT
34
109
0
21 Jun 2021
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
Mandela Patrick
Dylan Campbell
Yuki M. Asano
Ishan Misra
Florian Metze
Christoph Feichtenhofer
Andrea Vedaldi
João F. Henriques
8
274
0
09 Jun 2021
Visual Transformer for Task-aware Active Learning
Razvan Caramalau
Binod Bhattarai
Tae-Kyun Kim
ViT
8
11
0
07 Jun 2021
Person Re-Identification with a Locally Aware Transformer
Charu Sharma
S. R. Kapil
David Chapman
ViT
45
45
0
07 Jun 2021
Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model
Jiangning Zhang
Chao Xu
Jian Li
Wenzhou Chen
Yabiao Wang
Ying Tai
Shuo Chen
Chengjie Wang
Feiyue Huang
Yong Liu
29
22
0
31 May 2021
T-EMDE: Sketching-based global similarity for cross-modal retrieval
Barbara Rychalska
Mikolaj Wieczorek
Jacek Dąbrowski
25
0
0
10 May 2021
Conformer: Local Features Coupling Global Representations for Visual Recognition
Zhiliang Peng
Wei Huang
Shanzhi Gu
Lingxi Xie
Yaowei Wang
Jianbin Jiao
QiXiang Ye
ViT
15
527
0
09 May 2021
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
63
1,222
0
22 Apr 2021
A novel time-frequency Transformer based on self-attention mechanism and its application in fault diagnosis of rolling bearings
Yifei Ding
M. Jia
Qiuhua Miao
Yudong Cao
16
268
0
19 Apr 2021
AAformer: Auto-Aligned Transformer for Person Re-Identification
Kuan Zhu
Haiyun Guo
Shiliang Zhang
Yaowei Wang
Jing Liu
Jinqiao Wang
Ming Tang
ViT
35
111
0
02 Apr 2021
Bridging Global Context Interactions for High-Fidelity Image Completion
Chuanxia Zheng
Tat-Jen Cham
Jianfei Cai
Dinh Q. Phung
ViT
35
78
0
02 Apr 2021
Going deeper with Image Transformers
Hugo Touvron
Matthieu Cord
Alexandre Sablayrolles
Gabriel Synnaeve
Hervé Jégou
ViT
27
986
0
31 Mar 2021
Dual Contrastive Loss and Attention for GANs
Ning Yu
Guilin Liu
Aysegül Dündar
Andrew Tao
Bryan Catanzaro
Larry S. Davis
Mario Fritz
GAN
34
60
0
31 Mar 2021
TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization
Wei Gao
Fang Wan
Xingjia Pan
Zhiliang Peng
Qi Tian
Zhenjun Han
Bolei Zhou
QiXiang Ye
ViT
WSOL
30
198
0
27 Mar 2021
Multi-view analysis of unregistered medical images using cross-view transformers
Gijs van Tulder
Yao Tong
E. Marchiori
ViT
19
59
0
21 Mar 2021
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
Stéphane d'Ascoli
Hugo Touvron
Matthew L. Leavitt
Ari S. Morcos
Giulio Biroli
Levent Sagun
ViT
49
804
0
19 Mar 2021
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning
Mandela Patrick
Yuki M. Asano
Bernie Huang
Ishan Misra
Florian Metze
Joao Henriques
Andrea Vedaldi
AI4TS
24
33
0
18 Mar 2021
Perceiver: General Perception with Iterative Attention
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM
ViT
MDE
59
973
0
04 Mar 2021
LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello
269
179
0
17 Feb 2021
Point Cloud Transformers applied to Collider Physics
Vinicius Mikuni
F. Canelli
ViT
14
55
0
09 Feb 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
41
39,266
0
22 Oct 2020
Graph-Based Global Reasoning Networks
Yunpeng Chen
Marcus Rohrbach
Zhicheng Yan
Shuicheng Yan
Jiashi Feng
Yannis Kalantidis
GNN
NAI
268
457
0
30 Nov 2018