ResearchTrend.AI
Visual Transformers: Token-based Image Representation and Processing for Computer Vision

5 June 2020
Bichen Wu
Chenfeng Xu
Xiaoliang Dai
Alvin Wan
Peizhao Zhang
Zhicheng Yan
M. Tomizuka
Joseph E. Gonzalez
Kurt Keutzer
Peter Vajda
    ViT

Papers citing "Visual Transformers: Token-based Image Representation and Processing for Computer Vision"

50 / 88 papers shown
GMAR: Gradient-Driven Multi-Head Attention Rollout for Vision Transformer Interpretability
Sehyeong Jo
Gangjae Jang
Haesol Park
32
0
0
28 Apr 2025
Topology-Aware Conformal Prediction for Stream Networks
Jifan Zhang
Fangxin Wang
Philip S. Yu
Kaize Ding
Shixiang Zhu
AI4TS
39
0
0
06 Mar 2025
Exploring Visual Embedding Spaces Induced by Vision Transformers for Online Auto Parts Marketplaces
Cameron Armijo
Pablo Rivas
41
0
0
09 Feb 2025
Dynamic Negative Guidance of Diffusion Models
Felix Koulischer
Johannes Deleu
G. Raya
T. Demeester
L. Ambrogioni
DiffM
49
2
0
03 Jan 2025
Cauchy activation function and XNet
Xin Li
Zhihong Xia
Hongkun Zhang
40
4
0
28 Sep 2024
OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
Stephen Zhang
V. Papyan
VLM
51
1
0
20 Sep 2024
Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes
Nikita Kiselev
Andrey Grabovoy
54
1
0
18 Sep 2024
EDADepth: Enhanced Data Augmentation for Monocular Depth Estimation
Nischal Khanal
Shivanand Venkanna Sheshappanavar
MDE
42
0
0
10 Sep 2024
Multi-Modal Multi-Granularity Tokenizer for Chu Bamboo Slip Scripts
Yingfa Chen
Chenlong Hu
Cong Feng
Chenyang Song
Shi Yu
Xu Han
Zhiyuan Liu
Maosong Sun
28
0
0
02 Sep 2024
SwinSF: Image Reconstruction from Spatial-Temporal Spike Streams
Liangyan Jiang
Chuang Zhu
Yanxu Chen
52
2
0
22 Jul 2024
Restyling Unsupervised Concept Based Interpretable Networks with Generative Models
Jayneel Parekh
Quentin Bouniot
Pavlo Mozharovskyi
A. Newson
Florence d'Alché-Buc
SSL
63
1
0
01 Jul 2024
Restoring balance: principled under/oversampling of data for optimal classification
Emanuele Loffredo
Mauro Pastore
Simona Cocco
R. Monasson
43
9
0
15 May 2024
PhysMLE: Generalizable and Priors-Inclusive Multi-task Remote Physiological Measurement
Jiyao Wang
Hao Lu
Ange Wang
Xiao Yang
Ying Chen
Dengbo He
Kaishun Wu
26
3
0
10 May 2024
Optical Text Recognition in Nepali and Bengali: A Transformer-based Approach
Rakib Hasan
Aakar Dhakal
Kabir Mehedi
Annajiat Alim Rasel
21
1
0
03 Apr 2024
Sparse and Transferable Universal Singular Vectors Attack
Kseniia Kuvshinova
Olga Tsymboi
Ivan V. Oseledets
AAML
38
0
0
25 Jan 2024
Enhancing Context Through Contrast
Kshitij Ambilduke
Aneesh Shetye
Diksha Bagade
Rishika Bhagwatkar
Khurshed Fitter
P. Vagdargi
Shital S. Chiddarwar
26
0
0
06 Jan 2024
Improving Robustness for Vision Transformer with a Simple Dynamic Scanning Augmentation
Shashank Kotyan
Danilo Vasconcellos Vargas
ViT
27
2
0
01 Nov 2023
Energy-Based Models for Cross-Modal Localization using Convolutional Transformers
Alan Wu
Michael S. Ryoo
33
3
0
06 Jun 2023
Images in Language Space: Exploring the Suitability of Large Language Models for Vision & Language Tasks
Sherzod Hakimov
David Schlangen
VLM
36
5
0
23 May 2023
SwinFSR: Stereo Image Super-Resolution using SwinIR and Frequency Domain Knowledge
Ke-Jia Chen
Liangyan Li
Huan Liu
Yunzhe Li
Congling Tang
Jun Chen
31
14
0
25 Apr 2023
Classification in Histopathology: A unique deep embeddings extractor for multiple classification tasks
A. Nivaggioli
Nicolas Pozin
Rémy Peyret
Stéphane Sockeel
Marie Sockeel
Nicolas Nerrienet
Marceau Clavel
Clara Simmat
C. Miquel
MedIm
11
0
0
09 Mar 2023
STB-VMM: Swin Transformer Based Video Motion Magnification
Ricard Lado-Roigé
M. A. Pérez
18
13
0
20 Feb 2023
mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization
Kayhan Behdin
Qingquan Song
Aman Gupta
S. Keerthi
Ayan Acharya
Borja Ocejo
Gregory Dexter
Rajiv Khanna
D. Durfee
Rahul Mazumder
AAML
18
7
0
19 Feb 2023
Transformadores: Fundamentos teoricos y Aplicaciones
J. D. L. Torre
75
0
0
18 Feb 2023
Sample-efficient Surrogate Model for Frequency Response of Linear PDEs using Self-Attentive Complex Polynomials
A. Cohen
W. Dou
Jiang Zhu
S. Koziel
Péter Renner
J. Mattsson
Xiaomeng Yang
Beidi Chen
Kevin R. Stone
Yuandong Tian
26
0
0
06 Jan 2023
MoBYv2AL: Self-supervised Active Learning for Image Classification
Razvan Caramalau
Binod Bhattarai
Danail Stoyanov
Tae-Kyun Kim
SSL
27
7
0
04 Jan 2023
Explanation on Pretraining Bias of Finetuned Vision Transformer
Bumjin Park
Jaesik Choi
ViT
31
1
0
18 Nov 2022
ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design
Haoran You
Zhanyi Sun
Huihong Shi
Zhongzhi Yu
Yang Katie Zhao
Yongan Zhang
Chaojian Li
Baopu Li
Yingyan Lin
ViT
22
76
0
18 Oct 2022
Traffic Accident Risk Forecasting using Contextual Vision Transformers
Khaled Saleh
Artur Grigorev
Adriana-Simona Mihaita
ViT
32
10
0
20 Sep 2022
Transformer-CNN Cohort: Semi-supervised Semantic Segmentation by the Best of Both Students
Xueye Zheng
Yuan Luo
Hao Wang
Chong Fu
Lin Wang
ViT
41
18
0
06 Sep 2022
Open-Vocabulary 3D Detection via Image-level Class and Debiased Cross-modal Contrastive Learning
Yuheng Lu
Chenfeng Xu
Xi Wei
Xiaodong Xie
M. Tomizuka
Kurt Keutzer
Shanghang Zhang
3DPC
25
20
0
05 Jul 2022
Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models
Yang Shu
Zhangjie Cao
Ziyang Zhang
Jianmin Wang
Mingsheng Long
17
4
0
08 Jun 2022
LIA: Privacy-Preserving Data Quality Evaluation in Federated Learning Using a Lazy Influence Approximation
Ljubomir Rokvic
Panayiotis Danassis
Sai Praneeth Karimireddy
Boi Faltings
TDI
27
1
0
23 May 2022
Activating More Pixels in Image Super-Resolution Transformer
Xiangyu Chen
Xintao Wang
Jiantao Zhou
Yu Qiao
Chao Dong
ViT
64
601
0
09 May 2022
Seeding Diversity into AI Art
Marvin Zammit
Antonios Liapis
Georgios N. Yannakakis
24
4
0
02 May 2022
MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction
Yuanhao Cai
Jing Lin
Zudi Lin
Haoqian Wang
Yulun Zhang
Hanspeter Pfister
Radu Timofte
Luc Van Gool
19
171
0
17 Apr 2022
Evolving Modular Soft Robots without Explicit Inter-Module Communication using Local Self-Attention
F. Pigozzi
Yujin Tang
Eric Medvet
David R Ha
39
22
0
13 Apr 2022
Transformer-Based Self-Supervised Learning for Emotion Recognition
Juan Vazquez-Rodriguez
G. Lefebvre
Julien Cumin
James L. Crowley
12
25
0
08 Apr 2022
Deep Transformers Thirst for Comprehensive-Frequency Data
R. Xia
Chao Xue
Boyu Deng
Fang Wang
Jingchao Wang
ViT
25
0
0
14 Mar 2022
Efficient Long-Range Attention Network for Image Super-resolution
Xindong Zhang
Huiyu Zeng
Shi Guo
Lei Zhang
ViT
19
276
0
13 Mar 2022
EventFormer: AU Event Transformer for Facial Action Unit Event Detection
Yingjie Chen
Jiarui Zhang
Tao Wang
Yun Liang
ViT
29
0
0
12 Mar 2022
Region-Aware Face Swapping
Chao Xu
Jiangning Zhang
Miao Hua
Qian He
Zili Yi
Yong Liu
CVBM
22
49
0
09 Mar 2022
RFormer: Transformer-based Generative Adversarial Network for Real Fundus Image Restoration on A New Clinical Benchmark
Zhuo Deng
Yuanhao Cai
Lu Chen
Zheng Gong
Qiqi Bao
Xue Yao
D. Fang
Shaochong Zhang
Lan Ma
ViT
MedIm
33
53
0
03 Jan 2022
Augmenting Convolutional networks with attention-based aggregation
Hugo Touvron
Matthieu Cord
Alaaeldin El-Nouby
Piotr Bojanowski
Armand Joulin
Gabriel Synnaeve
Hervé Jégou
ViT
38
47
0
27 Dec 2021
SVIP: Sequence VerIfication for Procedures in Videos
Yichen Qian
Weixin Luo
Dongze Lian
Xu Tang
P. Zhao
Shenghua Gao
ViT
29
17
0
13 Dec 2021
3D Medical Point Transformer: Introducing Convolution to Attention Networks for Medical Point Cloud Analysis
Jianhui Yu
Chaoyi Zhang
Heng Wang
Dingxin Zhang
Yang Song
Tiange Xiang
Dongnan Liu
Weidong (Tom) Cai
ViT
MedIm
21
32
0
09 Dec 2021
Vision Pair Learning: An Efficient Training Framework for Image Classification
Bei Tong
Xiaoyuan Yu
ViT
17
0
0
02 Dec 2021
CT-block: a novel local and global features extractor for point cloud
Shangwei Guo
Jun Li
Zhengchao Lai
Xiantong Meng
Shaokun Han
ViT
3DPC
21
2
0
30 Nov 2021
An Image Patch is a Wave: Phase-Aware Vision MLP
Yehui Tang
Kai Han
Jianyuan Guo
Chang Xu
Yanxi Li
Chao Xu
Yunhe Wang
24
133
0
24 Nov 2021
Are Vision Transformers Robust to Patch Perturbations?
Jindong Gu
Volker Tresp
Yao Qin
AAML
ViT
35
60
0
20 Nov 2021