arXiv:2102.12122
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
24 February 2021
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
Papers citing "Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions"
50 / 604 papers shown
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Yizhou Yu
Chuan Wu
ViT
44
36
0
30 Oct 2023
Zone Evaluation: Revealing Spatial Bias in Object Detection
Zhaohui Zheng
Yuming Chen
Qibin Hou
Xiang Li
Ping Wang
Ming-Ming Cheng
ObjD
27
3
0
20 Oct 2023
Minimalist and High-Performance Semantic Segmentation with Plain Vision Transformers
Yuanduo Hong
Jue Wang
Weichao Sun
Huihui Pan
VLM
ViT
37
7
0
19 Oct 2023
Cross-attention Spatio-temporal Context Transformer for Semantic Segmentation of Historical Maps
Sidi Wu
Yizi Chen
Konrad Schindler
L. Hurni
26
2
0
19 Oct 2023
Medical Image Segmentation via Sparse Coding Decoder
Long Zeng
Kaigui Wu
MedIm
29
3
0
17 Oct 2023
SeUNet-Trans: A Simple yet Effective UNet-Transformer Model for Medical Image Segmentation
Tan-Hanh Pham
Xianqi Li
Kim-Doang Nguyen
MedIm
ViT
26
8
0
16 Oct 2023
Transformer-based Multimodal Change Detection with Multitask Consistency Constraints
Biyuan Liu
Huaixin Chen
Kun Li
Michael Ying Yang
33
14
0
13 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
34
4
0
10 Oct 2023
Anchor-Intermediate Detector: Decoupling and Coupling Bounding Boxes for Accurate Object Detection
Yilong Lv
Min Li
Yujie He
Shaopeng Li
Zhuzhen He
Aitao Yang
26
1
0
09 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
45
3
0
08 Oct 2023
UniHead: Unifying Multi-Perception for Detection Heads
Hantao Zhou
Rui Yang
Yachao Zhang
Haoran Duan
Yawen Huang
R. Hu
Xiu Li
Yefeng Zheng
31
12
0
23 Sep 2023
Associative Transformer
Yuwei Sun
H. Ochiai
Zhirong Wu
Stephen Lin
Ryota Kanai
ViT
60
0
0
22 Sep 2023
Leveraging the Power of Data Augmentation for Transformer-based Tracking
Jie Zhao
Johan Edstedt
M. Felsberg
D. Wang
Huchuan Lu
ViT
24
4
0
15 Sep 2023
Interpretability-Aware Vision Transformer
Yao Qiang
Chengyin Li
Prashant Khanduri
D. Zhu
ViT
82
7
0
14 Sep 2023
CNN Injected Transformer for Image Exposure Correction
Shuning Xu
Xiangyu Chen
Binbin Song
Jiantao Zhou
ViT
19
6
0
08 Sep 2023
DeViL: Decoding Vision features into Language
Meghal Dani
Isabel Rio-Torto
Stephan Alaniz
Zeynep Akata
VLM
42
7
0
04 Sep 2023
Online Overexposed Pixels Hallucination in Videos with Adaptive Reference Frame Selection
Yazhou Xing
Amrita Mazumdar
Anjul Patney
Chao Liu
Hongxu Yin
Qifeng Chen
Jan Kautz
I. Frosio
49
1
0
29 Aug 2023
GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition
Ruijie Yao
Sheng Jin
Lumin Xu
Wang Zeng
Wentao Liu
Chao Qian
Ping Luo
Ji Wu
28
2
0
28 Aug 2023
Learning Heavily-Degraded Prior for Underwater Object Detection
C. Fu
Xin-Yue Fan
Jiewen Xiao
Wanqi Yuan
Risheng Liu
Zhongxuan Luo
24
22
0
24 Aug 2023
MGMAE: Motion Guided Masking for Video Masked Autoencoding
Bingkun Huang
Zhiyu Zhao
Guozhen Zhang
Yu Qiao
Limin Wang
39
30
0
21 Aug 2023
Improving FHB Screening in Wheat Breeding Using an Efficient Transformer Model
Babak Azad
A. Abdalla
Kwanghee Won
A. M. Nafchi
MedIm
30
2
0
07 Aug 2023
Dual Aggregation Transformer for Image Super-Resolution
Zheng Chen
Yulun Zhang
Jinjin Gu
L. Kong
Xiaokang Yang
Fisher Yu
ViT
22
167
0
07 Aug 2023
M2Former: Multi-Scale Patch Selection for Fine-Grained Visual Recognition
Ji-Hee Moon
Junseok K. Lee
Yu-Ling Lee
Seongsik Park
35
4
0
04 Aug 2023
PVG: Progressive Vision Graph for Vision Recognition
Jiafu Wu
Jian Li
Jiangning Zhang
Boshen Zhang
M. Chi
Yabiao Wang
Chengjie Wang
ViT
25
12
0
01 Aug 2023
MCPA: Multi-scale Cross Perceptron Attention Network for 2D Medical Image Segmentation
Liang Xu
Mingxi Chen
Yiyu Cheng
Pengfei Shao
Shuwei Shen
Peng Yao
Ronald X. Xu
ViT
32
0
0
27 Jul 2023
Visual Prompt Flexible-Modal Face Anti-Spoofing
Zitong Yu
Rizhao Cai
Yawen Cui
Ajian Liu
Changsheng Chen
38
6
0
26 Jul 2023
Digital Modeling on Large Kernel Metamaterial Neural Network
Quan Liu
Hanyu Zheng
Brandon T. Swartz
Ho Hin Lee
Zuhayr Asad
I. Kravchenko
Jason G Valentine
Yuankai Huo
20
4
0
21 Jul 2023
Towards Saner Deep Image Registration
Bin Duan
Ming Zhong
Yan Yan
MedIm
24
2
0
19 Jul 2023
Complementary Frequency-Varying Awareness Network for Open-Set Fine-Grained Image Recognition
Jiaying Sun
Hong Wang
Qiulei Dong
22
0
0
14 Jul 2023
UGCANet: A Unified Global Context-Aware Transformer-based Network with Feature Alignment for Endoscopic Image Analysis
Pham Vu Hung
N. Manh
Nguyen Thi Oanh
N. T. Thuy
D. V. Sang
ViT
MedIm
27
3
0
12 Jul 2023
Joint Perceptual Learning for Enhancement and Object Detection in Underwater Scenarios
C. Fu
Wanqi Yuan
Jiewen Xiao
Risheng Liu
Xin-Yue Fan
22
0
0
07 Jul 2023
HoughLaneNet: Lane Detection with Deep Hough Transform and Dynamic Convolution
Jia-Qi Zhang
Haoqi Duan
Jun-Long Chen
Ariel Shamir
Miao Wang
35
16
0
07 Jul 2023
All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment
Chunhui Zhang
Xin Sun
Li Liu
Yiqian Yang
Qiong Liu
Xiaoping Zhou
Yanfeng Wang
46
15
0
07 Jul 2023
MobileViG: Graph-Based Sparse Attention for Mobile Vision Applications
Mustafa Munir
William Avery
R. Marculescu
ViT
GNN
34
33
0
01 Jul 2023
1M parameters are enough? A lightweight CNN-based model for medical image segmentation
Binh-Duong Dinh
Thanh-Thu Nguyen
Thi-Thao Tran
Van-Truong Pham
SSeg
MedIm
25
16
0
28 Jun 2023
Efficient Deep Spiking Multi-Layer Perceptrons with Multiplication-Free Inference
Boyan Li
Luziwei Leng
Shuaijie Shen
Kaixuan Zhang
Jianguo Zhang
Jianxing Liao
Ran Cheng
31
7
0
21 Jun 2023
Efficient Multi-Task Scene Analysis with RGB-D Transformers
Söhnke Benedikt Fischedick
Daniel Seichter
Robin M. Schmidt
Leonard Rabes
H. Groß
25
9
0
08 Jun 2023
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding
Hanrong Ye
Dan Xu
ViT
29
10
0
08 Jun 2023
Auto-Spikformer: Spikformer Architecture Search
Kaiwei Che
Zhaokun Zhou
Zhengyu Ma
Wei Fang
Yanqing Chen
Shuaijie Shen
Liuliang Yuan
Yonghong Tian
29
4
0
01 Jun 2023
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
50
28
0
01 Jun 2023
VIPriors 3: Visual Inductive Priors for Data-Efficient Deep Learning Challenges
Robert-Jan Bruintjes
A. Lengyel
Marcos Baptista-Rios
O. Kayhan
Davide Zambrano
Nergis Tomen
J. C. V. Gemert
25
9
0
31 May 2023
WinDB: HMD-free and Distortion-free Panoptic Video Fixation Learning
Guotao Wang
Chenglizhao Chen
Aimin Hao
Hong Qin
Deng-Ping Fan
32
0
0
23 May 2023
ColMix -- A Simple Data Augmentation Framework to Improve Object Detector Performance and Robustness in Aerial Images
Cuong Ly
Grayson Jorgenson
D. R. D. Jesus
Henry Kvinge
A. Attarian
Y. Watkins
6
1
0
22 May 2023
HGFormer: Hierarchical Grouping Transformer for Domain Generalized Semantic Segmentation
Jian Ding
Nan Xue
Guisong Xia
Bernt Schiele
Dengxin Dai
ViT
17
30
0
22 May 2023
A bioinspired three-stage model for camouflaged object detection
Tianyou Chen
Jin Xiao
Xiaoguang Hu
Guofeng Zhang
Shaojie Wang
32
0
0
22 May 2023
How Deep Learning Sees the World: A Survey on Adversarial Attacks & Defenses
Joana Cabral Costa
Tiago Roxo
Hugo Manuel Proença
Pedro R. M. Inácio
AAML
37
50
0
18 May 2023
Boost Vision Transformer with GPU-Friendly Sparsity and Quantization
Chong Yu
Tao Chen
Zhongxue Gan
Jiayuan Fan
MQ
ViT
30
23
0
18 May 2023
EfficientSCI: Densely Connected Network with Space-time Factorization for Large-scale Video Snapshot Compressive Imaging
Lishun Wang
Miao Cao
Xin Yuan
18
16
0
17 May 2023
CageViT: Convolutional Activation Guided Efficient Vision Transformer
Hao Zheng
Jinbao Wang
Xiantong Zhen
H. Chen
Jingkuan Song
Feng Zheng
ViT
20
0
0
17 May 2023
Enhancing the Performance of Transformer-based Spiking Neural Networks by SNN-optimized Downsampling with Precise Gradient Backpropagation
Chenlin Zhou
Han Zhang
Zhaokun Zhou
Liutao Yu
Zhengyu Ma
Huihui Zhou
Xiaopeng Fan
Yonghong Tian
26
9
0
10 May 2023