Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions

24 February 2021
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
    ViT

Papers citing "Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions"

50 / 604 papers shown
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Chuan Wu
Yizhou Yu
ViT
44
36
0
30 Oct 2023
Zone Evaluation: Revealing Spatial Bias in Object Detection
Zhaohui Zheng
Yuming Chen
Qibin Hou
Xiang Li
Ping Wang
Ming-Ming Cheng
ObjD
27
3
0
20 Oct 2023
Minimalist and High-Performance Semantic Segmentation with Plain Vision Transformers
Yuanduo Hong
Jue Wang
Weichao Sun
Huihui Pan
VLM
ViT
37
7
0
19 Oct 2023
Cross-attention Spatio-temporal Context Transformer for Semantic Segmentation of Historical Maps
Sidi Wu
Yizi Chen
Konrad Schindler
L. Hurni
26
2
0
19 Oct 2023
Medical Image Segmentation via Sparse Coding Decoder
Long Zeng
Kaigui Wu
MedIm
29
3
0
17 Oct 2023
SeUNet-Trans: A Simple yet Effective UNet-Transformer Model for Medical Image Segmentation
Tan-Hanh Pham
Xianqi Li
Kim-Doang Nguyen
MedIm
ViT
26
8
0
16 Oct 2023
Transformer-based Multimodal Change Detection with Multitask Consistency Constraints
Biyuan Liu
Huaixin Chen
Kun Li
Michael Ying Yang
33
14
0
13 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
34
4
0
10 Oct 2023
Anchor-Intermediate Detector: Decoupling and Coupling Bounding Boxes for Accurate Object Detection
Yilong Lv
Min Li
Yujie He
Shaopeng Li
Zhuzhen He
Aitao Yang
26
1
0
09 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
45
3
0
08 Oct 2023
UniHead: Unifying Multi-Perception for Detection Heads
Hantao Zhou
Rui Yang
Yachao Zhang
Haoran Duan
Yawen Huang
R. Hu
Xiu Li
Yefeng Zheng
31
12
0
23 Sep 2023
Associative Transformer
Yuwei Sun
H. Ochiai
Zhirong Wu
Stephen Lin
Ryota Kanai
ViT
60
0
0
22 Sep 2023
Leveraging the Power of Data Augmentation for Transformer-based Tracking
Jie Zhao
Johan Edstedt
M. Felsberg
D. Wang
Huchuan Lu
ViT
24
4
0
15 Sep 2023
Interpretability-Aware Vision Transformer
Yao Qiang
Chengyin Li
Prashant Khanduri
D. Zhu
ViT
82
7
0
14 Sep 2023
CNN Injected Transformer for Image Exposure Correction
Shuning Xu
Xiangyu Chen
Binbin Song
Jiantao Zhou
ViT
19
6
0
08 Sep 2023
DeViL: Decoding Vision features into Language
Meghal Dani
Isabel Rio-Torto
Stephan Alaniz
Zeynep Akata
VLM
42
7
0
04 Sep 2023
Online Overexposed Pixels Hallucination in Videos with Adaptive Reference Frame Selection
Yazhou Xing
Amrita Mazumdar
Anjul Patney
Chao Liu
Hongxu Yin
Qifeng Chen
Jan Kautz
I. Frosio
49
1
0
29 Aug 2023
GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition
Ruijie Yao
Sheng Jin
Lumin Xu
Wang Zeng
Wentao Liu
Chao Qian
Ping Luo
Ji Wu
28
2
0
28 Aug 2023
Learning Heavily-Degraded Prior for Underwater Object Detection
C. Fu
Xin-Yue Fan
Jiewen Xiao
Wanqi Yuan
Risheng Liu
Zhongxuan Luo
24
22
0
24 Aug 2023
MGMAE: Motion Guided Masking for Video Masked Autoencoding
Bingkun Huang
Zhiyu Zhao
Guozhen Zhang
Yu Qiao
Limin Wang
39
30
0
21 Aug 2023
Improving FHB Screening in Wheat Breeding Using an Efficient Transformer Model
Babak Azad
A. Abdalla
Kwanghee Won
A. M. Nafchi
MedIm
30
2
0
07 Aug 2023
Dual Aggregation Transformer for Image Super-Resolution
Zheng Chen
Yulun Zhang
Jinjin Gu
L. Kong
Xiaokang Yang
Fisher Yu
ViT
22
167
0
07 Aug 2023
M2Former: Multi-Scale Patch Selection for Fine-Grained Visual Recognition
Ji-Hee Moon
Junseok K. Lee
Yu-Ling Lee
Seongsik Park
35
4
0
04 Aug 2023
PVG: Progressive Vision Graph for Vision Recognition
Jiafu Wu
Jian Li
Jiangning Zhang
Boshen Zhang
M. Chi
Yabiao Wang
Chengjie Wang
ViT
25
12
0
01 Aug 2023
MCPA: Multi-scale Cross Perceptron Attention Network for 2D Medical Image Segmentation
Liang Xu
Mingxi Chen
Yiyu Cheng
Pengfei Shao
Shuwei Shen
Peng Yao
Ronald X. Xu
ViT
32
0
0
27 Jul 2023
Visual Prompt Flexible-Modal Face Anti-Spoofing
Zitong Yu
Rizhao Cai
Yawen Cui
Ajian Liu
Changsheng Chen
38
6
0
26 Jul 2023
Digital Modeling on Large Kernel Metamaterial Neural Network
Quan Liu
Hanyu Zheng
Brandon T. Swartz
Ho Hin Lee
Zuhayr Asad
I. Kravchenko
Jason G Valentine
Yuankai Huo
20
4
0
21 Jul 2023
Towards Saner Deep Image Registration
Bin Duan
Ming Zhong
Yan Yan
MedIm
24
2
0
19 Jul 2023
Complementary Frequency-Varying Awareness Network for Open-Set Fine-Grained Image Recognition
Jiaying Sun
Hong Wang
Qiulei Dong
22
0
0
14 Jul 2023
UGCANet: A Unified Global Context-Aware Transformer-based Network with Feature Alignment for Endoscopic Image Analysis
Pham Vu Hung
N. Manh
Nguyen Thi Oanh
N. T. Thuy
D. V. Sang
ViT
MedIm
27
3
0
12 Jul 2023
Joint Perceptual Learning for Enhancement and Object Detection in Underwater Scenarios
C. Fu
Wanqi Yuan
Jiewen Xiao
Risheng Liu
Xin-Yue Fan
22
0
0
07 Jul 2023
HoughLaneNet: Lane Detection with Deep Hough Transform and Dynamic Convolution
Jia-Qi Zhang
Haoqi Duan
Jun-Long Chen
Ariel Shamir
Miao Wang
35
16
0
07 Jul 2023
All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment
Chunhui Zhang
Xin Sun
Li Liu
Yiqian Yang
Qiong Liu
Xiaoping Zhou
Yanfeng Wang
46
15
0
07 Jul 2023
MobileViG: Graph-Based Sparse Attention for Mobile Vision Applications
Mustafa Munir
William Avery
R. Marculescu
ViT
GNN
34
33
0
01 Jul 2023
1M parameters are enough? A lightweight CNN-based model for medical image segmentation
Binh-Duong Dinh
Thanh-Thu Nguyen
Thi-Thao Tran
Van-Truong Pham
SSeg
MedIm
25
16
0
28 Jun 2023
Efficient Deep Spiking Multi-Layer Perceptrons with Multiplication-Free Inference
Boyan Li
Luziwei Leng
Shuaijie Shen
Kaixuan Zhang
Jianguo Zhang
Jianxing Liao
Ran Cheng
31
7
0
21 Jun 2023
Efficient Multi-Task Scene Analysis with RGB-D Transformers
Söhnke Benedikt Fischedick
Daniel Seichter
Robin M. Schmidt
Leonard Rabes
H. Groß
25
9
0
08 Jun 2023
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding
Hanrong Ye
Dan Xu
ViT
29
10
0
08 Jun 2023
Auto-Spikformer: Spikformer Architecture Search
Kaiwei Che
Zhaokun Zhou
Zhengyu Ma
Wei Fang
Yanqing Chen
Shuaijie Shen
Liuliang Yuan
Yonghong Tian
29
4
0
01 Jun 2023
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
50
28
0
01 Jun 2023
VIPriors 3: Visual Inductive Priors for Data-Efficient Deep Learning Challenges
Robert-Jan Bruintjes
A. Lengyel
Marcos Baptista-Rios
O. Kayhan
Davide Zambrano
Nergis Tomen
J. C. V. Gemert
25
9
0
31 May 2023
WinDB: HMD-free and Distortion-free Panoptic Video Fixation Learning
Guotao Wang
Chenglizhao Chen
Aimin Hao
Hong Qin
Deng-Ping Fan
32
0
0
23 May 2023
ColMix -- A Simple Data Augmentation Framework to Improve Object Detector Performance and Robustness in Aerial Images
Cuong Ly
Grayson Jorgenson
D. R. D. Jesus
Henry Kvinge
A. Attarian
Y. Watkins
6
1
0
22 May 2023
HGFormer: Hierarchical Grouping Transformer for Domain Generalized Semantic Segmentation
Jian Ding
Nan Xue
Guisong Xia
Bernt Schiele
Dengxin Dai
ViT
17
30
0
22 May 2023
A bioinspired three-stage model for camouflaged object detection
Tianyou Chen
Jin Xiao
Xiaoguang Hu
Guofeng Zhang
Shaojie Wang
32
0
0
22 May 2023
How Deep Learning Sees the World: A Survey on Adversarial Attacks & Defenses
Joana Cabral Costa
Tiago Roxo
Hugo Manuel Proença
Pedro R. M. Inácio
AAML
37
50
0
18 May 2023
Boost Vision Transformer with GPU-Friendly Sparsity and Quantization
Chong Yu
Tao Chen
Zhongxue Gan
Jiayuan Fan
MQ
ViT
30
23
0
18 May 2023
EfficientSCI: Densely Connected Network with Space-time Factorization for Large-scale Video Snapshot Compressive Imaging
Lishun Wang
Miao Cao
Xin Yuan
18
16
0
17 May 2023
CageViT: Convolutional Activation Guided Efficient Vision Transformer
Hao Zheng
Jinbao Wang
Xiantong Zhen
H. Chen
Jingkuan Song
Feng Zheng
ViT
20
0
0
17 May 2023
Enhancing the Performance of Transformer-based Spiking Neural Networks by SNN-optimized Downsampling with Precise Gradient Backpropagation
Chenlin Zhou
Han Zhang
Zhaokun Zhou
Liutao Yu
Zhengyu Ma
Huihui Zhou
Xiaopeng Fan
Yonghong Tian
26
9
0
10 May 2023