ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2102.12122
  4. Cited By
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction
  without Convolutions

Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions

24 February 2021
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
    ViT
ArXivPDFHTML

Papers citing "Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions"

50 / 604 papers shown
Title
A Theoretical Understanding of Shallow Vision Transformers: Learning,
  Generalization, and Sample Complexity
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
Hongkang Li
Hao Wu
Sijia Liu
Pin-Yu Chen
ViT
MLT
37
57
0
12 Feb 2023
Efficient Attention via Control Variates
Efficient Attention via Control Variates
Lin Zheng
Jianbo Yuan
Chong-Jun Wang
Lingpeng Kong
34
18
0
09 Feb 2023
AIM: Adapting Image Models for Efficient Video Action Recognition
AIM: Adapting Image Models for Efficient Video Action Recognition
Taojiannan Yang
Yi Zhu
Yusheng Xie
Aston Zhang
Cheng Chen
Mu Li
ViT
58
144
0
06 Feb 2023
AMD-HookNet for Glacier Front Segmentation
AMD-HookNet for Glacier Front Segmentation
Fei Wu
Nora Gourmelon
T. Seehaus
Jianlin Zhang
M. Braun
Andreas Maier
Vincent Christlein
24
9
0
06 Feb 2023
Out of Distribution Performance of State of Art Vision Model
Out of Distribution Performance of State of Art Vision Model
Salman Rahman
W. Lee
37
2
0
25 Jan 2023
Dynamic Grained Encoder for Vision Transformers
Dynamic Grained Encoder for Vision Transformers
Lin Song
Songyang Zhang
Songtao Liu
Zeming Li
Xuming He
Hongbin Sun
Jian Sun
Nanning Zheng
ViT
26
34
0
10 Jan 2023
DeMT: Deformable Mixer Transformer for Multi-Task Learning of Dense
  Prediction
DeMT: Deformable Mixer Transformer for Multi-Task Learning of Dense Prediction
Yang Yang
Yibo Yang
L. Zhang
ViT
33
51
0
09 Jan 2023
RGB-T Multi-Modal Crowd Counting Based on Transformer
RGB-T Multi-Modal Crowd Counting Based on Transformer
Zhengyi Liu
Wei Wu
liuzywen
Guanghui Zhang
ViT
18
11
0
08 Jan 2023
Skip-Attention: Improving Vision Transformers by Paying Less Attention
Skip-Attention: Improving Vision Transformers by Paying Less Attention
Shashanka Venkataramanan
Amir Ghodrati
Yuki M. Asano
Fatih Porikli
A. Habibian
ViT
18
25
0
05 Jan 2023
TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models
TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models
Sucheng Ren
Fangyun Wei
Zheng-Wei Zhang
Han Hu
40
34
0
03 Jan 2023
A New Perspective to Boost Vision Transformer for Medical Image
  Classification
A New Perspective to Boost Vision Transformer for Medical Image Classification
Yuexiang Li
Yawen Huang
Nanjun He
Kai Ma
Yefeng Zheng
ViT
MedIm
21
3
0
03 Jan 2023
Edge Enhanced Image Style Transfer via Transformers
Edge Enhanced Image Style Transfer via Transformers
Chi Zhang
Jun Yang
Zaiyan Dai
Peng-Xia Cao
16
10
0
02 Jan 2023
Multi-Stage Spatio-Temporal Aggregation Transformer for Video Person
  Re-identification
Multi-Stage Spatio-Temporal Aggregation Transformer for Video Person Re-identification
Ziyi Tang
Ruimao Zhang
Zhanglin Peng
Jinrui Chen
Liang Lin
33
18
0
02 Jan 2023
Pseudo-Inverted Bottleneck Convolution for DARTS Search Space
Pseudo-Inverted Bottleneck Convolution for DARTS Search Space
Arash Ahmadian
Louis S.P. Liu
Yue Fei
Konstantinos N. Plataniotis
Mahdi S. Hosseini
21
0
0
31 Dec 2022
Local Learning on Transformers via Feature Reconstruction
Local Learning on Transformers via Feature Reconstruction
P. Pathak
Jingwei Zhang
Dimitris Samaras
ViT
24
5
0
29 Dec 2022
Exploring Vision Transformers as Diffusion Learners
Exploring Vision Transformers as Diffusion Learners
He Cao
Jianan Wang
Tianhe Ren
Xianbiao Qi
Yihao Chen
Yuan Yao
L. Zhang
44
10
0
28 Dec 2022
Representation Separation for Semantic Segmentation with Vision
  Transformers
Representation Separation for Semantic Segmentation with Vision Transformers
Yuanduo Hong
Huihui Pan
Weichao Sun
Xinghu Yu
Huijun Gao
ViT
28
5
0
28 Dec 2022
MRTNet: Multi-Resolution Temporal Network for Video Sentence Grounding
MRTNet: Multi-Resolution Temporal Network for Video Sentence Grounding
Wei Ji
Long Chen
Yin-wei Wei
Yiming Wu
Tat-Seng Chua
AI4TS
29
18
0
26 Dec 2022
SMMix: Self-Motivated Image Mixing for Vision Transformers
SMMix: Self-Motivated Image Mixing for Vision Transformers
Mengzhao Chen
Mingbao Lin
Zhihang Lin
Yu-xin Zhang
Rongrong Ji
Rongrong Ji
53
10
0
26 Dec 2022
A Close Look at Spatial Modeling: From Attention to Convolution
A Close Look at Spatial Modeling: From Attention to Convolution
Xu Ma
Huan Wang
Can Qin
Kunpeng Li
Xing Zhao
Jie Fu
Yun Fu
ViT
3DPC
25
11
0
23 Dec 2022
DQnet: Cross-Model Detail Querying for Camouflaged Object Detection
DQnet: Cross-Model Detail Querying for Camouflaged Object Detection
Wei Sun
Chengao Liu
Linyan Zhang
Yu Li
Pengxu Wei
Chang-rui Liu
J. Zou
Jianbin Jiao
QiXiang Ye
48
6
0
16 Dec 2022
Rethinking Vision Transformers for MobileNet Size and Speed
Rethinking Vision Transformers for MobileNet Size and Speed
Yanyu Li
Ju Hu
Yang Wen
Georgios Evangelidis
Kamyar Salahi
Yanzhi Wang
Sergey Tulyakov
Jian Ren
ViT
35
159
0
15 Dec 2022
Full Contextual Attention for Multi-resolution Transformers in Semantic
  Segmentation
Full Contextual Attention for Multi-resolution Transformers in Semantic Segmentation
Loic Themyr
Clément Rambour
Nicolas Thome
Toby Collins
Alexandre Hostettler
ViT
27
10
0
15 Dec 2022
THMA: Tencent HD Map AI System for Creating HD Map Annotations
THMA: Tencent HD Map AI System for Creating HD Map Annotations
Kun Tang
Xu Cao
Zhipeng Cao
Tongxi Zhou
Erlong Li
...
Shengtao Zou
Chang-ling Liu
Shuqi Mei
Elena Sizikova
Chao Zheng
17
12
0
14 Dec 2022
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group
  Propagation
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
Chenhongyi Yang
Jiarui Xu
Shalini De Mello
Elliot J. Crowley
Xinyu Wang
ViT
38
21
0
13 Dec 2022
FastMIM: Expediting Masked Image Modeling Pre-training for Vision
FastMIM: Expediting Masked Image Modeling Pre-training for Vision
Jianyuan Guo
Kai Han
Han Wu
Yehui Tang
Yunhe Wang
Chang Xu
33
9
0
13 Dec 2022
Video Prediction by Efficient Transformers
Video Prediction by Efficient Transformers
Xi Ye
Guillaume-Alexandre Bilodeau
ViT
39
33
0
12 Dec 2022
ROIFormer: Semantic-Aware Region of Interest Transformer for Efficient
  Self-Supervised Monocular Depth Estimation
ROIFormer: Semantic-Aware Region of Interest Transformer for Efficient Self-Supervised Monocular Depth Estimation
Daitao Xing
Jinglin Shen
C. Ho
Anthony Tzes
ViT
MDE
31
4
0
12 Dec 2022
Position Embedding Needs an Independent Layer Normalization
Position Embedding Needs an Independent Layer Normalization
Runyi Yu
Zhennan Wang
Yinhuai Wang
Kehan Li
Yian Zhao
Jian Zhang
Guoli Song
Jie Chen
31
1
0
10 Dec 2022
CamoFormer: Masked Separable Attention for Camouflaged Object Detection
CamoFormer: Masked Separable Attention for Camouflaged Object Detection
Bo Yin
Xuying Zhang
Qibin Hou
Bo Sun
Deng-Ping Fan
Luc Van Gool
28
51
0
10 Dec 2022
Joint Spatio-Temporal Modeling for the Semantic Change Detection in
  Remote Sensing Images
Joint Spatio-Temporal Modeling for the Semantic Change Detection in Remote Sensing Images
L. Ding
Jing Zhang
Kai Zhang
Haitao Guo
Bing Liu
Lorenzo Bruzzone
23
47
0
10 Dec 2022
ViTPose++: Vision Transformer for Generic Body Pose Estimation
ViTPose++: Vision Transformer for Generic Body Pose Estimation
Yufei Xu
Jing Zhang
Qiming Zhang
Dacheng Tao
ViT
42
40
0
07 Dec 2022
Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video
  Learning
Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning
A. Piergiovanni
Weicheng Kuo
A. Angelova
ViT
36
54
0
06 Dec 2022
Vision Transformer Computation and Resilience for Dynamic Inference
Vision Transformer Computation and Resilience for Dynamic Inference
Kavya Sreedhar
Jason Clemons
Rangharajan Venkatesan
S. Keckler
M. Horowitz
26
2
0
06 Dec 2022
Joint Self-Supervised Image-Volume Representation Learning with
  Intra-Inter Contrastive Clustering
Joint Self-Supervised Image-Volume Representation Learning with Intra-Inter Contrastive Clustering
D. M. Nguyen
Hoangvu Nguyen
M. T. N. Truong
T. Cao
Binh Duc Nguyen
Nhat Ho
Paul Swoboda
Shadi Albarqouni
P. Xie
Daniel Sonntag
SSL
24
21
0
04 Dec 2022
Part-based Face Recognition with Vision Transformers
Part-based Face Recognition with Vision Transformers
Zhonglin Sun
Georgios Tzimiropoulos
ViT
25
15
0
30 Nov 2022
Finding Differences Between Transformers and ConvNets Using
  Counterfactual Simulation Testing
Finding Differences Between Transformers and ConvNets Using Counterfactual Simulation Testing
Nataniel Ruiz
Sarah Adel Bargal
Cihang Xie
Kate Saenko
Stan Sclaroff
ViT
36
5
0
29 Nov 2022
Lightweight Structure-Aware Attention for Visual Understanding
Lightweight Structure-Aware Attention for Visual Understanding
Heeseung Kwon
F. M. Castro
M. Marín-Jiménez
N. Guil
Alahari Karteek
28
2
0
29 Nov 2022
NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization
  for Vision Transformers
NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers
Yijiang Liu
Huanrui Yang
Zhen Dong
Kurt Keutzer
Li Du
Shanghang Zhang
MQ
31
46
0
29 Nov 2022
QuadFormer: Quadruple Transformer for Unsupervised Domain Adaptation in
  Power Line Segmentation of Aerial Images
QuadFormer: Quadruple Transformer for Unsupervised Domain Adaptation in Power Line Segmentation of Aerial Images
P. Rao
Feng Qiao
Weide Zhang
Yiliang Xu
Yong Deng
Guangbin Wu
Qiang Zhang
29
8
0
29 Nov 2022
Perceive, Ground, Reason, and Act: A Benchmark for General-purpose
  Visual Representation
Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation
Jiangyong Huang
William Zhu
Baoxiong Jia
Zan Wang
Xiaojian Ma
Qing Li
Siyuan Huang
37
5
0
28 Nov 2022
Medical Image Segmentation Review: The success of U-Net
Medical Image Segmentation Review: The success of U-Net
Reza Azad
Ehsan Khodapanah Aghdam
Amelie Rauland
Yiwei Jia
Atlas Haddadi Avval
Afshin Bozorgpour
Sanaz Karimijafarbigloo
Joseph Paul Cohen
Ehsan Adeli
Dorit Merhof
SSeg
25
265
0
27 Nov 2022
Semantic-Aware Local-Global Vision Transformer
Semantic-Aware Local-Global Vision Transformer
Jiatong Zhang
Zengwei Yao
Fanglin Chen
Guangming Lu
Wenjie Pei
ViT
25
0
0
27 Nov 2022
CMC v2: Towards More Accurate COVID-19 Detection with Discriminative
  Video Priors
CMC v2: Towards More Accurate COVID-19 Detection with Discriminative Video Priors
Junlin Hou
Jilan Xu
Nan Zhang
Yi Wang
Yuejie Zhang
Xuanyang Zhang
Rui Feng
29
2
0
26 Nov 2022
CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for
  Multi-Modality Image Fusion
CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion
Zixiang Zhao
Hao Bai
Jiangshe Zhang
Yulun Zhang
Shuang Xu
Zudi Lin
Radu Timofte
Luc Van Gool
37
309
0
26 Nov 2022
Degenerate Swin to Win: Plain Window-based Transformer without
  Sophisticated Operations
Degenerate Swin to Win: Plain Window-based Transformer without Sophisticated Operations
Tan Yu
Ping Li
ViT
46
5
0
25 Nov 2022
Spatial Mixture-of-Experts
Spatial Mixture-of-Experts
Nikoli Dryden
Torsten Hoefler
MoE
34
9
0
24 Nov 2022
EurNet: Efficient Multi-Range Relational Modeling of Spatial
  Multi-Relational Data
EurNet: Efficient Multi-Range Relational Modeling of Spatial Multi-Relational Data
Minghao Xu
Yuanfan Guo
Yi Xu
Jiangtao Tang
Xinlei Chen
Yuandong Tian
GNN
13
6
0
23 Nov 2022
Efficient Frequency Domain-based Transformers for High-Quality Image
  Deblurring
Efficient Frequency Domain-based Transformers for High-Quality Image Deblurring
Lingshun Kong
Jiangxin Dong
Mingqiang Li
J. Ge
Jin-shan Pan
ViT
32
142
0
22 Nov 2022
Uncertainty-aware Vision-based Metric Cross-view Geolocalization
Uncertainty-aware Vision-based Metric Cross-view Geolocalization
F. Fervers
Sebastian Bullinger
C. Bodensteiner
Michael Arens
Rainer Stiefelhagen
26
39
0
22 Nov 2022
Previous
123456...111213
Next