PVT v2: Improved Baselines with Pyramid Vision Transformer


25 June 2021
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
    ViT
    AI4TS

Papers citing "PVT v2: Improved Baselines with Pyramid Vision Transformer"

50 / 551 papers shown
Sequential Transformer for End-to-End Person Search
Long Chen
Jinhua Xu
ViT
21
4
0
06 Nov 2022
Strong-TransCenter: Improved Multi-Object Tracking based on Transformers with Dense Representations
Amit Galor
Roy Orfaig
B. Bobrovsky
VOT
36
6
0
24 Oct 2022
Gallery Filter Network for Person Search
Lucas Jaffe
A. Zakhor
18
12
0
24 Oct 2022
RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer
Jian Wang
Chen-xi Gou
Qiman Wu
Haocheng Feng
Junyu Han
Errui Ding
Jingdong Wang
ViT
30
95
0
13 Oct 2022
Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets
Zhiying Lu
Hongtao Xie
Chuanbin Liu
Yongdong Zhang
ViT
25
57
0
12 Oct 2022
Memory transformers for full context and high-resolution 3D Medical Segmentation
Loic Themyr
Clément Rambour
Nicolas Thome
Toby Collins
Alexandre Hostettler
ViT
MedIm
26
4
0
11 Oct 2022
Rethinking the Detection Head Configuration for Traffic Object Detection
Yi Shi
Jiang Wu
Shixuan Zhao
Gangyao Gao
T. Deng
Hongmei Yan
ObjD
24
5
0
08 Oct 2022
Revisiting Structured Dropout
Yiren Zhao
Oluwatomisin Dada
Xitong Gao
Robert D. Mullins
BDL
16
2
0
05 Oct 2022
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Chenglin Yang
Siyuan Qiao
Qihang Yu
Xiaoding Yuan
Yukun Zhu
Alan Yuille
Hartwig Adam
Liang-Chieh Chen
ViT
MoE
33
58
0
04 Oct 2022
Wide Attention Is The Way Forward For Transformers?
Jason Brown
Yiren Zhao
Ilia Shumailov
Robert D. Mullins
21
7
0
02 Oct 2022
Exploiting More Information in Sparse Point Cloud for 3D Single Object Tracking
Yubo Cui
Jiayao Shan
Zuoxu Gu
Zhiheng Li
Zheng Fang
21
23
0
02 Oct 2022
MobileViTv3: Mobile-Friendly Vision Transformer with Simple and Effective Fusion of Local, Global and Input Features
S. Wadekar
Abhishek Chaurasia
ViT
98
87
0
30 Sep 2022
Multi-scale Attention Network for Single Image Super-Resolution
Yan Wang
Yusen Li
Gang Wang
Xiaoguang Liu
SupR
39
37
0
28 Sep 2022
Exploring the Relationship between Architecture and Adversarially Robust Generalization
Aishan Liu
Shiyu Tang
Siyuan Liang
Ruihao Gong
Boxi Wu
Xianglong Liu
Dacheng Tao
AAML
28
18
0
28 Sep 2022
SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation
Meng-Hao Guo
Chenggang Lu
Qibin Hou
Zheng Liu
Ming-Ming Cheng
Shiyong Hu
SSeg
ViT
VLM
21
608
0
18 Sep 2022
DMFormer: Closing the Gap Between CNN and Vision Transformers
Zimian Wei
H. Pan
Lujun Li
Menglong Lu
Xin-Yi Niu
Peijie Dong
Dongsheng Li
ViT
72
5
0
16 Sep 2022
UPAR: Unified Pedestrian Attribute Recognition and Person Retrieval
Andreas Specker
Mickael Cormier
Jürgen Beyerer
CVBM
42
29
0
06 Sep 2022
LRT: An Efficient Low-Light Restoration Transformer for Dark Light Field Images
Shansi Zhang
Nan Meng
E. Lam
ViT
39
20
0
06 Sep 2022
Spatial-Temporal Transformer for Video Snapshot Compressive Imaging
Lishun Wang
Miao Cao
Yong Zhong
Xin Yuan
19
47
0
04 Sep 2022
Domain Shift-oriented Machine Anomalous Sound Detection Model Based on Self-Supervised Learning
Jinghao Yan
Xin Wang
Qin Wang
Qin Qin
Huan Li
Pengyi Ye
Yue-ping He
Jing Zeng
34
1
0
31 Aug 2022
ELMformer: Efficient Raw Image Restoration with a Locally Multiplicative Transformer
Jiaqi Ma
Shengyuan Yan
L. Zhang
Guoli Wang
Qian Zhang
33
8
0
31 Aug 2022
ClusTR: Exploring Efficient Self-attention via Clustering for Vision Transformers
Yutong Xie
Jianpeng Zhang
Yong-quan Xia
Anton Van Den Hengel
Qi Wu
30
6
0
28 Aug 2022
Efficient Attention-free Video Shift Transformers
Adrian Bulat
Brais Martínez
Georgios Tzimiropoulos
ViT
29
1
0
23 Aug 2022
FCN-Transformer Feature Fusion for Polyp Segmentation
Edward Sanderson
B. Matuszewski
ViT
MedIm
27
117
0
17 Aug 2022
Transformer Vs. MLP-Mixer: Exponential Expressive Gap For NLP Problems
D. Navon
A. Bronstein
MoE
38
0
0
17 Aug 2022
Task-Balanced Distillation for Object Detection
Ruining Tang
Zhen-yu Liu
Yangguang Li
Yiguo Song
Hui Liu
Qide Wang
Jing Shao
Guifang Duan
Jianrong Tan
26
20
0
05 Aug 2022
Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic Segmentation
Jiaming Zhang
Kailun Yang
Haowen Shi
Simon Reiß
Kunyu Peng
Chaoxiang Ma
Haodong Fu
Philip H. S. Torr
Kaiwei Wang
Rainer Stiefelhagen
ViT
MDE
31
35
0
25 Jul 2022
3D Interacting Hand Pose Estimation by Hand De-occlusion and Removal
Hao Meng
Sheng Jin
Wentao Liu
Chao Qian
Meng-Hsuan Lin
Wanli Ouyang
Ping Luo
3DH
14
41
0
22 Jul 2022
Locality Guidance for Improving Vision Transformers on Tiny Datasets
Kehan Li
Runyi Yu
Zhennan Wang
Li-ming Yuan
Guoli Song
Jie Chen
ViT
24
43
0
20 Jul 2022
Vision Transformers: From Semantic Segmentation to Dense Prediction
Li Zhang
Jiachen Lu
Sixiao Zheng
Xinxuan Zhao
Xiatian Zhu
Yanwei Fu
Tao Xiang
Jianfeng Feng
Philip H. S. Torr
ViT
24
7
0
19 Jul 2022
Defect Transformer: An Efficient Hybrid Transformer Architecture for Surface Defect Detection
Junpu Wang
Guili Xu
Fuju Yan
Jinjin Wang
Zhengsheng Wang
ViT
MedIm
26
66
0
17 Jul 2022
LightViT: Towards Light-Weight Convolution-Free Vision Transformers
Tao Huang
Lang Huang
Shan You
Fei Wang
Chao Qian
Chang Xu
ViT
17
55
0
12 Jul 2022
Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios
Jiashi Li
Xin Xia
W. Li
Huixia Li
Xing Wang
Xuefeng Xiao
Rui Wang
Min Zheng
Xin Pan
ViT
17
149
0
12 Jul 2022
Audio-Visual Segmentation
Jinxing Zhou
Jianyuan Wang
Jingyang Zhang
Weixuan Sun
Jing Zhang
Stan Birchfield
Dan Guo
Lingpeng Kong
Meng Wang
Yiran Zhong
VOS
33
110
0
11 Jul 2022
Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning
Ting Yao
Yingwei Pan
Yehao Li
Chong-Wah Ngo
Tao Mei
ViT
148
137
0
11 Jul 2022
Dual Vision Transformer
Ting Yao
Yehao Li
Yingwei Pan
Yu Wang
Xiaoping Zhang
Tao Mei
ViT
141
75
0
11 Jul 2022
kMaX-DeepLab: k-means Mask Transformer
Qihang Yu
Huiyu Wang
Siyuan Qiao
Maxwell D. Collins
Yukun Zhu
Hartwig Adam
Alan Yuille
Liang-Chieh Chen
ViT
35
18
0
08 Jul 2022
Semi-supervised Human Pose Estimation in Art-historical Images
Matthias Springstein
Stefanie Schneider
C. Althaus
Ralph Ewerth
3DH
20
14
0
06 Jul 2022
OSFormer: One-Stage Camouflaged Instance Segmentation with Transformers
Jialun Pei
Tianyang Cheng
Deng-Ping Fan
He Tang
Chuanbo Chen
Luc Van Gool
ViT
18
54
0
05 Jul 2022
Improving Semantic Segmentation in Transformers using Hierarchical Inter-Level Attention
Gary Leung
Jun Gao
Fangyin Wei
Sanja Fidler
21
3
0
05 Jul 2022
Softmax-free Linear Transformers
Jiachen Lu
Junge Zhang
Xiatian Zhu
Jianfeng Feng
Tao Xiang
Li Zhang
ViT
11
7
0
05 Jul 2022
TANet: Transformer-based Asymmetric Network for RGB-D Salient Object Detection
Chang Liu
Gang Yang
Shuo Wang
Hangxu Wang
Yunhua Zhang
Yutao Wang
ViT
34
17
0
04 Jul 2022
HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection
Tim Broedermann
Christos Sakaridis
Dengxin Dai
Luc Van Gool
52
31
0
30 Jun 2022
PVT-COV19D: Pyramid Vision Transformer for COVID-19 Diagnosis
Lilang Zheng
Jiaxuan Fang
Xiaorun Tang
Hanzhang Li
Jiaxin Fan
Tianyi Wang
Rui Zhou
Zhaoyan Yan
ViT
MedIm
26
2
0
30 Jun 2022
Vicinity Vision Transformer
Weixuan Sun
Zhen Qin
Huiyuan Deng
Jianyuan Wang
Yi Zhang
Kaihao Zhang
Nick Barnes
Stan Birchfield
Lingpeng Kong
Yiran Zhong
ViT
34
31
0
21 Jun 2022
Global Context Vision Transformers
Ali Hatamizadeh
Hongxu Yin
Greg Heinrich
Jan Kautz
Pavlo Molchanov
ViT
17
120
0
20 Jun 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
34
32
0
19 Jun 2022
RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks
Shanghua Gao
Zhong-Yu Li
Qi Han
Ming-Ming Cheng
Liang Wang
32
34
0
14 Jun 2022
Kaggle Kinship Recognition Challenge: Introduction of Convolution-Free Model to boost conventional
Mingchuan Tian
Guang-shou Teng
Yipeng Bao
ViT
18
0
0
11 Jun 2022
XBound-Former: Toward Cross-scale Boundary Modeling in Transformers
Jiacheng Wang
Fei Chen
Yuxi Ma
Liansheng Wang
Zhaodong Fei
Jia Shuai
Xiangdong Tang
Qichao Zhou
Jing Qin
ViT
MedIm
21
63
0
02 Jun 2022