ResearchTrend.AI
PVT v2: Improved Baselines with Pyramid Vision Transformer

25 June 2021
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
    ViT
    AI4TS

Papers citing "PVT v2: Improved Baselines with Pyramid Vision Transformer"

50 / 551 papers shown
Agent Attention: On the Integration of Softmax and Linear Attention
Dongchen Han
Tianzhu Ye
Yizeng Han
Zhuofan Xia
Siyuan Pan
Pengfei Wan
Shiji Song
Gao Huang
32
74
0
14 Dec 2023
Factorization Vision Transformer: Modeling Long Range Dependency with Local Window Cost
Haolin Qin
Daquan Zhou
Tingfa Xu
Ziyang Bian
Jianan Li
29
9
0
14 Dec 2023
Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation
Qi Yang
Xing Nie
Tong Li
Pengfei Gao
Ying Guo
Cheng Zhen
Pengfei Yan
Shiming Xiang
VOS
34
13
0
11 Dec 2023
Spectrum-driven Mixed-frequency Network for Hyperspectral Salient Object Detection
Peifu Liu
Tingfa Xu
Huan Chen
Shiyun Zhou
Haolin Qin
Jianan Li
19
8
0
02 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
44
0
0
01 Dec 2023
SparseDC: Depth Completion from sparse and non-uniform inputs
Chen Long
Wenxiao Zhang
Zhe Chen
Haiping Wang
Yuan Liu
Zhen Cao
Zhen Dong
Bisheng Yang
MDE
30
8
0
30 Nov 2023
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
Dai Shi
ViT
23
76
0
28 Nov 2023
Cross-level Attention with Overlapped Windows for Camouflaged Object Detection
Jiepan Li
Fangxiao Lu
Nan Xue
Zhuo Li
Hongyan Zhang
Wei He
30
2
0
28 Nov 2023
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
Xiaohan Ding
Yiyuan Zhang
Yixiao Ge
Sijie Zhao
Lin Song
Xiangyu Yue
Ying Shan
VLM
AI4TS
SSL
29
101
0
27 Nov 2023
Advancing Vision Transformers with Group-Mix Attention
Chongjian Ge
Xiaohan Ding
Zhan Tong
Li Yuan
Jiangliu Wang
Yibing Song
Ping Luo
112
16
0
26 Nov 2023
All in One: RGB, RGB-D, and RGB-T Salient Object Detection
Xingzhao Jia
Zhongqiu Zhao
Changlei Dongye
Zhao Zhang
32
2
0
23 Nov 2023
CMFDFormer: Transformer-based Copy-Move Forgery Detection with Continual Learning
Yaqi Liu
Chao Xia
Song Xiao
Qingxiao Guan
Wenqian Dong
Yifan Zhang
Neng H. Yu
35
3
0
22 Nov 2023
HEViTPose: High-Efficiency Vision Transformer for Human Pose Estimation
Chengpeng Wu
Guangxing Tan
Chunyu Li
ViT
21
0
0
22 Nov 2023
Deep Tensor Network
Yifan Zhang
29
0
0
18 Nov 2023
Explicit Change Relation Learning for Change Detection in VHR Remote Sensing Images
Dalong Zheng
Zebin Wu
Jia-Wei Liu
Chih-Cheng Hung
Zhihui Wei
14
0
0
14 Nov 2023
Window Attention is Bugged: How not to Interpolate Position Embeddings
Daniel Bolya
Chaitanya K. Ryali
Judy Hoffman
Christoph Feichtenhofer
43
10
0
09 Nov 2023
SBCFormer: Lightweight Network Capable of Full-size ImageNet Classification at 1 FPS on Single Board Computers
Xiangyong Lu
Masanori Suganuma
Takayuki Okatani
38
10
0
07 Nov 2023
Scattering Vision Transformer: Spectral Mixing Matters
Badri N. Patro
Vijay Srinivas Agneeswaran
34
14
0
02 Nov 2023
Distilling Knowledge from CNN-Transformer Models for Enhanced Human Action Recognition
Hamid Ahmadabadi
Omid Nejati Manzari
Ahmad Ayatollahi
16
7
0
02 Nov 2023
Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders
Srijan Das
Tanmay Jain
Dominick Reilly
P. Balaji
Soumyajit Karmakar
Shyam Marjit
Xiang Li
Abhijit Das
Michael S. Ryoo
39
16
0
31 Oct 2023
ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection
Youwei Pang
Xiaoqi Zhao
Tian-Zhu Xiang
Lihe Zhang
Huchuan Lu
19
24
0
31 Oct 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Yizhou Yu
Chuan Wu
ViT
44
36
0
30 Oct 2023
Pixel-Level Clustering Network for Unsupervised Image Segmentation
Cuong Manh Hoang
Byeongkeun Kang
SSeg
20
19
0
24 Oct 2023
G-CASCADE: Efficient Cascaded Graph Convolutional Decoding for 2D Medical Image Segmentation
Md Mostafijur Rahman
R. Marculescu
MedIm
29
33
0
24 Oct 2023
Minimalist and High-Performance Semantic Segmentation with Plain Vision Transformers
Yuanduo Hong
Jue Wang
Weichao Sun
Huihui Pan
VLM
ViT
37
7
0
19 Oct 2023
Medical Image Segmentation via Sparse Coding Decoder
Long Zeng
Kaigui Wu
MedIm
26
3
0
17 Oct 2023
Multimodal Variational Auto-encoder based Audio-Visual Segmentation
Yuxin Mao
Jing Zhang
Mochu Xiang
Yiran Zhong
Yuchao Dai
37
34
0
12 Oct 2023
Distilling Efficient Vision Transformers from CNNs for Semantic Segmentation
Xueye Zheng
Yunhao Luo
Pengyuan Zhou
Lin Wang
35
13
0
11 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
34
4
0
10 Oct 2023
Cross-modal Cognitive Consensus guided Audio-Visual Segmentation
Zhaofeng Shi
Qingbo Wu
Fanman Meng
Linfeng Xu
Hongliang Li
VOS
30
3
0
10 Oct 2023
Plug n' Play: Channel Shuffle Module for Enhancing Tiny Vision Transformers
Xuwei Xu
Sen Wang
Yudong Chen
Jiajun Liu
ViT
21
1
0
09 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
45
3
0
08 Oct 2023
LumiNet: The Bright Side of Perceptual Knowledge Distillation
Md. Ismail Hossain
M. M. L. Elahi
Sameera Ramasinghe
A. Cheraghian
Fuad Rahman
Nabeel Mohammed
Shafin Rahman
29
1
0
05 Oct 2023
SeisT: A foundational deep learning model for earthquake monitoring tasks
Sen Li
Xu Yang
Anye Cao
Changbin Wang
Yaoqi Liu
Yapeng Liu
Qiang Niu
28
3
0
02 Oct 2023
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Junsong Chen
Jincheng Yu
Chongjian Ge
Lewei Yao
Enze Xie
...
Zhongdao Wang
James T. Kwok
Ping Luo
Huchuan Lu
Zhenguo Li
DiffM
28
391
0
30 Sep 2023
UVL: A Unified Framework for Video Tampering Localization
Tingliang Feng
Xianfeng Zhao
Jinchuan Li
Yun Cao
AAML
21
0
0
28 Sep 2023
Beyond Grids: Exploring Elastic Input Sampling for Vision Transformers
Adam Pardyl
Grzegorz Kurzejamski
Jan Olszewski
Tomasz Trzciński
Bartosz Zieliński
28
1
0
23 Sep 2023
UniHead: Unifying Multi-Perception for Detection Heads
Hantao Zhou
Rui Yang
Yachao Zhang
Haoran Duan
Yawen Huang
R. Hu
Xiu Li
Yefeng Zheng
31
12
0
23 Sep 2023
DualToken-ViT: Position-aware Efficient Vision Transformer with Dual Token Fusion
Zhenzhen Chu
Jiayu Chen
Cen Chen
Chengyu Wang
Ziheng Wu
Jun Huang
Weining Qian
ViT
13
2
0
21 Sep 2023
RMT: Retentive Networks Meet Vision Transformers
Qihang Fan
Huaibo Huang
Mingrui Chen
Hongmin Liu
Ran He
ViT
43
75
0
20 Sep 2023
Multi-level feature fusion network combining attention mechanisms for polyp segmentation
Junzhuo Liu
Qiaosong Chen
Ye Zhang
Zihao Wang
Deng Xin
Jin Wang
20
19
0
19 Sep 2023
HiT: Building Mapping with Hierarchical Transformers
Mingming Zhang
Qingjie Liu
Yunhong Wang
ViT
31
6
0
18 Sep 2023
Discovering Sounding Objects by Audio Queries for Audio Visual Segmentation
Shaofei Huang
Han Li
Yuqing Wang
Hongji Zhu
Jiao Dai
Jizhong Han
Wenge Rong
Si Liu
VOS
25
16
0
18 Sep 2023
Efficient Pyramid Channel Attention Network for Pathological Myopia Recognition
Xiaoqing Zhang
Jilu Zhao
Yan Li
Hao Wu
Xiangtian Zhou
Jiang Liu
19
1
0
17 Sep 2023
MMST-ViT: Climate Change-aware Crop Yield Prediction via Multi-Modal Spatial-Temporal Vision Transformer
Fudong Lin
Summer Crawford
Kaleb Guillot
Yihe Zhang
Yan Chen
...
Tri Setiyono
B. Tubana
Lu Peng
Magdy A. Bayoumi
N. Tzeng
42
20
0
16 Sep 2023
Rethinking Cross-Domain Pedestrian Detection: A Background-Focused Distribution Alignment Framework for Instance-Free One-Stage Detectors
Yancheng Cai
Bo-Wen Zhang
Baopu Li
Tao Chen
Hongliang Yan
Jingdong Zhang
Jiahao Xu
ObjD
27
9
0
15 Sep 2023
Salient Object Detection in Optical Remote Sensing Images Driven by Transformer
Gongyang Li
Zhen Bai
Z. G. Liu
Xinpeng Zhang
Haibin Ling
31
41
0
15 Sep 2023
Co-Salient Object Detection with Semantic-Level Consensus Extraction and Dispersion
Peiran Xu
Yadong Mu
34
7
0
14 Sep 2023
Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?
Bill Psomas
Ioannis Kakogeorgiou
Konstantinos Karantzalos
Yannis Avrithis
ViT
38
8
0
13 Sep 2023
Feature Aggregation Network for Building Extraction from High-resolution Remote Sensing Images
Xuan Zhou
Xuefeng Wei
27
2
0
12 Sep 2023