2106.13797
PVT v2: Improved Baselines with Pyramid Vision Transformer
25 June 2021
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
AI4TS
Papers citing
"PVT v2: Improved Baselines with Pyramid Vision Transformer"
50 / 551 papers shown
A High-Frequency Focused Network for Lightweight Single Image Super-Resolution
Xiaotian Weng
Yi Chen
Zhichao Zheng
Yanhui Gu
Junsheng Zhou
Yudong Zhang
23
0
0
21 Mar 2023
Robustifying Token Attention for Vision Transformers
Yong Guo
David Stutz
Bernt Schiele
ViT
21
24
0
20 Mar 2023
Towards Diverse Binary Segmentation via A Simple yet General Gated Network
Xiaoqi Zhao
Youwei Pang
Lihe Zhang
Huchuan Lu
Lei Zhang
28
14
0
18 Mar 2023
Resolution Enhancement Processing on Low Quality Images Using Swin Transformer Based on Interval Dense Connection Strategy
Ruikang Ju
Chih-Chia Chen
Jen-Shiun Chiang
Yu-Shian Lin
Wei-Han Chen
Chun-Tse Chien
27
17
0
16 Mar 2023
Large Selective Kernel Network for Remote Sensing Object Detection
Yuxuan Li
Qibin Hou
Zhaohui Zheng
Mingmei Cheng
Jian Yang
Xiang Li
ObjD
26
240
0
16 Mar 2023
BiFormer: Vision Transformer with Bi-Level Routing Attention
Lei Zhu
Xinjiang Wang
Zhanghan Ke
Wayne Zhang
Rynson W. H. Lau
131
480
0
15 Mar 2023
Guided Slot Attention for Unsupervised Video Object Segmentation
Minhyeok Lee
Suhwan Cho
Dogyoon Lee
Chaewon Park
Jungho Lee
Sangyoun Lee
VOS
58
10
0
15 Mar 2023
Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling
Yongshuai Huang
Ning Lu
Dapeng Chen
Yibo Li
Zecheng Xie
Shenggao Zhu
Liangcai Gao
Wei Peng
30
26
0
13 Mar 2023
Point Cloud Classification Using Content-based Transformer via Clustering in Feature Space
Yahui Liu
Bin Wang
Yisheng Lv
Lingxi Li
Feiyue Wang
ViT
3DPC
17
43
0
08 Mar 2023
Pyramid Pixel Context Adaption Network for Medical Image Classification with Supervised Contrastive Learning
Xiaoqin Zhang
Zunjie Xiao
Xiao Wu
Jiansheng Fang
Junyong Shen
Yan Hu
Jiang-Dong Liu
31
10
0
03 Mar 2023
Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation
Guozhen Zhang
Yuhan Zhu
Hongya Wang
Youxin Chen
Gangshan Wu
Limin Wang
71
84
0
01 Mar 2023
Memory-aided Contrastive Consensus Learning for Co-salient Object Detection
Peng Zheng
Jie Qin
Shuo Wang
Tian-Zhu Xiang
Huan Xiong
30
17
0
28 Feb 2023
MedViT: A Robust Vision Transformer for Generalized Medical Image Classification
Omid Nejati Manzari
Hamid Ahmadabadi
Hossein Kashiani
S. B. Shokouhi
Ahmad Ayatollahi
ViT
MedIm
31
177
0
19 Feb 2023
Efficient Attention via Control Variates
Lin Zheng
Jianbo Yuan
Chong-Jun Wang
Lingpeng Kong
31
18
0
09 Feb 2023
SwinCross: Cross-modal Swin Transformer for Head-and-Neck Tumor Segmentation in PET/CT Images
Gary Y. Li
Junyu Chen
Se-In Jang
Kuang Gong
Quanzheng Li
ViT
MedIm
46
14
0
08 Feb 2023
UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers
Dachuan Shi
Chaofan Tao
Ying Jin
Zhendong Yang
Chun Yuan
Jiaqi Wang
VLM
ViT
23
38
0
31 Jan 2023
Audio-Visual Segmentation with Semantics
Jinxing Zhou
Xuyang Shen
Jianyuan Wang
Jiayi Zhang
Weixuan Sun
...
Stan Birchfield
Dan Guo
Lingpeng Kong
Meng Wang
Yiran Zhong
VOS
46
37
0
30 Jan 2023
Out of Distribution Performance of State of Art Vision Model
Salman Rahman
W. Lee
37
2
0
25 Jan 2023
Champion Solution for the WSDM2023 Toloka VQA Challenge
Sheng Gao
Zhe Chen
Guo Chen
Wenhai Wang
Tong Lu
49
2
0
22 Jan 2023
FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer
Zhijian Liu
Xinyu Yang
Haotian Tang
Shang Yang
Song Han
29
64
0
20 Jan 2023
Poses of People in Art: A Data Set for Human Pose Estimation in Digital Art History
Stefanie Schneider
Ricarda Vollmer
3DH
30
5
0
12 Jan 2023
Head-Free Lightweight Semantic Segmentation with Linear Transformer
B. Dong
Pichao Wang
Fan Wang
ViT
22
66
0
11 Jan 2023
Dynamic Background Reconstruction via MAE for Infrared Small Target Detection
Jingchao Peng
Haitao Zhao
Kaijie Zhao
Zhongze Wang
Lujian Yao
14
2
0
11 Jan 2023
Vision Transformers Are Good Mask Auto-Labelers
Shiyi Lan
Xitong Yang
Zhiding Yu
Zuxuan Wu
J. Álvarez
Anima Anandkumar
ISeg
ViT
MedIm
24
19
0
10 Jan 2023
Rethinking Mobile Block for Efficient Attention-based Models
Jiangning Zhang
Xiangtai Li
Jian Li
Liang Liu
Zhucun Xue
Boshen Zhang
Zhe Jiang
Tianxin Huang
Yabiao Wang
Chengjie Wang
MQ
44
90
0
03 Jan 2023
Swin MAE: Masked Autoencoders for Small Datasets
Zián Xu
Yin Dai
Fayu Liu
Weibin Chen
Yue Liu
Li-Li Shi
Sheng Liu
Yuhang Zhou
SyDa
MedIm
ViT
36
28
0
28 Dec 2022
A Close Look at Spatial Modeling: From Attention to Convolution
Xu Ma
Huan Wang
Can Qin
Kunpeng Li
Xing Zhao
Jie Fu
Yun Fu
ViT
3DPC
25
11
0
23 Dec 2022
What Makes for Good Tokenizers in Vision Transformer?
Shengju Qian
Yi Zhu
Wenbo Li
Mu Li
Jiaya Jia
ViT
37
14
0
21 Dec 2022
DuAT: Dual-Aggregation Transformer Network for Medical Image Segmentation
Feilong Tang
Q. Huang
Jinfeng Wang
Xianxu Hou
Jionglong Su
Jingxin Liu
ViT
MedIm
32
49
0
21 Dec 2022
Full Contextual Attention for Multi-resolution Transformers in Semantic Segmentation
Loic Themyr
Clément Rambour
Nicolas Thome
Toby Collins
Alexandre Hostettler
ViT
27
10
0
15 Dec 2022
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
Chenhongyi Yang
Jiarui Xu
Shalini De Mello
Elliot J. Crowley
Xinyu Wang
ViT
38
21
0
13 Dec 2022
FastMIM: Expediting Masked Image Modeling Pre-training for Vision
Jianyuan Guo
Kai Han
Han Wu
Yehui Tang
Yunhe Wang
Chang Xu
33
9
0
13 Dec 2022
CamoFormer: Masked Separable Attention for Camouflaged Object Detection
Bo Yin
Xuying Zhang
Qibin Hou
Bo Sun
Deng-Ping Fan
Luc Van Gool
28
51
0
10 Dec 2022
BoxPolyp: Boost Generalized Polyp Segmentation Using Extra Coarse Bounding Box Annotations
JunChao Wei
Yiwen Hu
Guanbin Li
Shuguang Cui
S. Kevin Zhou
Z. Li
27
16
0
07 Dec 2022
IncepFormer: Efficient Inception Transformer with Pyramid Pooling for Semantic Segmentation
Lihua Fu
Haoyue Tian
Xiang Zhai
Pan Gao
Xiaojiang Peng
ViT
27
9
0
06 Dec 2022
Window Normalization: Enhancing Point Cloud Understanding by Unifying Inconsistent Point Densities
Qi Wang
Shengge Shi
Jiahui Li
Wuming Jiang
Xiangde Zhang
12
9
0
05 Dec 2022
Lightweight Structure-Aware Attention for Visual Understanding
Heeseung Kwon
F. M. Castro
M. Marín-Jiménez
N. Guil
Alahari Karteek
28
2
0
29 Nov 2022
Medical Image Segmentation Review: The success of U-Net
Reza Azad
Ehsan Khodapanah Aghdam
Amelie Rauland
Yiwei Jia
Atlas Haddadi Avval
Afshin Bozorgpour
Sanaz Karimijafarbigloo
Joseph Paul Cohen
Ehsan Adeli
Dorit Merhof
SSeg
22
265
0
27 Nov 2022
Spatial Mixture-of-Experts
Nikoli Dryden
Torsten Hoefler
MoE
34
9
0
24 Nov 2022
Fast-iTPN: Integrally Pre-Trained Transformer Pyramid Network with Token Migration
Yunjie Tian
Lingxi Xie
Jihao Qiu
Jianbin Jiao
Yaowei Wang
Qi Tian
Qixiang Ye
ViT
36
6
0
23 Nov 2022
Boundary-aware Camouflaged Object Detection via Deformable Point Sampling
Minhyeok Lee
Suhwan Cho
Chaewon Park
Dogyoon Lee
Jungho Lee
Sangyoun Lee
24
3
0
22 Nov 2022
Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention at Vision Transformer Inference
Haoran You
Yunyang Xiong
Xiaoliang Dai
Bichen Wu
Peizhao Zhang
Haoqi Fan
Peter Vajda
Yingyan Lin
35
31
0
18 Nov 2022
Fcaformer: Forward Cross Attention in Hybrid Vision Transformer
Haokui Zhang
Wenze Hu
Xiaoyu Wang
ViT
19
8
0
14 Nov 2022
BiViT: Extremely Compressed Binary Vision Transformer
Yefei He
Zhenyu Lou
Luoming Zhang
Jing Liu
Weijia Wu
Hong Zhou
Bohan Zhuang
ViT
MQ
20
28
0
14 Nov 2022
Interactive Context-Aware Network for RGB-T Salient Object Detection
Yuxuan Wang
Feng Dong
Jinchao Zhu
24
0
0
11 Nov 2022
Demystify Transformers & Convolutions in Modern Image Deep Networks
Jifeng Dai
Min Shi
Weiyun Wang
Sitong Wu
Linjie Xing
...
Lewei Lu
Jie Zhou
Xiaogang Wang
Yu Qiao
Xiao-hua Hu
ViT
26
11
0
10 Nov 2022
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
Wenhai Wang
Jifeng Dai
Zhe Chen
Zhenhang Huang
Zhiqi Li
...
Tong Lu
Lewei Lu
Hongsheng Li
Xiaogang Wang
Yu Qiao
VLM
36
657
0
10 Nov 2022
RadFormer: Transformers with Global-Local Attention for Interpretable and Accurate Gallbladder Cancer Detection
Soumen Basu
Mayank Gupta
Pratyaksha Rana
Pankaj Gupta
Chetan Arora
ViT
MedIm
18
32
0
09 Nov 2022
Efficient Joint Detection and Multiple Object Tracking with Spatially Aware Transformer
S. S. Nijhawan
Leo Hoshikawa
Atsushi Irie
Masakazu Yoshimura
Junji Otsuka
Takeshi Ohashi
VOT
ViT
29
0
0
09 Nov 2022
MogaNet: Multi-order Gated Aggregation Network
Siyuan Li
Zedong Wang
Zicheng Liu
Cheng Tan
Haitao Lin
Di Wu
Zhiyuan Chen
Jiangbin Zheng
Stan Z. Li
26
56
0
07 Nov 2022