2111.09883
Cited By
v1
v2 (latest)
Swin Transformer V2: Scaling Up Capacity and Resolution
18 November 2021
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
Yixuan Wei
Jia Ning
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
ViT
ArXiv (abs)
PDF
HTML
Github (14834★)
Papers citing
"Swin Transformer V2: Scaling Up Capacity and Resolution"
40 / 840 papers shown
Title
MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
Jihao Liu
Xin Huang
Jinliang Zheng
Yu Liu
Hongsheng Li
59
55
0
26 May 2022
Vision Transformers in 2022: An Update on Tiny ImageNet
Ethan Huynh
ViT
86
11
0
21 May 2022
DProQ: A Gated-Graph Transformer for Protein Complex Structure Assessment
Xiao Chen
Alex Morehead
Jian Liu
Jianlin Cheng
59
7
0
21 May 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li
Wenhai Wang
Lingfeng Yang
Jian Yang
179
75
0
20 May 2022
Vision Transformer Adapter for Dense Predictions
Zhe Chen
Yuchen Duan
Wenhai Wang
Junjun He
Tong Lu
Jifeng Dai
Yu Qiao
158
571
0
17 May 2022
An Effective Transformer-based Solution for RSNA Intracranial Hemorrhage Detection Competition
Fangxin Shang
Siqi Wang
Xiaorong Wang
Yehui Yang
MedIm
34
2
0
16 May 2022
Sequencer: Deep LSTM for Image Classification
Yuki Tatsunami
Masato Taki
VLM
ViT
59
82
0
04 May 2022
Improving the Transferability of Adversarial Examples with Restructure Embedded Patches
Huipeng Zhou
Yu-an Tan
Yajie Wang
Haoran Lyu
Shan-Hung Wu
Yuan-zhang Li
ViT
60
4
0
27 Apr 2022
SUES-200: A Multi-height Multi-scene Cross-view Image Benchmark Across Drone and Satellite
Runzhe Zhu
Ling Yin
Mingze Yang
Fei Wu
Yunchen Yang
Wenbo Hu
67
54
0
22 Apr 2022
Diverse Imagenet Models Transfer Better
Niv Nayman
A. Golbert
Asaf Noy
Tan Ping
Lihi Zelnik-Manor
73
0
0
19 Apr 2022
VSA: Learning Varied-Size Window Attention in Vision Transformers
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
90
57
0
18 Apr 2022
ResT V2: Simpler, Faster and Stronger
Qing-Long Zhang
Yubin Yang
ViT
68
26
0
15 Apr 2022
S4OD: Semi-Supervised learning for Single-Stage Object Detection
Yueming Zhang
Xingxu Yao
Chao-Jung Liu
F. Chen
Xiaolin Song
Tengfei Xing
Runbo Hu
Hua Chai
Pengfei Xu
Guoshan Zhang
ObjD
64
7
0
09 Apr 2022
PP-LiteSeg: A Superior Real-Time Semantic Segmentation Model
Juncai Peng
Yi Liu
Shiyu Tang
Yuying Hao
Lutao Chu
...
Baohua Lai
Qiwen Liu
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
SSeg
VLM
91
148
0
06 Apr 2022
Exploring Plain Vision Transformer Backbones for Object Detection
Yanghao Li
Hanzi Mao
Ross B. Girshick
Kaiming He
ViT
106
818
0
30 Mar 2022
Focal Modulation Networks
Jianwei Yang
Chunyuan Li
Xiyang Dai
Lu Yuan
Jianfeng Gao
3DPC
109
279
0
22 Mar 2022
GroupTransNet: Group Transformer Network for RGB-D Salient Object Detection
Xian Fang
Jin-lei Zhu
Xiuli Shao
Hongpeng Wang
ViT
79
14
0
21 Mar 2022
simCrossTrans: A Simple Cross-Modality Transfer Learning for Object Detection with ConvNets or Vision Transformers
Xiaoke Shen
I. Stamos
ViT
31
5
0
20 Mar 2022
Open Set Recognition using Vision Transformer with an Additional Detection Head
Feiyang Cai
Zhenkai Zhang
Jie Liu
X. Koutsoukos
ViT
39
6
0
16 Mar 2022
Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs
Xiaohan Ding
Xinming Zhang
Yi Zhou
Jungong Han
Guiguang Ding
Jian Sun
VLM
151
554
0
13 Mar 2022
Active Token Mixer
Guoqiang Wei
Zhizheng Zhang
Cuiling Lan
Yan Lu
Zhibo Chen
57
15
0
11 Mar 2022
YouTube-GDD: A challenging gun detection dataset with rich contextual information
Yongxiang Gu
Xingbin Liao
Xiaolin Qin
27
8
0
08 Mar 2022
Dynamic Group Transformer: A General Vision Transformer Backbone with Dynamic Group Attention
Kai Liu
Tianyi Wu
Cong Liu
Guodong Guo
ViT
74
17
0
08 Mar 2022
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
Hao Zhang
Feng Li
Shilong Liu
Lei Zhang
Hang Su
Jun Zhu
L. Ni
H. Shum
ViT
202
1,477
0
07 Mar 2022
Fast Neural Architecture Search for Lightweight Dense Prediction Networks
Lam Huynh
Esa Rahtu
Juan E. Sala Matas
J. Heikkilä
68
2
0
03 Mar 2022
Aggregated Pyramid Vision Transformer: Split-transform-merge Strategy for Image Recognition without Convolutions
Ruikang Ju
Ting-Yu Lin
Jen-Shiun Chiang
Jia-Hao Jian
Yu-Shian Lin
Liu-Rui-Yi Huang
ViT
30
2
0
02 Mar 2022
ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
ViT
118
235
0
21 Feb 2022
Context Autoencoder for Self-Supervised Representation Learning
Xiaokang Chen
Mingyu Ding
Xiaodi Wang
Ying Xin
Shentong Mo
Yunhao Wang
Shumin Han
Ping Luo
Gang Zeng
Jingdong Wang
SSL
171
400
0
07 Feb 2022
DKM: Dense Kernelized Feature Matching for Geometry Estimation
Johan Edstedt
Ioannis Athanasiadis
Mårten Wadenbäck
Michael Felsberg
3DV
MDE
131
129
0
01 Feb 2022
Vision-Based UAV Self-Positioning in Low-Altitude Urban Environments
Ming Dai
E. Zheng
Zhenhua Feng
Jiedong Zhuang
Wankou Yang
79
38
0
23 Jan 2022
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
141
107
0
16 Jan 2022
SeMask: Semantically Masked Transformers for Semantic Segmentation
Jitesh Jain
Anukriti Singh
Nikita Orlov
Zilong Huang
Jiachen Li
Steven Walton
Humphrey Shi
ViT
88
98
0
23 Dec 2021
Masked Feature Prediction for Self-Supervised Visual Pre-Training
Chen Wei
Haoqi Fan
Saining Xie
Chaoxia Wu
Alan Yuille
Christoph Feichtenhofer
ViT
181
672
0
16 Dec 2021
CPPE-5: Medical Personal Protective Equipment Dataset
Rishit Dagli
A. Shaikh
85
12
0
15 Dec 2021
SimMIM: A Simple Framework for Masked Image Modeling
Zhenda Xie
Zheng Zhang
Yue Cao
Yutong Lin
Jianmin Bao
Zhuliang Yao
Qi Dai
Han Hu
231
1,370
0
18 Nov 2021
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Hai-Tao Zheng
Li Tao
Dun Liang
Haitao Zheng
215
100
0
07 Nov 2021
Lightweight Monocular Depth with a Novel Neural Architecture Search Method
Lam Huynh
Phong H. Nguyen
Jiří Matas
Esa Rahtu
J. Heikkilä
70
10
0
25 Aug 2021
VOLO: Vision Outlooker for Visual Recognition
Li-xin Yuan
Qibin Hou
Zihang Jiang
Jiashi Feng
Shuicheng Yan
ViT
128
328
0
24 Jun 2021
Signal Transformer: Complex-valued Attention and Meta-Learning for Signal Recognition
Yihong Dong
Ying Peng
Muqiao Yang
Songtao Lu
Qingjiang Shi
103
9
0
05 Jun 2021
Visformer: The Vision-friendly Transformer
Zhengsu Chen
Lingxi Xie
Jianwei Niu
Xuefeng Liu
Longhui Wei
Qi Tian
ViT
209
223
0
26 Apr 2021