Swin Transformer V2: Scaling Up Capacity and Resolution

18 November 2021
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
Yixuan Wei
Jia Ning
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
    ViT
ArXiv (abs) | PDF | HTML | GitHub (14834★)

Papers citing "Swin Transformer V2: Scaling Up Capacity and Resolution"

50 / 840 papers shown
Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks
Wenhui Wang
Hangbo Bao
Li Dong
Johan Bjorck
Zhiliang Peng
...
Kriti Aggarwal
O. Mohammed
Saksham Singhal
Subhojit Som
Furu Wei
MLLM, VLM, ViT
157
645
0
22 Aug 2022
Conv-Adapter: Exploring Parameter Efficient Transfer Learning for ConvNets
Hao Chen
R. Tao
Han Zhang
Yidong Wang
Xiang Li
Weirong Ye
Jindong Wang
Guosheng Hu
Marios Savvides
VPVLM
116
57
0
15 Aug 2022
BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers
Zhiliang Peng
Li Dong
Hangbo Bao
QiXiang Ye
Furu Wei
71
322
0
12 Aug 2022
Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model
Di Wang
Qiming Zhang
Yufei Xu
Jing Zhang
Bo Du
Dacheng Tao
Lefei Zhang
84
257
0
08 Aug 2022
P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting
Ziyi Wang
Xumin Yu
Yongming Rao
Jie Zhou
Jiwen Lu
VPVLM, VLM
93
77
0
04 Aug 2022
Unified Normalization for Accelerating and Stabilizing Transformers
Qiming Yang
Kai Zhang
Chaoxiang Lan
Zhi Yang
Zheyang Li
Wenming Tan
Jun Xiao
Shiliang Pu
70
8
0
02 Aug 2022
giMLPs: Gate with Inhibition Mechanism in MLPs
Cheng Kang
Jindřich Prokop
Lei Tong
Huiyu Zhou
Yong Hu
Daniel Novak
33
0
0
01 Aug 2022
HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
Yongming Rao
Wenliang Zhao
Yansong Tang
Jie Zhou
Ser-Nam Lim
Jiwen Lu
ViT
113
256
0
28 Jul 2022
Visual Recognition by Request
Chufeng Tang
Lingxi Xie
Xiaopeng Zhang
Xiaolin Hu
Qi Tian
VLM
86
15
0
28 Jul 2022
PEA: Improving the Performance of ReLU Networks for Free by Using Progressive Ensemble Activations
Á. Utasi
49
0
0
28 Jul 2022
Multi-Forgery Detection Challenge 2022: Push the Frontier of Unconstrained and Diverse Forgery Detection
Jianshu Li
Man Luo
Jian Liu
Tao Chen
Chengjie Wang
...
Bo Liu
Mingyu Guo
Ying Guo
Y. Ao
Pengfei Gao
25
0
0
27 Jul 2022
Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment
Qiang Chen
Xiaokang Chen
Jian Wang
Shan Zhang
Kun Yao
Haocheng Feng
Junyu Han
Errui Ding
Gang Zeng
Jingdong Wang
ViT
143
135
0
26 Jul 2022
DETRs with Hybrid Matching
Ding Jia
Yuhui Yuan
Hao He
Xiao-pei Wu
Haojun Yu
Weihong Lin
Lei-huan Sun
Chao Zhang
Hanhua Hu
69
198
0
26 Jul 2022
Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering
Yang Liu
Guanbin Li
Liang Lin
LRM
159
87
0
26 Jul 2022
Dive into Big Model Training
Qinghua Liu
Yuxiang Jiang
MoMe, AI4CE, LRM
33
3
0
25 Jul 2022
Applying Spatiotemporal Attention to Identify Distracted and Drowsy Driving with Vision Transformers
Samay Lakhani
ViT, MedIm
40
2
0
22 Jul 2022
Efficient Graph-Friendly COCO Metric Computation for Train-Time Model Evaluation
Luke Wood
François Chollet
26
7
0
21 Jul 2022
TinyViT: Fast Pretraining Distillation for Small Vision Transformers
Kan Wu
Jinnian Zhang
Houwen Peng
Mengchen Liu
Bin Xiao
Jianlong Fu
Lu Yuan
ViT
74
267
0
21 Jul 2022
Vision Transformers: From Semantic Segmentation to Dense Prediction
Li Zhang
Jiachen Lu
Sixiao Zheng
Xinxuan Zhao
Xiatian Zhu
Yanwei Fu
Tao Xiang
Jianfeng Feng
Philip H. S. Torr
ViT
99
8
0
19 Jul 2022
Towards Trustworthy Healthcare AI: Attention-Based Feature Learning for COVID-19 Screening With Chest Radiography
Kai Ma
Pengcheng Xi
K. Habashy
Ashkan Ebadi
Stéphane Tremblay
Alexander Wong
ViT, MedIm
31
1
0
19 Jul 2022
MonoIndoor++: Towards Better Practice of Self-Supervised Monocular Depth Estimation for Indoor Environments
Runze Li
Pan Ji
Yi Tian Xu
B. Bhanu
MDE
64
23
0
18 Jul 2022
Multi-manifold Attention for Vision Transformers
D. Konstantinidis
Ilias Papastratis
K. Dimitropoulos
P. Daras
ViT
94
16
0
18 Jul 2022
Current Trends in Deep Learning for Earth Observation: An Open-source Benchmark Arena for Image Classification
I. Dimitrovski
Ivan Kitanovski
D. Kocev
Nikola Simidjievski
VLM
94
78
0
14 Jul 2022
Rethinking Attention Mechanism in Time Series Classification
Bowen Zhao
Huanlai Xing
Xinhan Wang
Fuhong Song
Zhiwen Xiao
AI4TS
53
35
0
14 Jul 2022
Pyramid Transformer for Traffic Sign Detection
Omid Nejati Manzari
A. Boudesh
S. B. Shokouhi
ViT
67
12
0
13 Jul 2022
MSP-Former: Multi-Scale Projection Transformer for Single Image Desnowing
Sixiang Chen
Tian-Chun Ye
Yun-Peng Liu
Taodong Liao
Y. Ye
Erkang Chen
Peng Chen
ViT
125
54
0
12 Jul 2022
YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Chien-Yao Wang
Alexey Bochkovskiy
H. Liao
ObjD
177
6,636
0
06 Jul 2022
Softmax-free Linear Transformers
Jiachen Lu
Junge Zhang
Xiatian Zhu
Jianfeng Feng
Tao Xiang
Li Zhang
ViT
54
8
0
05 Jul 2022
Spatiotemporal Feature Learning Based on Two-Step LSTM and Transformer for CT Scans
Chih-Chung Hsu
Chin-Han Tsai
Guangfeng Chen
Sin-Di Ma
Shen-Chieh Tai
MedIm
48
9
0
04 Jul 2022
Woodscape Fisheye Object Detection for Autonomous Driving -- CVPR 2022 OmniCV Workshop Challenge
Saravanabalagi Ramachandran
Ganesh Sistu
V. Kumar
J. McDonald
S. Yogamani
73
5
0
26 Jun 2022
LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs
Yukang Chen
Jianhui Liu
Xinming Zhang
Xiaojuan Qi
Jiaya Jia
124
90
0
21 Jun 2022
HOPE: Hierarchical Spatial-temporal Network for Occupancy Flow Prediction
Yi Hu
Wenxin Shao
Bo Jiang
Jiajie Chen
Siqi Chai
Zhening Yang
Jingyu Qian
Helong Zhou
Qiang Liu
AI4CE
76
14
0
21 Jun 2022
Global Context Vision Transformers
Ali Hatamizadeh
Hongxu Yin
Greg Heinrich
Jan Kautz
Pavlo Molchanov
ViT
76
129
0
20 Jun 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
121
35
0
19 Jun 2022
Enhanced Bi-directional Motion Estimation for Video Frame Interpolation
Xin Jin
Longhai Wu
Guotao Shen
Youxin Chen
Jie Chen
Jayoon Koo
Cheul-hee Hahm
52
23
0
17 Jun 2022
Rectify ViT Shortcut Learning by Visual Saliency
Chong Ma
Lin Zhao
Yuzhong Chen
David Liu
Xi Jiang
Tuo Zhang
Xintao Hu
Dinggang Shen
Dajiang Zhu
Tianming Liu
ViT
108
20
0
17 Jun 2022
OmniMAE: Single Model Masked Pretraining on Images and Videos
Rohit Girdhar
Alaaeldin El-Nouby
Mannat Singh
Kalyan Vasudev Alwala
Armand Joulin
Ishan Misra
ViT
108
99
0
16 Jun 2022
ChordMixer: A Scalable Neural Attention Model for Sequences with Different Lengths
Ruslan Khalitov
Tong Yu
Lei Cheng
Zhirong Yang
87
13
0
12 Jun 2022
On Data Scaling in Masked Image Modeling
Zhenda Xie
Zheng Zhang
Yue Cao
Yutong Lin
Yixuan Wei
Qi Dai
Han Hu
100
57
0
09 Jun 2022
CASS: Cross Architectural Self-Supervision for Medical Image Analysis
Pranav Singh
E. Sizikova
Jacopo Cirrone
OOD
173
8
0
08 Jun 2022
Tutel: Adaptive Mixture-of-Experts at Scale
Changho Hwang
Wei Cui
Yifan Xiong
Ziyue Yang
Ze Liu
...
Joe Chau
Peng Cheng
Fan Yang
Mao Yang
Y. Xiong
MoE
195
123
0
07 Jun 2022
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation
Feng Li
Hao Zhang
Hu-Sheng Xu
Siyi Liu
Lei Zhang
L. Ni
H. Shum
ISeg
149
392
0
06 Jun 2022
EfficientFormer: Vision Transformers at MobileNet Speed
Yanyu Li
Geng Yuan
Yang Wen
Eric Hu
Georgios Evangelidis
Sergey Tulyakov
Yanzhi Wang
Jian Ren
ViT
140
371
0
02 Jun 2022
KPGT: Knowledge-Guided Pre-training of Graph Transformer for Molecular Property Prediction
Han Li
Dan Zhao
Jianyang Zeng
82
64
0
02 Jun 2022
Decomposing NeRF for Editing via Feature Field Distillation
Sosuke Kobayashi
Eiichi Matsumoto
Vincent Sitzmann
266
343
0
31 May 2022
Exploring Advances in Transformers and CNN for Skin Lesion Diagnosis on Small Datasets
Leandro M. de Lima
R. Krohling
ViT, MedIm
68
11
0
30 May 2022
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning
Aniket Didolkar
Kshitij Gupta
Anirudh Goyal
Nitesh B. Gundavarapu
Alex Lamb
Nan Rosemary Ke
Yoshua Bengio
AI4CE
200
18
0
30 May 2022
Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation
Yixuan Wei
Han Hu
Zhenda Xie
Zheng Zhang
Yue Cao
Jianmin Bao
Dong Chen
B. Guo
CLIP
158
128
0
27 May 2022
How Tempering Fixes Data Augmentation in Bayesian Neural Networks
Gregor Bachmann
Lorenzo Noci
Thomas Hofmann
BDL, AAML
125
9
0
27 May 2022
Green Hierarchical Vision Transformer for Masked Image Modeling
Lang Huang
Shan You
Mingkai Zheng
Fei Wang
Chao Qian
T. Yamasaki
125
72
0
26 May 2022