Swin Transformer V2: Scaling Up Capacity and Resolution

18 November 2021
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
Yixuan Wei
Jia Ning
Yue Cao
Zheng-Wei Zhang
Li Dong
Furu Wei
B. Guo
    ViT

Papers citing "Swin Transformer V2: Scaling Up Capacity and Resolution"

50 / 823 papers shown
Making Vision Transformers Truly Shift-Equivariant
Renan A. Rojas-Gomez
Teck-Yian Lim
Minh N. Do
Raymond A. Yeh
ViT
36
7
0
25 May 2023
Reversible Graph Neural Network-based Reaction Distribution Learning for Multiple Appropriate Facial Reactions Generation
Tong Xu
Micol Spitale
Haozhan Tang
Lu Liu
Hatice Gunes
Siyang Song
CVBM
34
10
0
24 May 2023
A Study on Deep CNN Structures for Defect Detection From Laser Ultrasonic Visualization Testing Images
Miya Nakajima
Takahiro Saitoh
Tsuyoshi Kato
37
2
0
23 May 2023
Efficient Large-Scale Visual Representation Learning And Evaluation
Eden Dolev
A. Awad
Denisa Roberts
Zahra Ebrahimzadeh
Marcin Mejran
Vaibhav Malpani
Mahir Yavuz
45
0
0
22 May 2023
Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design
Ibrahim M. Alabdulmohsin
Xiaohua Zhai
Alexander Kolesnikov
Lucas Beyer
VLM
42
58
0
22 May 2023
A Dual-level Detection Method for Video Copy Detection
Tianyi Wang
Feipeng Ma
Zhenhua Liu
Fengyun Rao
27
3
0
21 May 2023
PointGPT: Auto-regressively Generative Pre-training from Point Clouds
Guang-Sheng Chen
Meiling Wang
Yi Yang
Kai Yu
Li-xin Yuan
Yufeng Yue
3DPC
24
80
0
19 May 2023
Reciprocal Attention Mixing Transformer for Lightweight Image Restoration
Haram Choi
Cheolwoong Na
Jihyeon Oh
Seungjae Lee
Jinseop S. Kim
Subeen Choe
Jeongmin Lee
Taehoon Kim
Jihoon Yang
51
5
0
19 May 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Peng Wang
Shijie Wang
Junyang Lin
Shuai Bai
Xiaohuan Zhou
Jingren Zhou
Xinggang Wang
Chang Zhou
VLM
MLLM
ObjD
48
115
0
18 May 2023
Boost Vision Transformer with GPU-Friendly Sparsity and Quantization
Chong Yu
Tao Chen
Zhongxue Gan
Jiayuan Fan
MQ
ViT
30
23
0
18 May 2023
ICDAR 2023 Competition on Hierarchical Text Detection and Recognition
Shangbang Long
Siyang Qin
Dmitry Panteleev
Alessandro Bissacco
Yasuhisa Fujii
Michalis Raptis
VLM
45
17
0
16 May 2023
PanelNet: Understanding 360 Indoor Environment via Panel Representation
Haozheng Yu
Lu He
B. Jian
Weiwei Feng
Shanghua Liu
MDE
3DV
53
17
0
16 May 2023
A Comprehensive Survey on Segment Anything Model for Vision and Beyond
Chunhui Zhang
Li Liu
Yawen Cui
Guanjie Huang
Weilin Lin
Yiqian Yang
Yuehong Hu
VLM
43
90
0
14 May 2023
OneCAD: One Classifier for All image Datasets using multimodal learning
S. Wadekar
Eugenio Culurciello
40
0
0
11 May 2023
EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention
Xinyu Liu
Houwen Peng
Ningxin Zheng
Yuqing Yang
Han Hu
Yixuan Yuan
ViT
25
277
0
11 May 2023
Mlinear: Rethink the Linear Model for Time-series Forecasting
Wei Li
Xiangxu Meng
Chuhao Chen
Jianing Chen
AI4TS
26
6
0
08 May 2023
Understanding Gaussian Attention Bias of Vision Transformers Using Effective Receptive Fields
Bum Jun Kim
Hyeyeon Choi
Hyeonah Jang
Sang Woo Kim
ViT
32
3
0
08 May 2023
DBAT: Dynamic Backward Attention Transformer for Material Segmentation with Cross-Resolution Patches
Yuwen Heng
S. Dasmahapatra
Hansung Kim
24
1
0
06 May 2023
Adversarially-Guided Portrait Matting
Sergej Chicherin
Karen Efremyan
28
3
0
04 May 2023
Cross-Shaped Windows Transformer with Self-supervised Pretraining for Clinically Significant Prostate Cancer Detection in Bi-parametric MRI
Yuheng Li
Jacob F. Wynne
Jing Wang
Richard L. J. Qiu
J. Roper
...
A. Jani
Tian Liu
P. Patel
H. Mao
Xiaofeng Yang
OOD
ViT
MedIm
30
10
0
30 Apr 2023
Advancing Ischemic Stroke Diagnosis: A Novel Two-Stage Approach for Blood Clot Origin Identification
Koushik Sivarama Krishnan
P. J. J. Nikesh
Swathi Gnanasekar
Karthik Sivarama Krishnan
24
0
0
26 Apr 2023
From Chaos Comes Order: Ordering Event Representations for Object Recognition and Detection
Nikola Zubić
Daniel Gehrig
Mathias Gehrig
Davide Scaramuzza
AI4TS
30
36
0
26 Apr 2023
A Strong and Reproducible Object Detector with Only Public Datasets
Tianhe Ren
Jianwei Yang
Siyi Liu
Ailing Zeng
Feng Li
Hao Zhang
Hongyang Li
Zhaoyang Zeng
Lei Zhang
ObjD
41
11
0
25 Apr 2023
SwinFSR: Stereo Image Super-Resolution using SwinIR and Frequency Domain Knowledge
Ke-Jia Chen
Liangyan Li
Huan Liu
Yunzhe Li
Congling Tang
Jun Chen
37
14
0
25 Apr 2023
Hint-Aug: Drawing Hints from Foundation Vision Transformers Towards Boosted Few-Shot Parameter-Efficient Tuning
Zhongzhi Yu
Shang Wu
Y. Fu
Shunyao Zhang
Yingyan Lin
33
6
0
25 Apr 2023
LipsFormer: Introducing Lipschitz Continuity to Vision Transformers
Xianbiao Qi
Jianan Wang
Yihao Chen
Yukai Shi
Lei Zhang
46
16
0
19 Apr 2023
Transformer-Based Visual Segmentation: A Survey
Xiangtai Li
Henghui Ding
Haobo Yuan
Wenwei Zhang
Jiangmiao Pang
Guangliang Cheng
Kai-xiang Chen
Ziwei Liu
Chen Change Loy
ViT
MedIm
42
132
0
19 Apr 2023
MMDR: A Result Feature Fusion Object Detection Approach for Autonomous System
Wendong Zhang
27
0
0
19 Apr 2023
A Data-Centric Solution to NonHomogeneous Dehazing via Vision Transformer
Yangyi Liu
Huan Liu
Liangyan Li
Zijun Wu
Jun Chen
32
15
0
16 Apr 2023
EGformer: Equirectangular Geometry-biased Transformer for 360 Depth Estimation
Ilwi Yun
Chanyong Shin
Hyunku Lee
Hyuk-Jae Lee
Chae-Eun Rhee
ViT
MDE
32
17
0
16 Apr 2023
The Second Monocular Depth Estimation Challenge
Jaime Spencer
Chao Qian
Michaela Trescakova
Chris Russell
Simon Hadfield
...
Guangkai Xu
Wei Yin
Jun Yu
Qi Zhang
Chaoqiang Zhao
MDE
32
11
0
14 Apr 2023
Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding
Yu-Qi Yang
Yu-Xiao Guo
Jiangfeng Xiong
Yang Liu
Hao Pan
Peng-Shuai Wang
Xin Tong
B. Guo
ViT
30
77
0
14 Apr 2023
SpectFormer: Frequency and Attention is what you need in a Vision Transformer
Badri N. Patro
Vinay P. Namboodiri
Vijay Srinivas Agneeswaran
ViT
35
47
0
13 Apr 2023
RECLIP: Resource-efficient CLIP by Training with Small Images
Runze Li
Dahun Kim
B. Bhanu
Weicheng Kuo
VLM
CLIP
36
13
0
12 Apr 2023
Hard Patches Mining for Masked Image Modeling
Haochen Wang
Kaiyou Song
Junsong Fan
Yuxi Wang
Jin Xie
Zhaoxiang Zhang
37
59
0
12 Apr 2023
Detection Transformer with Stable Matching
Siyi Liu
Tianhe Ren
Jia-Yu Chen
Zhaoyang Zeng
Hao Zhang
...
Hongyang Li
Jun Huang
Hang Su
Jun Zhu
Lei Zhang
33
34
0
10 Apr 2023
Micron-BERT: BERT-based Facial Micro-Expression Recognition
Xuan-Bac Nguyen
C. Duong
Xin Li
Susan Gauch
Han-Seok Seo
Khoa Luu
33
49
0
06 Apr 2023
PointCAT: Cross-Attention Transformer for point cloud
Xincheng Yang
Mingze Jin
Weiji He
Qian Chen
3DPC
ViT
27
3
0
06 Apr 2023
BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation
Jun-ho Park
Jintae Kim
Chang-Su Kim
30
20
0
05 Apr 2023
BugNIST -- a Large Volumetric Dataset for Object Detection under Domain Shift
Patrick M. Jensen
Anders B. Dahl
Carsten Gundlach
Rebecca J Engberg
H. Kjer
Vedrana Andersen Dahl
36
1
0
04 Apr 2023
Exploration of Lightweight Single Image Denoising with Transformers and Truly Fair Training
Haram Choi
Cheolwoong Na
Jinseop S. Kim
Jihoon Yang
ViT
46
3
0
04 Apr 2023
Exploring Vision-Language Models for Imbalanced Learning
Yidong Wang
Zhuohao Yu
Jindong Wang
Qiang Heng
Haoxing Chen
Wei Ye
Rui Xie
Xingxu Xie
Shi-Bo Zhang
VLM
46
30
0
04 Apr 2023
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer
Xuanyao Chen
Zhijian Liu
Haotian Tang
Li Yi
Hang Zhao
Song Han
ViT
29
46
0
30 Mar 2023
VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Limin Wang
Bingkun Huang
Zhiyu Zhao
Zhan Tong
Yinan He
Yi Wang
Yali Wang
Yu Qiao
VGen
71
329
0
29 Mar 2023
Scalable, Detailed and Mask-Free Universal Photometric Stereo
Satoshi Ikehata
31
31
0
28 Mar 2023
SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications
Abdelrahman M. Shaker
Muhammad Maaz
H. Rasheed
Salman Khan
Ming Yang
Fahad Shahbaz Khan
ViT
50
84
0
27 Mar 2023
Vision Transformer with Quadrangle Attention
Qiming Zhang
Jing Zhang
Yufei Xu
Dacheng Tao
ViT
24
38
0
27 Mar 2023
CP-CNN: Core-Periphery Principle Guided Convolutional Neural Network
Lin Zhao
Haixing Dai
Zihao Wu
Dajiang Zhu
Tianming Liu
38
1
0
27 Mar 2023
Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers
Cong Wei
Brendan Duke
R. Jiang
P. Aarabi
Graham W. Taylor
Florian Shkurti
ViT
46
14
0
24 Mar 2023
WM-MoE: Weather-aware Multi-scale Mixture-of-Experts for Blind Adverse Weather Removal
Yulin Luo
Rui Zhao
Xi Wei
Jinwei Chen
Yijie Lu
Shenghao Xie
Tianyu Wang
Ruiqin Xiong
Ming Lu
Shanghang Zhang
31
3
0
24 Mar 2023