Swin Transformer V2: Scaling Up Capacity and Resolution


18 November 2021
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
Yixuan Wei
Jia Ning
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
    ViT
ArXiv (abs) | PDF | HTML | GitHub (14834★)

Papers citing "Swin Transformer V2: Scaling Up Capacity and Resolution"

50 / 840 papers shown
Title
AeroGPT: Leveraging Large-Scale Audio Model for Aero-Engine Bearing Fault Diagnosis
Jiale Liu
Dandan Peng
Huan Wang
Chenyu Liu
Yan-Fu Li
Min Xie
5
0
0
19 Jun 2025
A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects
Guohuan Xie
Syed Ariff Syed Hesham
Wenya Guo
Bing Li
Ming-Ming Cheng
Guolei Sun
Yun-Hai Liu
26
0
0
16 Jun 2025
LARGO: Low-Rank Regulated Gradient Projection for Robust Parameter Efficient Fine-Tuning
Haotian Zhang
Liu Liu
Baosheng Yu
Jiayan Qiu
Yanwei Ren
Xianglong Liu
13
0
0
14 Jun 2025
SemanticSplat: Feed-Forward 3D Scene Understanding with Language-Aware Gaussian Fields
Qijing Li
Jingxiang Sun
Liang An
Zhaoqi Su
Hongwen Zhang
Yebin Liu
52
1
0
11 Jun 2025
Canonical Latent Representations in Conditional Diffusion Models
Yitao Xu
Tong Zhang
Ehsan Pajouheshgar
Sabine Süsstrunk
DiffM
77
0
0
11 Jun 2025
DeepTraverse: A Depth-First Search Inspired Network for Algorithmic Visual Understanding
Bin Guo
John H.L. Hansen
68
0
0
11 Jun 2025
MedChat: A Multi-Agent Framework for Multimodal Diagnosis with Large Language Models
Philip R. Liu
Sparsh Bansal
Jimmy Dinh
Aditya Pawar
Ramani Satishkumar
Shail Desai
Neeraj Gupta
X. Wang
S. Hu
LM&MA
45
0
0
09 Jun 2025
Can Foundation Models Generalise the Presentation Attack Detection Capabilities on ID Cards?
Juan E. Tapia
Christoph Busch
80
0
0
05 Jun 2025
FuXi-Ocean: A Global Ocean Forecasting System with Sub-Daily Resolution
Qiusheng Huang
Yuan Niu
Xiaohui Zhong
Anboyu Guo
Lei Chen
Dianjun Zhang
Xuefeng Zhang
Hao Li
AI4Cl
31
0
0
03 Jun 2025
RoadFormer: Local-Global Feature Fusion for Road Surface Classification in Autonomous Driving
Tianze Wang
Zhang Zhang
Chao Sun
39
0
0
03 Jun 2025
Learning Sparsity for Effective and Efficient Music Performance Question Answering
Xingjian Diao
Tianzhen Yang
Chunhui Zhang
Weiyi Wu
Ming Cheng
Jiang Gui
60
1
0
02 Jun 2025
PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations
Benjamin Holzschuh
Qiang Liu
Georg Kohl
Nils Thuerey
AI4CE
30
1
0
30 May 2025
Hallo4: High-Fidelity Dynamic Portrait Animation via Direct Preference Optimization and Temporal Motion Modulation
Jiahao Cui
Yan Chen
Mingwang Xu
Hanlin Shang
Yuxuan Chen
Yun Zhan
Zilong Dong
Yao Yao
Jingdong Wang
Siyu Zhu
DiffM VGen
60
0
0
29 May 2025
FeatInv: Spatially resolved mapping from feature space to input space using conditional diffusion models
Nils Neukirch
Johanna Vielhaben
Nils Strodthoff
DiffM
36
0
0
27 May 2025
The Missing Point in Vision Transformers for Universal Image Segmentation
Sajjad Shahabodini
Mobina Mansoori
Farnoush Bayatmakou
J. Abouei
Konstantinos N. Plataniotis
Arash Mohammadi
ViT ISeg
26
0
0
26 May 2025
Asymmetric Duos: Sidekicks Improve Uncertainty
Tim G. Zhou
Evan Shelhamer
Geoff Pleiss
UQCV
51
0
0
24 May 2025
Mahalanobis++: Improving OOD Detection via Feature Normalization
Maximilian Mueller
Matthias Hein
OODD
127
1
0
23 May 2025
EGFormer: Towards Efficient and Generalizable Multimodal Semantic Segmentation
Zelin Zhang
Tao Zhang
KediLI
Xu Zheng
71
0
0
20 May 2025
RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection
Wenjun Hou
Yi Cheng
Kaishuai Xu
Heng Li
Yan Hu
Wenjie Li
Jiang Liu
103
0
0
20 May 2025
Mamba-Adaptor: State Space Model Adaptor for Visual Recognition
Fei Xie
Jiahao Nie
Yujin Tang
W. Zhang
Hongshen Zhao
Mamba
142
0
0
19 May 2025
TiMo: Spatiotemporal Foundation Model for Satellite Image Time Series
Xiaolei Qin
Di Wang
Jing Zhang
Fengxiang Wang
Xin Su
Bo Du
Liangpei Zhang
AI4TS
114
0
0
13 May 2025
SynID: Passport Synthetic Dataset for Presentation Attack Detection
Juan E. Tapia
Fabian Stockhardt
Lázaro J. González Soler
Christoph Busch
142
1
0
12 May 2025
Adapting a Segmentation Foundation Model for Medical Image Classification
Pengfei Gu
Haoteng Tang
Islam A. Ebeid
Jose Angel Nuñez
Fabian Vazquez
Diego Adame
Marcus Zhan
Huimin Li
Bin Fu
Danny Chen
MedIm VLM
69
0
0
09 May 2025
ORBIT-2: Scaling Exascale Vision Foundation Models for Weather and Climate Downscaling
Xiao Wang
Jong Youl Choi
Takuya Kurihaya
Isaac Lyngaas
Hong-Jun Yoon
...
Dali Wang
Peter Thornton
Prasanna Balaprakash
M. Ashfaq
Dan Lu
58
0
0
07 May 2025
Stow: Robotic Packing of Items into Fabric Pods
Nicolas Hudson
Josh Hooks
Rahul Warrier
Curt Salisbury
Ross Hartley
...
Christine Fuller
Alex Keklak
Alex Frenkel
Lillian J. Ratliff
Aaron Parness
76
1
0
07 May 2025
Rethinking Boundary Detection in Deep Learning-Based Medical Image Segmentation
Yi Lin
Dong Zhang
X. B. Fang
Yufan Chen
K.-T. Cheng
Hao Chen
55
0
0
06 May 2025
SCOPE-MRI: Bankart Lesion Detection as a Case Study in Data Curation and Deep Learning for Challenging Diagnoses
Sahil Sethi
Sai Reddy
Mansi Sakarvadia
Jordan Serotte
Darlington Nwaudo
Nicholas Maassen
Lewis Shi
82
0
0
29 Apr 2025
Prompt Guiding Multi-Scale Adaptive Sparse Representation-driven Network for Low-Dose CT MAR
Baoshun Shi
Bing Chen
Shaolei Zhang
Huazhu Fu
Zhanli Hu
MedIm
70
0
0
28 Apr 2025
Examining the Impact of Optical Aberrations to Image Classification and Object Detection Models
Patrick Müller
Alexander Braun
Margret Keuper
102
0
0
25 Apr 2025
High-Quality Cloud-Free Optical Image Synthesis Using Multi-Temporal SAR and Contaminated Optical Data
Chenxi Duan
118
0
0
23 Apr 2025
LOOPE: Learnable Optimal Patch Order in Positional Embeddings for Vision Transformers
M. Chowdhury
Md Rifat Ur Rahman
Akil Ahmad Taki
57
0
0
19 Apr 2025
Learning from Noisy Pseudo-labels for All-Weather Land Cover Mapping
Wang Liu
Zhiyu Wang
Xin Guo
Puhong Duan
Xudong Kang
Shutao Li
80
0
0
18 Apr 2025
BeetleVerse: A study on taxonomic classification of ground beetles
S M Rayeed
Alyson East
Samuel Stevens
Sydne Record
Charles V. Stewart
53
0
0
18 Apr 2025
Towards Scale-Aware Low-Light Enhancement via Structure-Guided Transformer Design
Wei Dong
Yan Min
Han Zhou
Jun Chen
ViT
70
0
0
18 Apr 2025
NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: Methods and Results
Xin Li
Kun Yuan
B. Li
Fengbin Guan
Yizhen Shao
...
Guohua Zhang
Z. Huang
Y. Deng
Qingmiao Jiang
Lu Chen
127
15
0
17 Apr 2025
Plain Transformers Can be Powerful Graph Learners
Liheng Ma
Soumyasundar Pal
Yingxue Zhang
Philip Torr
Mark Coates
75
0
0
17 Apr 2025
SAR Object Detection with Self-Supervised Pretraining and Curriculum-Aware Sampling
Yasin Almalioglu
Andrzej Kucik
Geoffrey French
Dafni Antotsiou
Alexander Adam
Cedric Archambeau
77
0
0
17 Apr 2025
Perception Encoder: The best visual embeddings are not at the output of the network
Daniel Bolya
Po-Yao (Bernie) Huang
Peize Sun
Jang Hyun Cho
Andrea Madotto
...
Shiyu Dong
Nikhila Ravi
Daniel Li
Piotr Dollár
Christoph Feichtenhofer
ObjD VOS
327
9
0
17 Apr 2025
Metric-Solver: Sliding Anchored Metric Depth Estimation from a Single Image
Tao Wen
Jiadong Wang
Yuxiao Chen
Shugong Xu
Chi Zhang
Xuelong Li
MDE
119
0
0
16 Apr 2025
SD-ReID: View-aware Stable Diffusion for Aerial-Ground Person Re-Identification
Xiang Hu
Pingping Zhang
Yuhao Wang
Bin Yan
Huchuan Lu
58
0
0
13 Apr 2025
Tokenize Image Patches: Global Context Fusion for Effective Haze Removal in Large Images
Jiuchen Chen
Xinyu Yan
Qizhi Xu
Kaiqi Li
VLM
66
1
0
13 Apr 2025
Hyperlocal disaster damage assessment using bi-temporal street-view imagery and pre-trained vision models
Yifan Yang
Lei Zou
Bing Zhou
Daoyang Li
Binbin Lin
J. Abedin
Mingzheng Yang
54
0
0
12 Apr 2025
Mixture of Group Experts for Learning Invariant Representations
Lei Kang
Jia Li
Mi Tian
Hua Huang
MoE
124
0
0
12 Apr 2025
Heart Failure Prediction using Modal Decomposition and Masked Autoencoders for Scarce Echocardiography Databases
Andrés Bell-Navas
M. Villalba-Orero
Enrique Lara Pezzi
J. Garicano-Mena
S. L. Clainche
215
0
0
10 Apr 2025
Audio-visual Event Localization on Portrait Mode Short Videos
Wuyang Liu
Yi Chai
Yongpeng Yan
Yanzhen Ren
68
1
0
09 Apr 2025
A Robust Real-Time Lane Detection Method with Fog-Enhanced Feature Fusion for Foggy Conditions
Ronghui Zhang
Yuhang Ma
Tengfei Li
Ziyu Lin
Yueying Wu
Junzhou Chen
Lin Zhang
Jia Hu
Tony Z. Qiu
Konghui Guo
146
0
0
08 Apr 2025
EMF: Event Meta Formers for Event-based Real-time Traffic Object Detection
Muhammad Ahmed Ullah Khan
Abdul Hannan Khan
Andreas Dengel
82
0
0
05 Apr 2025
Spline-based Transformers
Prashanth Chandran
Agon Serifi
Markus Gross
Moritz Bächer
154
0
0
03 Apr 2025
Rip Current Segmentation: A Novel Benchmark and YOLOv8 Baseline Results
Andrei Dumitriu
Florin Tatui
Florin Miron
Radu Tudor Ionescu
Radu Timofte
221
24
0
03 Apr 2025
FLAMES: A Hybrid Spiking-State Space Model for Adaptive Memory Retention in Event-Based Learning
Biswadeep Chakraborty
Saibal Mukhopadhyay
163
0
0
02 Apr 2025