arXiv:2111.09883
Swin Transformer V2: Scaling Up Capacity and Resolution
18 November 2021
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
Yixuan Wei
Jia Ning
Yue Cao
Zheng-Wei Zhang
Li Dong
Furu Wei
B. Guo
ViT
Papers citing
"Swin Transformer V2: Scaling Up Capacity and Resolution"
50 / 823 papers shown
Scale-Aware Modulation Meet Transformer
Wei-Shiang Lin
Ziheng Wu
Jiayu Chen
Jun Huang
Lianwen Jin
MoE
ViT
33
66
0
17 Jul 2023
Active Learning for Object Detection with Non-Redundant Informative Sampling
A. Hekimoglu
A. Brucker
A. Kayalı
Michael Schmidt
Alvaro Marcos-Ramiro
32
1
0
17 Jul 2023
HEAL-SWIN: A Vision Transformer On The Sphere
Oscar Carlsson
Jan E. Gerken
Hampus Linander
Heiner Spiess
F. Ohlsson
Christoffer Petersson
Daniel Persson
ViT
MedIm
29
6
0
14 Jul 2023
Achelous: A Fast Unified Water-surface Panoptic Perception Framework based on Fusion of Monocular Camera and 4D mmWave Radar
Runwei Guan
Shanliang Yao
Xiaohui Zhu
Ka Lok Man
Eng Gee Lim
Jeremy S. Smith
Yong Yue
Yutao Yue
VOS
32
17
0
14 Jul 2023
Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition
Syed Talal Wasim
Muhammad Uzair Khattak
Muzammal Naseer
Salman Khan
M. Shah
Fahad Shahbaz Khan
ViT
54
19
0
13 Jul 2023
Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution
Mostafa Dehghani
Basil Mustafa
Josip Djolonga
Jonathan Heek
Matthias Minderer
...
Avital Oliver
Piotr Padlewski
A. Gritsenko
Mario Lučić
N. Houlsby
ViT
26
105
0
12 Jul 2023
Cross-modal Orthogonal High-rank Augmentation for RGB-Event Transformer-trackers
Zhiyu Zhu
Junhui Hou
Dapeng Wu
ViT
24
28
0
09 Jul 2023
Steel Surface Roughness Parameter Calculations Using Lasers and Machine Learning Models
A. Milne
Xianghua Xie
AI4CE
29
0
0
06 Jul 2023
Art Authentication with Vision Transformers
Ludovica Schaerf
Carina Popovici
Eric Postma
ViT
14
9
0
06 Jul 2023
Multimodal Temporal Fusion Transformers Are Good Product Demand Forecasters
M. Sukel
S. Rudinac
M. Worring
AI4TS
36
1
0
05 Jul 2023
MSViT: Dynamic Mixed-Scale Tokenization for Vision Transformers
Jakob Drachmann Havtorn
Amelie Royer
Tijmen Blankevoort
B. Bejnordi
30
8
0
05 Jul 2023
Review of Large Vision Models and Visual Prompt Engineering
Jiaqi Wang
Zheng Liu
Lin Zhao
Zihao Wu
Chong Ma
...
Bao Ge
Yixuan Yuan
Dinggang Shen
Tianming Liu
Shu Zhang
VLM
LRM
55
147
0
03 Jul 2023
Robust Surgical Tools Detection in Endoscopic Videos with Noisy Data
Adnan Qayyum
Hassan Ali
Massimo Caputo
H. Vohra
Taofeek Akinosho
Sofiat Abioye
Ilhem Berrou
Paweł Capik
Junaid Qadir
Muhammad Bilal
41
0
0
03 Jul 2023
Learning Content-enhanced Mask Transformer for Domain Generalized Urban-Scene Segmentation
Qi Bi
Shaodi You
Theo Gevers
ViT
42
39
0
01 Jul 2023
FedBone: Towards Large-Scale Federated Multi-Task Learning
Yiqiang Chen
Teng Zhang
Xinlong Jiang
Qian Chen
Chenlong Gao
Wuliang Huang
FedML
AI4CE
24
11
0
30 Jun 2023
Leveraging Cross-Utterance Context For ASR Decoding
Robert Flynn
Anton Ragni
33
1
0
29 Jun 2023
BinaryViT: Pushing Binary Vision Transformers Towards Convolutional Models
Phuoc-Hoan Charles Le
Xinlin Li
ViT
MQ
27
21
0
29 Jun 2023
EgoCOL: Egocentric Camera pose estimation for Open-world 3D object Localization @Ego4D challenge 2023
Cristhian Forigua
María Escobar
Jordi Pont-Tuset
Kevis-Kokitsi Maninis
Pablo Arbelaez
EgoV
28
1
0
29 Jun 2023
On Practical Aspects of Aggregation Defenses against Data Poisoning Attacks
Wenxiao Wang
S. Feizi
AAML
32
1
0
28 Jun 2023
RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation based on Visual Foundation Model
Keyan Chen
Chenyang Liu
Hao Chen
Haotian Zhang
Wenyuan Li
Zhengxia Zou
Z. Shi
VLM
21
202
0
28 Jun 2023
Cutting-Edge Techniques for Depth Map Super-Resolution
Ryan Peterson
Josiah W. Smith
SupR
25
0
0
27 Jun 2023
Swin-Free: Achieving Better Cross-Window Attention and Efficiency with Size-varying Window
Jinkyu Koo
John Yang
Le An
Gwenaelle Cunha Sergio
Su Inn Park
ViT
35
0
0
23 Jun 2023
FuXi: A cascade machine learning forecasting system for 15-day global weather forecast
Lei Chen
Xiaohui Zhong
Feng-jun Zhang
Yuan Cheng
Yinghui Xu
Yuan Qi
Hao Li
AI4Cl
28
206
0
22 Jun 2023
Multi-Task Consistency for Active Learning
A. Hekimoglu
Philipp Friedrich
Walter Zimmer
Michael Schmidt
Alvaro Marcos-Ramiro
Alois C. Knoll
VLM
15
10
0
21 Jun 2023
ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining
Dezhi Peng
Chongyu Liu
Yuliang Liu
Lianwen Jin
DiffM
27
9
0
21 Jun 2023
Dynamic Perceiver for Efficient Visual Recognition
Yizeng Han
Dongchen Han
Zeyu Liu
Yulin Wang
Xuran Pan
Yifan Pu
Chaorui Deng
Junlan Feng
S. Song
Gao Huang
32
29
0
20 Jun 2023
NAR-Former V2: Rethinking Transformer for Universal Neural Network Representation Learning
Yun Yi
Haokui Zhang
Rong Xiao
Nan Wang
Xiaoyu Wang
GNN
41
2
0
19 Jun 2023
Towards Stability of Autoregressive Neural Operators
Michael McCabe
P. Harrington
Shashank Subramanian
Jed Brown
AI4CE
44
17
0
18 Jun 2023
PAtt-Lite: Lightweight Patch and Attention MobileNet for Challenging Facial Expression Recognition
Jia Le Ngwe
K. Lim
C. Lee
T. Ong
CVBM
40
11
0
16 Jun 2023
Understanding Optimization of Deep Learning via Jacobian Matrix and Lipschitz Constant
Xianbiao Qi
Jianan Wang
Lei Zhang
21
0
0
15 Jun 2023
Explore In-Context Learning for 3D Point Cloud Understanding
Zhongbin Fang
Xiangtai Li
Xia Li
J. M. Buhmann
Chen Change Loy
Mengyuan Liu
3DPC
33
24
0
14 Jun 2023
Reviving Shift Equivariance in Vision Transformers
Peijian Ding
Davit Soselia
Thomas Armstrong
Jiahao Su
Furong Huang
25
7
0
13 Jun 2023
Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-Training
Lorenzo Baraldi
Roberto Amoroso
Marcella Cornia
Lorenzo Baraldi
Andrea Pilzer
Rita Cucchiara
38
2
0
12 Jun 2023
FalconNet: Factorization for the Light-weight ConvNets
Zhicheng Cai
Qiu Shen
32
11
0
10 Jun 2023
FasterViT: Fast Vision Transformers with Hierarchical Attention
Ali Hatamizadeh
Greg Heinrich
Hongxu Yin
Andrew Tao
J. Álvarez
Jan Kautz
Pavlo Molchanov
ViT
28
68
0
09 Jun 2023
RDumb: A simple approach that questions our progress in continual test-time adaptation
Ori Press
Steffen Schneider
Matthias Kümmerer
Matthias Bethge
TTA
25
28
0
08 Jun 2023
Efficient Multi-Task Scene Analysis with RGB-D Transformers
Söhnke Benedikt Fischedick
Daniel Seichter
Robin M. Schmidt
Leonard Rabes
H. Groß
27
9
0
08 Jun 2023
Genomic Interpreter: A Hierarchical Genomic Deep Neural Network with 1D Shifted Window Transformer
Zehui Li
Akashaditya Das
W. Beardall
Yiren Zhao
Guy-Bart Stan
18
4
0
08 Jun 2023
Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards
Alexandre Ramé
Guillaume Couairon
Mustafa Shukor
Corentin Dancette
Jean-Baptiste Gaya
Laure Soulier
Matthieu Cord
MoMe
35
136
0
07 Jun 2023
GEO-Bench: Toward Foundation Models for Earth Monitoring
Alexandre Lacoste
Nils Lehmann
Pau Rodríguez López
Evan D. Sherwin
Hannah Kerner
...
David Vazquez
Dava Newman
Yoshua Bengio
Stefano Ermon
Xiao Xiang Zhu
SSL
ALM
AI4CE
14
56
0
06 Jun 2023
Industrial Anomaly Detection and Localization Using Weakly-Supervised Residual Transformers
Hanxi Li
Jing Wu
Lin Yuanbo Wu
Hao Chen
Deyin Liu
Mingwen Wang
Peng Wang
ViT
42
4
0
06 Jun 2023
Adversarial alignment: Breaking the trade-off between the strength of an attack and its relevance to human perception
Drew Linsley
Pinyuan Feng
Thibaut Boissin
A. Ashok
Thomas Fel
Stephanie Olaiya
Thomas Serre
AAML
33
6
0
05 Jun 2023
ProTeCt: Prompt Tuning for Taxonomic Open Set Classification
Tz-Ying Wu
Chih-Hui Ho
Nuno Vasconcelos
VLM
10
5
0
04 Jun 2023
Content-aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers
Chenyang Lu
Daan de Geus
Gijs Dubbelman
ViT
27
20
0
03 Jun 2023
Unsupervised Low Light Image Enhancement Using SNR-Aware Swin Transformer
Zhijian Luo
Jiahui Tang
Yueen Hou
Zihan Huang
Yanzeng Gao
ViT
44
1
0
03 Jun 2023
DocFormerv2: Local Features for Document Understanding
Srikar Appalaraju
Peng Tang
Qi Dong
Nishant Sankaran
Yichu Zhou
R. Manmatha
36
39
0
02 Jun 2023
Transformer-based Annotation Bias-aware Medical Image Segmentation
Zehui Liao
Yutong Xie
Shishuai Hu
Yong-quan Xia
MedIm
24
8
0
02 Jun 2023
Dynamic Sparsity Is Channel-Level Sparsity Learner
Lu Yin
Gen Li
Meng Fang
Lijuan Shen
Tianjin Huang
Zhangyang Wang
Vlado Menkovski
Xiaolong Ma
Mykola Pechenizkiy
Shiwei Liu
33
20
0
30 May 2023
Are Large Kernels Better Teachers than Transformers for ConvNets?
Tianjin Huang
Lu Yin
Zhenyu Zhang
Lijuan Shen
Meng Fang
Mykola Pechenizkiy
Zhangyang Wang
Shiwei Liu
38
13
0
30 May 2023
Image Quality Is Not All You Want: Task-Driven Lens Design for Image Classification
Xinge Yang
Qiang Fu
Yunfeng Nie
Wolfgang Heidrich
VLM
29
7
0
26 May 2023