2111.09883
Swin Transformer V2: Scaling Up Capacity and Resolution
18 November 2021
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
Yixuan Wei
Jia Ning
Yue Cao
Zheng-Wei Zhang
Li Dong
Furu Wei
B. Guo
ViT
Papers citing
"Swin Transformer V2: Scaling Up Capacity and Resolution"
50 / 824 papers shown
Probabilistic Image-Driven Traffic Modeling via Remote Sensing
Scott Workman
Armin Hadzic
34
0
0
08 Mar 2024
SDPL: Shifting-Dense Partition Learning for UAV-View Geo-Localization
Quan Chen
Tingyu Wang
Zihao Yang
Haoran Li
Rongfeng Lu
Yaoqi Sun
Bolun Zheng
Chenggang Yan
42
14
0
07 Mar 2024
xT: Nested Tokenization for Larger Context in Large Images
Ritwik Gupta
Shufan Li
Tyler Lixuan Zhu
Jitendra Malik
Trevor Darrell
K. Mangalam
ViT
45
4
0
04 Mar 2024
SeD: Semantic-Aware Discriminator for Image Super-Resolution
Bingchen Li
Xin Li
Hanxin Zhu
Yeying Jin
Ruoyu Feng
Zhizheng Zhang
Zhibo Chen
SupR
48
22
0
29 Feb 2024
CAMixerSR: Only Details Need More "Attention"
Yan Wang
Yi Liu
Shijie Zhao
Junlin Li
Li Zhang
SupR
52
17
0
29 Feb 2024
Effective Message Hiding with Order-Preserving Mechanisms
Yu Gao
Xuchong Qiu
Zihan Ye
35
0
0
29 Feb 2024
Mixer is more than just a model
Qingfeng Ji
Yuxin Wang
Letong Sun
43
0
0
28 Feb 2024
State Space Models for Event Cameras
Nikola Zubić
Mathias Gehrig
Davide Scaramuzza
65
40
0
23 Feb 2024
YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
Chien-Yao Wang
I-Hau Yeh
Hongpeng Liao
62
1,169
0
21 Feb 2024
TransGOP: Transformer-Based Gaze Object Prediction
Binglu Wang
Chenxi Guo
Yang Jin
Haisheng Xia
Nian Liu
41
4
0
21 Feb 2024
LangXAI: Integrating Large Vision Models for Generating Textual Explanations to Enhance Explainability in Visual Perception Tasks
Truong Thanh Hung Nguyen
Tobias Clement
Phuc Truong Loc Nguyen
Nils Kemmerzell
Van Binh Truong
V. Nguyen
Mohamed Abdelaal
Hung Cao
VLM
44
8
0
19 Feb 2024
Stealing the Invisible: Unveiling Pre-Trained CNN Models through Adversarial Examples and Timing Side-Channels
Shubhi Shukla
Manaar Alam
Pabitra Mitra
Debdeep Mukhopadhyay
MLAU
AAML
42
1
0
19 Feb 2024
AYDIV: Adaptable Yielding 3D Object Detection via Integrated Contextual Vision Transformer
Tanmoy Dam
Sanjay Bhargav Dharavath
Sameer Alam
Nimrod Lilith
Supriyo Chakraborty
Mir Feroskhan
40
3
0
12 Feb 2024
Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation
Ziyang Wang
Jian-Qing Zheng
Yichi Zhang
Ge Cui
Lei Li
Mamba
40
129
0
07 Feb 2024
Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers
Abhimanyu Bambhaniya
Amir Yazdanbakhsh
Suvinay Subramanian
Sheng-Chun Kao
Shivani Agrawal
Utku Evci
Tushar Krishna
59
14
0
07 Feb 2024
Neural Networks Learn Statistics of Increasing Complexity
Nora Belrose
Quintin Pope
Lucia Quirke
Alex Troy Mallen
Xiaoli Z. Fern
20
11
0
06 Feb 2024
SISP: A Benchmark Dataset for Fine-grained Ship Instance Segmentation in Panchromatic Satellite Images
Pengming Feng
Mingjie Xie
Hongning Liu
Xuanjia Zhao
Guangjun He
Xueliang Zhang
Jian Guan
27
1
0
06 Feb 2024
CoFiNet: Unveiling Camouflaged Objects with Multi-Scale Finesse
Cunhan Guo
Heyan Huang
22
3
0
03 Feb 2024
Bass Accompaniment Generation via Latent Diffusion
Marco Pasini
M. Grachten
Stefan Lattner
59
11
0
02 Feb 2024
A Manifold Representation of the Key in Vision Transformers
Li Meng
Morten Goodwin
Anis Yazidi
P. Engelstad
29
0
0
01 Feb 2024
SimAda: A Simple Unified Framework for Adapting Segment Anything Model in Underperformed Scenes
Yiran Song
Qianyu Zhou
Xuequan Lu
Zhiwen Shao
Lizhuang Ma
53
1
0
31 Jan 2024
Category-wise Fine-Tuning: Resisting Incorrect Pseudo-Labels in Multi-Label Image Classification with Partial Labels
Chak Fong Chong
Xinyi Fang
Jielong Guo
Yapeng Wang
Wei Ke
C. Lam
Sio-Kei Im
33
1
0
30 Jan 2024
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
Bin Lin
Zhenyu Tang
Yang Ye
Jiaxi Cui
Bin Zhu
...
Jinfa Huang
Junwu Zhang
Yatian Pang
Munan Ning
Li-ming Yuan
VLM
MLLM
MoE
48
154
0
29 Jan 2024
VJT: A Video Transformer on Joint Tasks of Deblurring, Low-light Enhancement and Denoising
Yuxiang Hui
Yang Liu
Yaofang Liu
Fan Jia
Jinshan Pan
Raymond H. F. Chan
Tieyong Zeng
ViT
40
1
0
26 Jan 2024
CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process
Guan-Hong Chen
Yifan Shen
Zhenhao Chen
Xiangchen Song
Yuewen Sun
Weiran Yao
Xiao Liu
Kun Zhang
CML
34
7
0
25 Jan 2024
An open dataset for the evolution of oracle bone characters: EVOBC
Haisu Guan
Jinpeng Wan
Yuliang Liu
Pengjie Wang
Kaile Zhang
Zhebin Kuang
Xinyu Wang
Xiang Bai
Lianwen Jin
60
5
0
23 Jan 2024
AdaEmbed: Semi-supervised Domain Adaptation in the Embedding Space
A. Mottaghi
Mohammad Abdullah Jamal
Serena Yeung
Omid Mohareri
38
0
0
23 Jan 2024
OCT-SelfNet: A Self-Supervised Framework with Multi-Modal Datasets for Generalized and Robust Retinal Disease Detection
Fatema Jannat
Sina Gholami
Minha Alam
Hamed Tabkhi
33
1
0
22 Jan 2024
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
Katherine Crowson
Stefan Andreas Baumann
Alex Birch
Tanishq Mathew Abraham
Daniel Z. Kaplan
Enrico Shippole
34
49
0
21 Jan 2024
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Lihe Yang
Bingyi Kang
Zilong Huang
Xiaogang Xu
Jiashi Feng
Hengshuang Zhao
VLM
158
721
0
19 Jan 2024
AutoChunk: Automated Activation Chunk for Memory-Efficient Long Sequence Inference
Xuanlei Zhao
Shenggan Cheng
Guangyang Lu
Jiarui Fang
Hao Zhou
Bin Jia
Ziming Liu
Yang You
MQ
17
3
0
19 Jan 2024
Deep spatial context: when attention-based models meet spatial regression
Paulina Tomaszewska
Elżbieta Sienkiewicz
Mai P. Hoang
Przemysław Biecek
34
1
0
18 Jan 2024
Video Quality Assessment Based on Swin TransformerV2 and Coarse to Fine Strategy
Zihao Yu
Fengbin Guan
Yiting Lu
Xin Li
Zhibo Chen
ViT
37
6
0
16 Jan 2024
Transcending the Limit of Local Window: Advanced Super-Resolution Transformer with Adaptive Token Dictionary
Leheng Zhang
Yawei Li
Xingyu Zhou
Xiaorui Zhao
Shuhang Gu
SupR
29
27
0
16 Jan 2024
Discriminative Consensus Mining with A Thousand Groups for More Accurate Co-Salient Object Detection
Peng Zheng
39
0
0
15 Jan 2024
MapNeXt: Revisiting Training and Scaling Practices for Online Vectorized HD Map Construction
Toyota Li
26
6
0
14 Jan 2024
Transformer-CNN Fused Architecture for Enhanced Skin Lesion Segmentation
Siddharth Tiwari
MedIm
ViT
48
0
0
10 Jan 2024
Revisiting Adversarial Training at Scale
Zeyu Wang
Xianhang Li
Hongru Zhu
Cihang Xie
43
16
0
09 Jan 2024
GTA: Guided Transfer of Spatial Attention from Object-Centric Representations
SeokHyun Seo
Jinwoo Hong
Jungwoo Chae
Kyungyul Kim
Sangheum Hwang
40
0
0
05 Jan 2024
AG-ReID.v2: Bridging Aerial and Ground Views for Person Re-identification
Huy Nguyen
Kien Nguyen
Sridha Sridharan
Clinton Fookes
41
17
0
05 Jan 2024
Scaling and Masking: A New Paradigm of Data Sampling for Image and Video Quality Assessment
Yongxu Liu
Yinghui Quan
Guoyao Xiao
Aobo Li
Jinjian Wu
49
10
0
05 Jan 2024
BA-SAM: Scalable Bias-Mode Attention Mask for Segment Anything Model
Yiran Song
Qianyu Zhou
Hefei Ling
Deng-Ping Fan
Xuequan Lu
Lizhuang Ma
VLM
43
14
0
04 Jan 2024
Hybrid Pooling and Convolutional Network for Improving Accuracy and Training Convergence Speed in Object Detection
Shiwen Zhao
Wei Wang
Junhui Hou
Haihang Wu
ObjD
26
0
0
02 Jan 2024
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
Jiasen Lu
Christopher Clark
Sangho Lee
Zichen Zhang
Savya Khosla
Ryan Marten
Derek Hoiem
Aniruddha Kembhavi
VLM
MLLM
40
147
0
28 Dec 2023
TPTNet: A Data-Driven Temperature Prediction Model Based on Turbulent Potential Temperature
Jun Park
Changhoon Lee
33
1
0
22 Dec 2023
A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties
Junfei Xiao
Ziqi Zhou
Wenxuan Li
Shiyi Lan
Jieru Mei
Zhiding Yu
Alan Yuille
Yuyin Zhou
Cihang Xie
VLM
27
1
0
21 Dec 2023
Video Recognition in Portrait Mode
Mingfei Han
Linjie Yang
Xiaojie Jin
Jiashi Feng
Xiaojun Chang
Heng Wang
30
3
0
21 Dec 2023
Bootstrap Masked Visual Modeling via Hard Patches Mining
Haochen Wang
Junsong Fan
Yuxi Wang
Kaiyou Song
Tiancai Wang
Xiangyu Zhang
Zhaoxiang Zhang
47
5
0
21 Dec 2023
WeatherProof: A Paired-Dataset Approach to Semantic Segmentation in Adverse Weather
Blake Gella
Howard Zhang
Rishi Upadhyay
Tiffany Chang
Matthew Waliman
Yunhao Ba
Alex Wong
A. Kadambi
37
6
0
15 Dec 2023
EVP: Enhanced Visual Perception using Inverse Multi-Attentive Feature Refinement and Regularized Image-Text Alignment
M. Lavrenyuk
Shariq Farooq Bhat
Matthias Müller
Peter Wonka
ObjD
MDE
36
9
0
13 Dec 2023