2111.09883
Cited By
Swin Transformer V2: Scaling Up Capacity and Resolution
18 November 2021
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
Yixuan Wei
Jia Ning
Yue Cao
Zheng-Wei Zhang
Li Dong
Furu Wei
B. Guo
ViT
Papers citing "Swin Transformer V2: Scaling Up Capacity and Resolution"
50 of 823 papers shown
Title
The Curious Case of Benign Memorization
Sotiris Anagnostidis
Gregor Bachmann
Lorenzo Noci
Thomas Hofmann
AAML
49
8
0
25 Oct 2022
BARS: A Benchmark for Airport Runway Segmentation
Wenhui Chen
Zhijiang Zhang
Liang Yu
Yichun Tai
19
11
0
24 Oct 2022
S2WAT: Image Style Transfer via Hierarchical Vision Transformer using Strips Window Attention
Chi Zhang
Lu Zhou
Lei Wang
Zaiyan Dai
Jun Yang
ViT
34
23
0
22 Oct 2022
Accumulated Trivial Attention Matters in Vision Transformers on Small Datasets
Xiangyu Chen
Qinghao Hu
Kaidong Li
Cuncong Zhong
Guanghui Wang
ViT
38
11
0
22 Oct 2022
A Unified View of Masked Image Modeling
Zhiliang Peng
Li Dong
Hangbo Bao
QiXiang Ye
Furu Wei
VLM
54
35
0
19 Oct 2022
A Tri-Layer Plugin to Improve Occluded Detection
Guanqi Zhan
Weidi Xie
Andrew Zisserman
24
20
0
18 Oct 2022
Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for Text-to-Image Generation
Rui Li
Weihua Li
Yi Yang
Hanyu Wei
Jianhua Jiang
Quan-wei Bai
DiffM
27
11
0
18 Oct 2022
Token Merging: Your ViT But Faster
Daniel Bolya
Cheng-Yang Fu
Xiaoliang Dai
Peizhao Zhang
Christoph Feichtenhofer
Judy Hoffman
MoMe
51
422
0
17 Oct 2022
2nd Place Solution to Google Universal Image Embedding
Xiaolong Huang
Qiankun Li
SSL
32
2
0
17 Oct 2022
Probabilistic Integration of Object Level Annotations in Chest X-ray Classification
Tom van Sonsbeek
Xiantong Zhen
Dwarikanath Mahapatra
M. Worring
31
12
0
13 Oct 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Brian Bartoldson
B. Kailkhura
Davis W. Blalock
31
47
0
13 Oct 2022
S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces
Eric N. D. Nguyen
Karan Goel
Albert Gu
Gordon W. Downs
Preey Shah
Tri Dao
S. Baccus
Christopher Ré
VLM
22
39
0
12 Oct 2022
How Much Data Are Augmentations Worth? An Investigation into Scaling Laws, Invariance, and Implicit Regularization
Jonas Geiping
Micah Goldblum
Gowthami Somepalli
Ravid Shwartz-Ziv
Tom Goldstein
A. Wilson
26
35
0
12 Oct 2022
Match Cutting: Finding Cuts with Smooth Visual Transitions
Boris Chen
Amir Ziai
Rebecca Tucker
Yuchen Xie
VGen
28
14
0
11 Oct 2022
Curved Representation Space of Vision Transformers
Juyeop Kim
Junha Park
Songkuk Kim
Jongseok Lee
ViT
38
6
0
11 Oct 2022
Rethinking the Detection Head Configuration for Traffic Object Detection
Yi Shi
Jiang Wu
Shixuan Zhao
Gangyao Gao
T. Deng
Hongmei Yan
ObjD
24
5
0
08 Oct 2022
Humans need not label more humans: Occlusion Copy & Paste for Occluded Human Instance Segmentation
Evan Ling
De-Kai Huang
Minhoe Hur
27
5
0
07 Oct 2022
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Chenglin Yang
Siyuan Qiao
Qihang Yu
Xiaoding Yuan
Yukun Zhu
Alan Yuille
Hartwig Adam
Liang-Chieh Chen
ViT
MoE
39
59
0
04 Oct 2022
Dual-former: Hybrid Self-attention Transformer for Efficient Image Restoration
Sixiang Chen
Tian-Chun Ye
Yun-Peng Liu
Erkang Chen
ViT
34
15
0
03 Oct 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Weicong Liang
Yuhui Yuan
Henghui Ding
Xiao Luo
Weihong Lin
Ding Jia
Zheng-Wei Zhang
Chao Zhang
Hanhua Hu
35
25
0
03 Oct 2022
Dilated Neighborhood Attention Transformer
Ali Hassani
Humphrey Shi
ViT
MedIm
33
68
0
29 Sep 2022
Transfer Learning with Pretrained Remote Sensing Transformers
A. Fuller
K. Millard
J.R. Green
33
11
0
28 Sep 2022
Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration
Marcos V. Conde
Ui-Jin Choi
Maxime Burchi
Radu Timofte
ViT
59
135
0
22 Sep 2022
VINet: Visual and Inertial-based Terrain Classification and Adaptive Navigation over Unknown Terrain
Tianrui Guan
Ruitao Song
Zhixian Ye
Liangjun Zhang
48
10
0
16 Sep 2022
Communication-Efficient and Privacy-Preserving Feature-based Federated Transfer Learning
Feng Wang
M. C. Gursoy
Senem Velipasalar
19
2
0
12 Sep 2022
LRT: An Efficient Low-Light Restoration Transformer for Dark Light Field Images
Shansi Zhang
Nan Meng
E. Lam
ViT
47
20
0
06 Sep 2022
A Review of Sparse Expert Models in Deep Learning
W. Fedus
J. Dean
Barret Zoph
MoE
20
144
0
04 Sep 2022
AutoPET Challenge: Combining nn-Unet with Swin UNETR Augmented by Maximum Intensity Projection Classifier
Lars Heiliger
Zdravko Marinov
Max Hasin
André Ferreira
Jana Fragemann
...
D. Kersting
Victor Alves
Rainer Stiefelhagen
Jan Egger
Jens Kleesiek
27
9
0
02 Sep 2022
AIM 2022 Challenge on Super-Resolution of Compressed Image and Video: Dataset, Methods and Results
Ren Yang
Radu Timofte
Xin Li
Qi Zhang
Lin Zhang
...
Yijian Zhang
Mao Ye
Dengyan Luo
Xiaofeng Pan
L. Peng
SupR
53
30
0
23 Aug 2022
Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks
Wenhui Wang
Hangbo Bao
Li Dong
Johan Bjorck
Zhiliang Peng
...
Kriti Aggarwal
O. Mohammed
Saksham Singhal
Subhojit Som
Furu Wei
MLLM
VLM
ViT
54
629
0
22 Aug 2022
Conv-Adapter: Exploring Parameter Efficient Transfer Learning for ConvNets
Hao Chen
R. Tao
Han Zhang
Yidong Wang
Xiang Li
Weirong Ye
Jindong Wang
Guosheng Hu
Marios Savvides
VPVLM
32
53
0
15 Aug 2022
BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers
Zhiliang Peng
Li Dong
Hangbo Bao
QiXiang Ye
Furu Wei
29
306
0
12 Aug 2022
Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model
Di Wang
Qiming Zhang
Yufei Xu
Jing Zhang
Bo Du
Dacheng Tao
Lefei Zhang
36
242
0
08 Aug 2022
P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting
Ziyi Wang
Xumin Yu
Yongming Rao
Jie Zhou
Jiwen Lu
VPVLM
VLM
24
75
0
04 Aug 2022
Unified Normalization for Accelerating and Stabilizing Transformers
Qiming Yang
Kai Zhang
Chaoxiang Lan
Zhi Yang
Zheyang Li
Wenming Tan
Jun Xiao
Shiliang Pu
17
8
0
02 Aug 2022
giMLPs: Gate with Inhibition Mechanism in MLPs
Cheng Kang
Jindřich Prokop
Lei Tong
Huiyu Zhou
Yong Hu
Daniel Novak
29
0
0
01 Aug 2022
HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
Yongming Rao
Wenliang Zhao
Yansong Tang
Jie Zhou
Ser-Nam Lim
Jiwen Lu
ViT
22
251
0
28 Jul 2022
Visual Recognition by Request
Chufeng Tang
Lingxi Xie
Xiaopeng Zhang
Xiaolin Hu
Qi Tian
VLM
16
15
0
28 Jul 2022
PEA: Improving the Performance of ReLU Networks for Free by Using Progressive Ensemble Activations
Á. Utasi
35
0
0
28 Jul 2022
Multi-Forgery Detection Challenge 2022: Push the Frontier of Unconstrained and Diverse Forgery Detection
Jianshu Li
Man Luo
Jian Liu
Tao Chen
Chengjie Wang
...
Bo Liu
Mingyu Guo
Ying Guo
Y. Ao
Pengfei Gao
19
0
0
27 Jul 2022
Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment
Qiang Chen
Xiaokang Chen
Jian Wang
Shan Zhang
Kun Yao
Haocheng Feng
Junyu Han
Errui Ding
Gang Zeng
Jingdong Wang
ViT
49
120
0
26 Jul 2022
DETRs with Hybrid Matching
Ding Jia
Yuhui Yuan
Hao He
Xiao-pei Wu
Haojun Yu
Weihong Lin
Lei-huan Sun
Chao Zhang
Hanhua Hu
26
182
0
26 Jul 2022
Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering
Yang Liu
Guanbin Li
Liang Lin
LRM
36
80
0
26 Jul 2022
Dive into Big Model Training
Qinghua Liu
Yuxiang Jiang
MoMe
AI4CE
LRM
21
3
0
25 Jul 2022
Applying Spatiotemporal Attention to Identify Distracted and Drowsy Driving with Vision Transformers
Samay Lakhani
ViT
MedIm
19
1
0
22 Jul 2022
Efficient Graph-Friendly COCO Metric Computation for Train-Time Model Evaluation
Luke Wood
François Chollet
19
7
0
21 Jul 2022
TinyViT: Fast Pretraining Distillation for Small Vision Transformers
Kan Wu
Jinnian Zhang
Houwen Peng
Mengchen Liu
Bin Xiao
Jianlong Fu
Lu Yuan
ViT
21
246
0
21 Jul 2022
Vision Transformers: From Semantic Segmentation to Dense Prediction
Li Zhang
Jiachen Lu
Sixiao Zheng
Xinxuan Zhao
Xiatian Zhu
Yanwei Fu
Tao Xiang
Jianfeng Feng
Philip H. S. Torr
ViT
27
7
0
19 Jul 2022
Towards Trustworthy Healthcare AI: Attention-Based Feature Learning for COVID-19 Screening With Chest Radiography
Kai Ma
Pengcheng Xi
K. Habashy
Ashkan Ebadi
Stéphane Tremblay
Alexander Wong
ViT
MedIm
23
1
0
19 Jul 2022
MonoIndoor++: Towards Better Practice of Self-Supervised Monocular Depth Estimation for Indoor Environments
Runze Li
Pan Ji
Yi Tian Xu
B. Bhanu
MDE
21
22
0
18 Jul 2022