CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows

1 July 2021
Xiaoyi Dong
Jianmin Bao
Dongdong Chen
Weiming Zhang
Nenghai Yu
Lu Yuan
Dong Chen
B. Guo
    ViT
ArXiv (abs) | PDF | HTML | GitHub (569★)

Papers citing "CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows"

40 / 440 papers shown
Pyramid Fusion Transformer for Semantic Segmentation
Zipeng Qin
Jianbo Liu
Xiaoling Zhang
Maoqing Tian
Aojun Zhou
Shuai Yi
Hongsheng Li
ViT
86
16
0
11 Jan 2022
QuadTree Attention for Vision Transformers
Shitao Tang
Jiahui Zhang
Siyu Zhu
Ping Tan
ViT
228
164
0
08 Jan 2022
Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
S. Song
Li Erran Li
Gao Huang
ViT
130
495
0
03 Jan 2022
Pale Transformer: A General Vision Transformer Backbone with Pale-Shaped Attention
Sitong Wu
Tianyi Wu
Haoru Tan
G. Guo
ViT
98
71
0
28 Dec 2021
Augmenting Convolutional networks with attention-based aggregation
Hugo Touvron
Matthieu Cord
Alaaeldin El-Nouby
Piotr Bojanowski
Armand Joulin
Gabriel Synnaeve
Hervé Jégou
ViT
119
49
0
27 Dec 2021
ELSA: Enhanced Local Self-Attention for Vision Transformer
Jingkai Zhou
Pichao Wang
Fan Wang
Qiong Liu
Hao Li
Rong Jin
ViT
117
41
0
23 Dec 2021
SeMask: Semantically Masked Transformers for Semantic Segmentation
Jitesh Jain
Anukriti Singh
Nikita Orlov
Zilong Huang
Jiachen Li
Steven Walton
Humphrey Shi
ViT
91
98
0
23 Dec 2021
MIA-Former: Efficient and Robust Vision Transformers via Multi-grained Input-Adaptation
Zhongzhi Yu
Y. Fu
Sicheng Li
Chaojian Li
Yingyan Lin
ViT
80
19
0
21 Dec 2021
StyleSwin: Transformer-based GAN for High-resolution Image Generation
Bowen Zhang
Shuyang Gu
Bo Zhang
Jianmin Bao
Dong Chen
Fang Wen
Yong Wang
B. Guo
ViT
92
233
0
20 Dec 2021
On Efficient Transformer-Based Image Pre-training for Low-Level Vision
Wenbo Li
Xin Lu
Shengju Qian
Jiangbo Lu
Xinming Zhang
Jiaya Jia
ViT
140
88
0
19 Dec 2021
BEVT: BERT Pretraining of Video Transformers
Rui Wang
Dongdong Chen
Zuxuan Wu
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Yu-Gang Jiang
Luowei Zhou
Lu Yuan
ViT
115
209
0
02 Dec 2021
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chao-Yuan Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
172
702
0
02 Dec 2021
On the Integration of Self-Attention and Convolution
Xuran Pan
Chunjiang Ge
Rui Lu
S. Song
Guanfu Chen
Zeyi Huang
Gao Huang
SSL
139
308
0
29 Nov 2021
Global Interaction Modelling in Vision Transformer via Super Tokens
Ammarah Farooq
Muhammad Awais
S. Ahmed
J. Kittler
ViT
59
7
0
25 Nov 2021
PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
Baining Guo
ViT
153
246
0
24 Nov 2021
Self-slimmed Vision Transformer
Zhuofan Zong
Kunchang Li
Guanglu Song
Yali Wang
Yu Qiao
B. Leng
Yu Liu
ViT
108
32
0
24 Nov 2021
MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning
David Junhao Zhang
Kunchang Li
Yali Wang
Yuxiang Chen
Shashwat Chandra
Yu Qiao
Luoqi Liu
Mike Zheng Shou
AI4TS
100
30
0
24 Nov 2021
Florence: A New Foundation Model for Computer Vision
Lu Yuan
Dongdong Chen
Yi-Ling Chen
Noel Codella
Xiyang Dai
...
Zhen Xiao
Jianwei Yang
Michael Zeng
Luowei Zhou
Pengchuan Zhang
VLM
202
907
0
22 Nov 2021
TransMorph: Transformer for unsupervised medical image registration
Junyu Chen
Eric C. Frey
Yufan He
W. Paul Segars
Ye Li
Yong Du
ViT, MedIm
223
336
0
19 Nov 2021
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
ViT
301
1,843
0
18 Nov 2021
INTERN: A New Learning Paradigm Towards General Vision
Jing Shao
Siyu Chen
Yangguang Li
Kun Wang
Zhen-fei Yin
...
F. Yu
Junjie Yan
Dahua Lin
Xiaogang Wang
Yu Qiao
110
34
0
16 Nov 2021
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Hai-Tao Zheng
Li Tao
Dun Liang
Haitao Zheng
217
100
0
07 Nov 2021
Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation
Jiaqi Gu
Hyoukjun Kwon
Dilin Wang
Wei Ye
Meng Li
Yu-Hsin Chen
Liangzhen Lai
Vikas Chandra
David Z. Pan
ViT
69
188
0
01 Nov 2021
M2MRF: Many-to-Many Reassembly of Features for Tiny Lesion Segmentation in Fundus Images
Qing Liu
Haotian Liu
Wei Ke
Yixiong Liang
57
6
0
30 Oct 2021
2nd Place Solution to Google Landmark Recognition Competition 2021
Shubin Dai
3DV, ViT
52
4
0
06 Oct 2021
Ripple Attention for Visual Perception with Sub-quadratic Complexity
Lin Zheng
Huijie Pan
Lingpeng Kong
72
3
0
06 Oct 2021
Deep Instance Segmentation with Automotive Radar Detection Points
Tao Huang
Weiyi Xiong
Liping Bai
Yu Xia
Wei Chen
Wanli Ouyang
Bing Zhu
173
55
0
05 Oct 2021
Long-Range Transformers for Dynamic Spatiotemporal Forecasting
J. E. Grigsby
Zhe Wang
Nam Nguyen
Yanjun Qi
AI4TS
121
95
0
24 Sep 2021
Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?
Chuanxin Tang
Yucheng Zhao
Guangting Wang
Chong Luo
Wenxuan Xie
Wenjun Zeng
MoE, ViT
86
101
0
12 Sep 2021
Exploiting Spatial-Temporal Semantic Consistency for Video Scene Parsing
Xingjian He
Weining Wang
Zhiyong Xu
Hao Wang
Jie Jiang
Jing Liu
63
2
0
06 Sep 2021
Trans4Trans: Efficient Transformer for Transparent Object and Semantic Scene Segmentation in Real-World Navigation Assistance
Jiaming Zhang
Kailun Yang
Angela Constantinescu
Kunyu Peng
Karin Muller
Rainer Stiefelhagen
ViT
88
69
0
20 Aug 2021
Mobile-Former: Bridging MobileNet and Transformer
Yinpeng Chen
Xiyang Dai
Dongdong Chen
Mengchen Liu
Xiaoyi Dong
Lu Yuan
Zicheng Liu
ViT
287
494
0
12 Aug 2021
S$^2$-MLPv2: Improved Spatial-Shift MLP Architecture for Vision
Tan Yu
Xu Li
Yunfeng Cai
Mingming Sun
Ping Li
80
54
0
02 Aug 2021
More than Encoder: Introducing Transformer Decoder to Upsample
Yijiang Li
Wentian Cai
Ying Gao
Chengming Li
Xiping Hu
ViT, MedIm
82
55
0
20 Jun 2021
Uformer: A General U-Shaped Transformer for Image Restoration
Zhendong Wang
Xiaodong Cun
Jianmin Bao
Wengang Zhou
Jianzhuang Liu
Houqiang Li
ViT
222
1,438
0
06 Jun 2021
MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens
Jiemin Fang
Lingxi Xie
Xinggang Wang
Xiaopeng Zhang
Wenyu Liu
Qi Tian
ViT
73
78
0
31 May 2021
Visformer: The Vision-friendly Transformer
Zhengsu Chen
Lingxi Xie
Jianwei Niu
Xuefeng Liu
Longhui Wei
Qi Tian
ViT
215
223
0
26 Apr 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
418
2,570
0
04 Jan 2021
A Survey on Visual Transformer
Kai Han
Yunhe Wang
Hanting Chen
Xinghao Chen
Jianyuan Guo
...
Chunjing Xu
Yixing Xu
Zhaohui Yang
Yiman Zhang
Dacheng Tao
ViT
237
2,294
0
23 Dec 2020
Xception: Deep Learning with Depthwise Separable Convolutions
François Chollet
MDE, BDL, PINN
1.6K
14,698
0
07 Oct 2016