ResearchTrend.AI

CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows

1 July 2021
Xiaoyi Dong
Jianmin Bao
Dongdong Chen
Weiming Zhang
Nenghai Yu
Lu Yuan
Dong Chen
B. Guo
    ViT
ArXiv (abs) | PDF | HTML | GitHub (569★)

Papers citing "CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows"

50 / 440 papers shown
Title
Window Attention is Bugged: How not to Interpolate Position Embeddings
Daniel Bolya
Chaitanya K. Ryali
Judy Hoffman
Christoph Feichtenhofer
94
11
0
09 Nov 2023
GPT-ST: Generative Pre-Training of Spatio-Temporal Graph Neural Networks
Zhonghang Li
Lianghao Xia
Yong-mei Xu
Chao Huang
AI4TS AI4CE
128
28
0
07 Nov 2023
GTP-ViT: Efficient Vision Transformers via Graph-based Token Propagation
Xuwei Xu
Sen Wang
Yudong Chen
Yanping Zheng
Zhewei Wei
Jiajun Liu
ViT
106
12
0
06 Nov 2023
Scattering Vision Transformer: Spectral Mixing Matters
Badri N. Patro
Vijay Srinivas Agneeswaran
114
15
0
02 Nov 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Chuan Wu
Yizhou Yu
ViT
170
41
0
30 Oct 2023
Gramian Attention Heads are Strong yet Efficient Vision Learners
Jongbin Ryu
Dongyoon Han
J. Lim
102
2
0
25 Oct 2023
Heuristic Vision Pre-Training with Self-Supervised and Supervised Multi-Task Learning
Zhiming Qian
VLM SSL
56
0
0
11 Oct 2023
Plug n' Play: Channel Shuffle Module for Enhancing Tiny Vision Transformers
Xuwei Xu
Sen Wang
Yudong Chen
Jiajun Liu
ViT
59
1
0
09 Oct 2023
Hierarchical Side-Tuning for Vision Transformers
Weifeng Lin
Ziheng Wu
Wentao Yang
Mingxin Huang
Jun Huang
Lianwen Jin
121
8
0
09 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
233
3
0
08 Oct 2023
TiC: Exploring Vision Transformer in Convolution
Song Zhang
Qingzhong Wang
Jiang Bian
Haoyi Xiong
ViT
50
1
0
06 Oct 2023
A Complementary Global and Local Knowledge Network for Ultrasound denoising with Fine-grained Refinement
Zhenyu Bu
Kaini Wang
Fuxing Zhao
Shengxiao Li
Guangquan Zhou
39
0
0
05 Oct 2023
Multiple Physics Pretraining for Physical Surrogate Models
Michael McCabe
Bruno Régaldo-Saint Blancard
Liam Parker
Ruben Ohana
M. Cranmer
...
Francois Lanusse
Mariel Pettee
Tiberiu Teşileanu
Kyunghyun Cho
Shirley Ho
PINN AI4CE
110
56
0
04 Oct 2023
TransRadar: Adaptive-Directional Transformer for Real-Time Multi-View Radar Semantic Segmentation
Yahia Dalbah
Jean Lahoud
Hisham Cholakkal
77
9
0
03 Oct 2023
When Epipolar Constraint Meets Non-local Operators in Multi-View Stereo
Tianqi Liu
Xinyi Ye
Weiyue Zhao
Zhiyu Pan
Min Shi
Zhiguo Cao
83
14
0
29 Sep 2023
CAIT: Triple-Win Compression towards High Accuracy, Fast Inference, and Favorable Transferability For ViTs
Ao Wang
Hui Chen
Zijia Lin
Sicheng Zhao
Jiawei Han
Guiguang Ding
ViT
58
6
0
27 Sep 2023
UniHead: Unifying Multi-Perception for Detection Heads
Hantao Zhou
Rui Yang
Yachao Zhang
Haoran Duan
Yawen Huang
R. Hu
Xiu Li
Yefeng Zheng
105
13
0
23 Sep 2023
DualToken-ViT: Position-aware Efficient Vision Transformer with Dual Token Fusion
Zhenzhen Chu
Jiayu Chen
Cen Chen
Chengyu Wang
Ziheng Wu
Jun Huang
Weining Qian
ViT
60
3
0
21 Sep 2023
RMT: Retentive Networks Meet Vision Transformers
Qihang Fan
Huaibo Huang
Mingrui Chen
Hongmin Liu
Ran He
ViT
168
91
0
20 Sep 2023
Multi-Context Dual Hyper-Prior Neural Image Compression
Atefeh Khoshkhahtinat
Ali Zafari
P. Mehta
Mohammad Akyash
Hossein Kashiani
Nasser M. Nasrabadi
78
6
0
19 Sep 2023
SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels
Henry Hengyuan Zhao
Pichao Wang
Yuyang Zhao
Hao Luo
F. Wang
Mike Zheng Shou
ViT
115
14
0
15 Sep 2023
Dataset Condensation via Generative Model
David Junhao Zhang
Heng Wang
Chuhui Xue
Rui Yan
Wenqing Zhang
Song Bai
Mike Zheng Shou
DD
62
13
0
14 Sep 2023
Hydra: Multi-head Low-rank Adaptation for Parameter Efficient Fine-tuning
Sanghyeon Kim
Hyunmo Yang
Younghyun Kim
Youngjoon Hong
Eunbyung Park
AI4CE
73
18
0
13 Sep 2023
HAT: Hybrid Attention Transformer for Image Restoration
Xiangyu Chen
Xintao Wang
Wenlong Zhang
Xiangtao Kong
Yu Qiao
Jiantao Zhou
Chao Dong
101
53
0
11 Sep 2023
Mask-Attention-Free Transformer for 3D Instance Segmentation
Xin Lai
Yuhui Yuan
Ruihang Chu
Yukang Chen
Han Hu
Jiaya Jia
MedIm ISeg 3DPC
100
31
0
04 Sep 2023
DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
Shiji Song
Li Erran Li
Gao Huang
ViT
93
27
0
04 Sep 2023
RevColV2: Exploring Disentangled Representations in Masked Image Modeling
Qi Han
Yuxuan Cai
Xiangyu Zhang
123
8
0
02 Sep 2023
MB-TaylorFormer: Multi-branch Efficient Transformer Expanded by Taylor Formula for Image Dehazing
Yuwei Qiu
Kaihao Zhang
Chenxi Wang
Wenhan Luo
Hongdong Li
Zhi Jin
ViT
80
104
0
27 Aug 2023
Computation-efficient Deep Learning for Computer Vision: A Survey
Yulin Wang
Yizeng Han
Chaofei Wang
Shiji Song
Qi Tian
Gao Huang
VLM
134
21
0
27 Aug 2023
Vision Transformer Adapters for Generalizable Multitask Learning
Deblina Bhattacharjee
Sabine Süsstrunk
Mathieu Salzmann
ViT
91
8
0
23 Aug 2023
SG-Former: Self-guided Transformer with Evolving Token Reallocation
Sucheng Ren
Xingyi Yang
Songhua Liu
Xinchao Wang
ViT
88
43
0
23 Aug 2023
Long-Range Grouping Transformer for Multi-View 3D Reconstruction
Liying Yang
Zhenwei Zhu
Xuxin Lin
Jian Nong
Yanyan Liang
ViT
73
7
0
17 Aug 2023
SCSC: Spatial Cross-scale Convolution Module to Strengthen both CNNs and Transformers
Xijun Wang
Xiaojie Chu
Chunrui Han
Xiangyu Zhang
ViT
65
1
0
14 Aug 2023
Revisiting Vision Transformer from the View of Path Ensemble
Shuning Chang
Pichao Wang
Haowen Luo
Fan Wang
Mike Zheng Shou
ViT
66
3
0
12 Aug 2023
Vision Backbone Enhancement via Multi-Stage Cross-Scale Attention
Liang Shang
Yanli Liu
Zhengyang Lou
Shuxue Quan
N. Adluru
Bochen Guan
W. Sethares
106
2
0
10 Aug 2023
Deformable Mixer Transformer with Gating for Multi-Task Learning of Dense Prediction
Yangyang Xu
Yibo Yang
Bernard Ghanem
Lefei Zhang
Du Bo
Dacheng Tao
110
1
0
10 Aug 2023
PVG: Progressive Vision Graph for Vision Recognition
Jiafu Wu
Jian Li
Jiangning Zhang
Boshen Zhang
M. Chi
Yabiao Wang
Chengjie Wang
ViT
124
15
0
01 Aug 2023
FLatten Transformer: Vision Transformer using Focused Linear Attention
Dongchen Han
Xuran Pan
Yizeng Han
Shiji Song
Gao Huang
109
181
0
01 Aug 2023
A survey on deep learning in medical image registration: new technologies, uncertainty, evaluation metrics, and beyond
Junyu Chen
Yihao Liu
Shuwen Wei
Zhangxing Bian
Shalini Subramanian
A. Carass
Jerry L. Prince
Yong Du
OOD
119
46
0
28 Jul 2023
Adaptive Segmentation Network for Scene Text Detection
Gui-yan Zhao
SSeg
68
1
0
27 Jul 2023
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Muhammad Awais
Muzammal Naseer
Salman Khan
Rao Muhammad Anwer
Hisham Cholakkal
M. Shah
Ming-Hsuan Yang
Fahad Shahbaz Khan
VLM
148
128
0
25 Jul 2023
A Good Student is Cooperative and Reliable: CNN-Transformer Collaborative Learning for Semantic Segmentation
Jinjing Zhu
Yuan Luo
Xueye Zheng
Hao Wang
Lin Wang
65
35
0
24 Jul 2023
As large as it gets: Learning infinitely large Filters via Neural Implicit Functions in the Fourier Domain
Julia Grabinski
J. Keuper
Margret Keuper
59
7
0
19 Jul 2023
RepViT: Revisiting Mobile CNN From ViT Perspective
Ao Wang
Hui Chen
Zijia Lin
Hengjun Pu
Guiguang Ding
105
219
0
18 Jul 2023
Vision Language Transformers: A Survey
Clayton Fields
C. Kennington
VLM
63
5
0
06 Jul 2023
Art Authentication with Vision Transformers
Ludovica Schaerf
Carina Popovici
Eric Postma
ViT
67
11
0
06 Jul 2023
MSViT: Dynamic Mixed-Scale Tokenization for Vision Transformers
Jakob Drachmann Havtorn
Amelie Royer
Tijmen Blankevoort
B. Bejnordi
83
8
0
05 Jul 2023
X-MLP: A Patch Embedding-Free MLP Architecture for Vision
Xinyue Wang
Zhicheng Cai
Chenglei Peng
ViT
95
5
0
02 Jul 2023
TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter
Binjie Zhang
Yixiao Ge
Xuyuan Xu
Ying Shan
Mike Zheng Shou
97
8
0
22 Jun 2023
Reviving Shift Equivariance in Vision Transformers
Peijian Ding
Davit Soselia
Thomas Armstrong
Jiahao Su
Furong Huang
96
7
0
13 Jun 2023