ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2107.00652
  4. Cited By
CSWin Transformer: A General Vision Transformer Backbone with
  Cross-Shaped Windows
v1v2v3 (latest)

CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows

1 July 2021
Xiaoyi Dong
Jianmin Bao
Dongdong Chen
Weiming Zhang
Nenghai Yu
Lu Yuan
Dong Chen
B. Guo
    ViT
ArXiv (abs)PDFHTMLGithub (569★)

Papers citing "CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows"

50 / 440 papers shown
Title
AdaptFormer: Adapting Vision Transformers for Scalable Visual
  Recognition
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Shoufa Chen
Chongjian Ge
Zhan Tong
Jiangliu Wang
Yibing Song
Jue Wang
Ping Luo
255
705
0
26 May 2022
Fast Vision Transformers with HiLo Attention
Fast Vision Transformers with HiLo Attention
Zizheng Pan
Jianfei Cai
Bohan Zhuang
67
168
0
26 May 2022
Inception Transformer
Inception Transformer
Chenyang Si
Weihao Yu
Pan Zhou
Yichen Zhou
Xinchao Wang
Shuicheng Yan
ViT
124
201
0
25 May 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision
  Transformers with Locality
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li
Wenhai Wang
Lingfeng Yang
Jian Yang
185
76
0
20 May 2022
TRT-ViT: TensorRT-oriented Vision Transformer
TRT-ViT: TensorRT-oriented Vision Transformer
Xin Xia
Jiashi Li
Jie Wu
Xing Wang
Xuefeng Xiao
Min Zheng
Rui Wang
ViT
64
28
0
19 May 2022
Vision Transformer Adapter for Dense Predictions
Vision Transformer Adapter for Dense Predictions
Zhe Chen
Yuchen Duan
Wenhai Wang
Junjun He
Tong Lu
Jifeng Dai
Yu Qiao
182
572
0
17 May 2022
Video Frame Interpolation with Transformer
Video Frame Interpolation with Transformer
Liying Lu
Ruizheng Wu
Huaijia Lin
Jiangbo Lu
Jiaya Jia
ViT
91
4
0
15 May 2022
Transformer Scale Gate for Semantic Segmentation
Transformer Scale Gate for Semantic Segmentation
Hengcan Shi
Munawar Hayat
Jianfei Cai
ViT
92
24
0
14 May 2022
Reduce Information Loss in Transformers for Pluralistic Image Inpainting
Reduce Information Loss in Transformers for Pluralistic Image Inpainting
Qiankun Liu
Zhentao Tan
Dongdong Chen
Qi Chu
Xiyang Dai
Yinpeng Chen
Mengchen Liu
Lu Yuan
Nenghai Yu
ViT
87
70
0
10 May 2022
Activating More Pixels in Image Super-Resolution Transformer
Activating More Pixels in Image Super-Resolution Transformer
Xiangyu Chen
Xintao Wang
Jiantao Zhou
Yu Qiao
Chao Dong
ViT
177
647
0
09 May 2022
Sequencer: Deep LSTM for Image Classification
Sequencer: Deep LSTM for Image Classification
Yuki Tatsunami
Masato Taki
VLMViT
80
82
0
04 May 2022
Coarse-to-Fine Video Denoising with Dual-Stage Spatial-Channel
  Transformer
Coarse-to-Fine Video Denoising with Dual-Stage Spatial-Channel Transformer
Wu Yun
Mengshi Qi
Chuanming Wang
Huiyuan Fu
Huadong Ma
ViT
90
6
0
30 Apr 2022
Deeper Insights into the Robustness of ViTs towards Common Corruptions
Deeper Insights into the Robustness of ViTs towards Common Corruptions
Rui Tian
Zuxuan Wu
Qi Dai
Han Hu
Yu-Gang Jiang
ViTAAML
104
6
0
26 Apr 2022
Residual Mixture of Experts
Residual Mixture of Experts
Lemeng Wu
Mengchen Liu
Yinpeng Chen
Dongdong Chen
Xiyang Dai
Lu Yuan
MoE
117
37
0
20 Apr 2022
VSA: Learning Varied-Size Window Attention in Vision Transformers
VSA: Learning Varied-Size Window Attention in Vision Transformers
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
100
57
0
18 Apr 2022
ResT V2: Simpler, Faster and Stronger
ResT V2: Simpler, Faster and Stronger
Qing-Long Zhang
Yubin Yang
ViT
68
26
0
15 Apr 2022
DeiT III: Revenge of the ViT
DeiT III: Revenge of the ViT
Hugo Touvron
Matthieu Cord
Hervé Jégou
ViT
134
418
0
14 Apr 2022
3D Shuffle-Mixer: An Efficient Context-Aware Vision Learner of
  Transformer-MLP Paradigm for Dense Prediction in Medical Volume
3D Shuffle-Mixer: An Efficient Context-Aware Vision Learner of Transformer-MLP Paradigm for Dense Prediction in Medical Volume
Jianye Pang
Cheng Jiang
Yihao Chen
Jianbo Chang
M. Feng
Renzhi Wang
Jianhua Yao
ViTMedIm
55
11
0
14 Apr 2022
DaViT: Dual Attention Vision Transformers
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
171
255
0
07 Apr 2022
Towards An End-to-End Framework for Flow-Guided Video Inpainting
Towards An End-to-End Framework for Flow-Guided Video Inpainting
Zerui Li
Cheng Lu
Jia Qin
Chunle Guo
Mingg-Ming Cheng
108
153
0
06 Apr 2022
MixFormer: Mixing Features across Windows and Dimensions
MixFormer: Mixing Features across Windows and Dimensions
Qiang Chen
Qiman Wu
Jian Wang
Qinghao Hu
T. Hu
Errui Ding
Jian Cheng
Jingdong Wang
MDEViT
88
109
0
06 Apr 2022
MaxViT: Multi-Axis Vision Transformer
MaxViT: Multi-Axis Vision Transformer
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
ViT
165
676
0
04 Apr 2022
Dynamic Focus-aware Positional Queries for Semantic Segmentation
Dynamic Focus-aware Positional Queries for Semantic Segmentation
Haoyu He
Jianfei Cai
Zizheng Pan
Jing Liu
Jing Zhang
Dacheng Tao
Bohan Zhuang
86
17
0
04 Apr 2022
Bringing Old Films Back to Life
Bringing Old Films Back to Life
Bo Liu
Bo Zhang
Dongdong Chen
Jing Liao
ViTVGen
69
43
0
31 Mar 2022
Deformable Video Transformer
Deformable Video Transformer
Jue Wang
Lorenzo Torresani
ViT
98
28
0
31 Mar 2022
MatteFormer: Transformer-Based Image Matting via Prior-Tokens
MatteFormer: Transformer-Based Image Matting via Prior-Tokens
Gyutae Park
S. Son
Jaeyoung Yoo
Seho Kim
Nojun Kwak
ViT
103
66
0
29 Mar 2022
SepViT: Separable Vision Transformer
SepViT: Separable Vision Transformer
Wei Li
Xing Wang
Xin Xia
Jie Wu
Jiashi Li
Xuefeng Xiao
Min Zheng
Shiping Wen
ViT
113
42
0
29 Mar 2022
Parameter-efficient Model Adaptation for Vision Transformers
Parameter-efficient Model Adaptation for Vision Transformers
Xuehai He
Chunyuan Li
Pengchuan Zhang
Jianwei Yang
Xinze Wang
79
91
0
29 Mar 2022
Stratified Transformer for 3D Point Cloud Segmentation
Stratified Transformer for 3D Point Cloud Segmentation
Xin Lai
Jianhui Liu
Li Jiang
Liwei Wang
Hengshuang Zhao
Shu Liu
Xiaojuan Qi
Jiaya Jia
3DPCViT
121
278
0
28 Mar 2022
Beyond Masking: Demystifying Token-Based Pre-Training for Vision
  Transformers
Beyond Masking: Demystifying Token-Based Pre-Training for Vision Transformers
Yunjie Tian
Lingxi Xie
Jiemin Fang
Mengnan Shi
Junran Peng
Xiaopeng Zhang
Jianbin Jiao
Qi Tian
QiXiang Ye
83
20
0
27 Mar 2022
Transformers Meet Visual Learning Understanding: A Comprehensive Review
Transformers Meet Visual Learning Understanding: A Comprehensive Review
Yuting Yang
Licheng Jiao
Xuantong Liu
Fan Liu
Shuyuan Yang
Zhixi Feng
Xu Tang
ViTMedIm
120
28
0
24 Mar 2022
Focal Modulation Networks
Focal Modulation Networks
Jianwei Yang
Chunyuan Li
Xiyang Dai
Lu Yuan
Jianfeng Gao
3DPC
126
281
0
22 Mar 2022
ScalableViT: Rethinking the Context-oriented Generalization of Vision
  Transformer
ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer
Rui Yang
Hailong Ma
Jie Wu
Yansong Tang
Xuefeng Xiao
Min Zheng
Xiu Li
ViT
160
57
0
21 Mar 2022
V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision
  Transformer
V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer
Runsheng Xu
Hao Xiang
Zhengzhong Tu
Xin Xia
Ming-Hsuan Yang
Jiaqi Ma
ViT
228
385
0
20 Mar 2022
Patch-Fool: Are Vision Transformers Always Robust Against Adversarial Perturbations?
Patch-Fool: Are Vision Transformers Always Robust Against Adversarial Perturbations?
Y. Fu
Shunyao Zhang
Shan-Hung Wu
Cheng Wan
Yingyan Lin
AAML
122
67
0
16 Mar 2022
Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs
Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs
Xiaohan Ding
Xinming Zhang
Yi Zhou
Jungong Han
Guiguang Ding
Jian Sun
VLM
153
557
0
13 Mar 2022
Active Token Mixer
Active Token Mixer
Guoqiang Wei
Zhizheng Zhang
Cuiling Lan
Yan Lu
Zhibo Chen
57
16
0
11 Mar 2022
Visualizing and Understanding Patch Interactions in Vision Transformer
Visualizing and Understanding Patch Interactions in Vision Transformer
Jie Ma
Yalong Bai
Bineng Zhong
Wei Zhang
Ting Yao
Tao Mei
ViT
62
36
0
11 Mar 2022
Dynamic Group Transformer: A General Vision Transformer Backbone with
  Dynamic Group Attention
Dynamic Group Transformer: A General Vision Transformer Backbone with Dynamic Group Attention
Kai Liu
Tianyi Wu
Cong Liu
Guodong Guo
ViT
82
17
0
08 Mar 2022
Protecting Celebrities from DeepFake with Identity Consistency
  Transformer
Protecting Celebrities from DeepFake with Identity Consistency Transformer
Xiaoyi Dong
Jianmin Bao
Dongdong Chen
Ting Zhang
Weiming Zhang
Nenghai Yu
Dong Chen
Fang Wen
B. Guo
ViT
141
123
0
02 Mar 2022
TransKD: Transformer Knowledge Distillation for Efficient Semantic
  Segmentation
TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation
R. Liu
Kailun Yang
Alina Roitberg
Jiaming Zhang
Kunyu Peng
Huayao Liu
Yaonan Wang
Rainer Stiefelhagen
ViT
91
38
0
27 Feb 2022
Towards an Analytical Definition of Sufficient Data
Towards an Analytical Definition of Sufficient Data
Adam Byerly
T. Kalganova
110
4
0
07 Feb 2022
Hydra: A Real-time Spatial Perception System for 3D Scene Graph
  Construction and Optimization
Hydra: A Real-time Spatial Perception System for 3D Scene Graph Construction and Optimization
Nathan Hughes
Yun Chang
Luca Carlone
3DPC
202
155
0
31 Jan 2022
BOAT: Bilateral Local Attention Vision Transformer
BOAT: Bilateral Local Attention Vision Transformer
Tan Yu
Gangming Zhao
Ping Li
Yizhou Yu
ViT
96
27
0
31 Jan 2022
UniFormer: Unifying Convolution and Self-attention for Visual
  Recognition
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li
Yali Wang
Junhao Zhang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
243
384
0
24 Jan 2022
Dual-Flattening Transformers through Decomposed Row and Column Queries
  for Semantic Segmentation
Dual-Flattening Transformers through Decomposed Row and Column Queries for Semantic Segmentation
Ying Wang
C. Ho
Wenju Xu
Ziwei Xuan
Xudong Liu
Guo-Jun Qi
ViT
45
5
0
22 Jan 2022
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient
  Long-Term Video Recognition
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition
Chao-Yuan Wu
Yanghao Li
K. Mangalam
Haoqi Fan
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
129
201
0
20 Jan 2022
SwinUNet3D -- A Hierarchical Architecture for Deep Traffic Prediction
  using Shifted Window Transformers
SwinUNet3D -- A Hierarchical Architecture for Deep Traffic Prediction using Shifted Window Transformers
Alabi Bojesomo
Hasan Al Marzouqi
P. Liatsis
ViT
56
6
0
17 Jan 2022
Spectral Compressive Imaging Reconstruction Using Convolution and
  Contextual Transformer
Spectral Compressive Imaging Reconstruction Using Convolution and Contextual Transformer
Lishun Wang
Zong-Jhe Wu
Yong Zhong
Xin Yuan
124
19
0
15 Jan 2022
UniFormer: Unified Transformer for Efficient Spatiotemporal
  Representation Learning
UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning
Kunchang Li
Yali Wang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
145
254
0
12 Jan 2022
Previous
123456789
Next