ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2107.00652
  4. Cited By
CSWin Transformer: A General Vision Transformer Backbone with
  Cross-Shaped Windows
v1v2v3 (latest)

CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows

1 July 2021
Xiaoyi Dong
Jianmin Bao
Dongdong Chen
Weiming Zhang
Nenghai Yu
Lu Yuan
Dong Chen
B. Guo
    ViT
ArXiv (abs)PDFHTMLGithub (569★)

Papers citing "CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows"

50 / 440 papers shown
Title
Enhancing Efficiency in Vision Transformer Networks: Design Techniques
  and Insights
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
Amirhossein Kazerouni
Ilker Hacihaliloglu
Dorit Merhof
99
7
0
28 Mar 2024
DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
Donghyun Kim
Byeongho Heo
Dongyoon Han
87
17
0
28 Mar 2024
ViTAR: Vision Transformer with Any Resolution
ViTAR: Vision Transformer with Any Resolution
Qihang Fan
Quanzeng You
Xiaotian Han
Yongfei Liu
Yunzhe Tao
Huaibo Huang
Ran He
Hongxia Yang
ViT
92
16
0
27 Mar 2024
Heracles: A Hybrid SSM-Transformer Model for High-Resolution Image and
  Time-Series Analysis
Heracles: A Hybrid SSM-Transformer Model for High-Resolution Image and Time-Series Analysis
Badri N. Patro
Suhas Ranganath
Vinay P. Namboodiri
Vijay Srinivas Agneeswaran
99
3
0
26 Mar 2024
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Chenhongyi Yang
Zehui Chen
Miguel Espinosa
Linus Ericsson
Zhenyu Wang
Jiaming Liu
Elliot J. Crowley
Mamba
123
99
0
26 Mar 2024
CurbNet: Curb Detection Framework Based on LiDAR Point Cloud Segmentation
CurbNet: Curb Detection Framework Based on LiDAR Point Cloud Segmentation
Guoyang Zhao
Fulong Ma
Weiqing Qi
Yuxuan Liu
Ming-Yuan Liu
Jun Ma
99
5
0
25 Mar 2024
PaPr: Training-Free One-Step Patch Pruning with Lightweight ConvNets for
  Faster Inference
PaPr: Training-Free One-Step Patch Pruning with Lightweight ConvNets for Faster Inference
Tanvir Mahmud
Burhaneddin Yaman
Chun-Hao Liu
Diana Marculescu
125
3
0
24 Mar 2024
SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate
  Time series
SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series
Badri N. Patro
Vijay Srinivas Agneeswaran
Mamba
121
57
0
22 Mar 2024
ParFormer: Vision Transformer Baseline with Parallel Local Global Token
  Mixer and Convolution Attention Patch Embedding
ParFormer: Vision Transformer Baseline with Parallel Local Global Token Mixer and Convolution Attention Patch Embedding
Novendra Setyawan
Ghufron Wahyu Kurniawan
Chi-Chia Sun
Jun-Wei Hsieh
Hui-Kai Su
W. Kuo
ViTMoE
94
0
0
22 Mar 2024
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
Ting Yao
Yehao Li
Yingwei Pan
Tao Mei
ViT
85
21
0
18 Mar 2024
TCNet: Continuous Sign Language Recognition from Trajectories and
  Correlated Regions
TCNet: Continuous Sign Language Recognition from Trajectories and Correlated Regions
Hui Lu
A. A. Salah
Ronald Poppe
SLR
73
6
0
18 Mar 2024
Neural Markov Random Field for Stereo Matching
Neural Markov Random Field for Stereo Matching
Tongfan Guan
Chen Wang
Yunchun Liu
3DV
70
26
0
17 Mar 2024
Multi-criteria Token Fusion with One-step-ahead Attention for Efficient
  Vision Transformers
Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers
Sanghyeok Lee
Joonmyung Choi
Hyunwoo J. Kim
ViT
82
10
0
15 Mar 2024
DITTO: Dual and Integrated Latent Topologies for Implicit 3D
  Reconstruction
DITTO: Dual and Integrated Latent Topologies for Implicit 3D Reconstruction
Jaehyeok Shim
Kyungdon Joo
3DPC3DV
111
1
0
08 Mar 2024
HyenaPixel: Global Image Context with Convolutions
HyenaPixel: Global Image Context with Convolutions
Julian Spravil
Sebastian Houben
Sven Behnke
60
1
0
29 Feb 2024
Interactive Multi-Head Self-Attention with Linear Complexity
Interactive Multi-Head Self-Attention with Linear Complexity
Hankyul Kang
Ming-Hsuan Yang
Jongbin Ryu
55
1
0
27 Feb 2024
Multi-Human Mesh Recovery with Transformers
Multi-Human Mesh Recovery with Transformers
Zeyu Wang
Zhenzhen Weng
Serena Yeung-Levy
3DH
55
1
0
26 Feb 2024
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data
Shufan Li
Harkanwar Singh
Aditya Grover
Mamba
181
64
0
08 Feb 2024
Memory Consolidation Enables Long-Context Video Understanding
Memory Consolidation Enables Long-Context Video Understanding
Ivana Balavzević
Yuge Shi
Pinelopi Papalampidi
Rahma Chaabouni
Skanda Koppula
Olivier J. Hénaff
195
27
0
08 Feb 2024
A Survey on Transformer Compression
A Survey on Transformer Compression
Yehui Tang
Yunhe Wang
Jianyuan Guo
Zhijun Tu
Kai Han
Hailin Hu
Dacheng Tao
156
35
0
05 Feb 2024
TCI-Former: Thermal Conduction-Inspired Transformer for Infrared Small
  Target Detection
TCI-Former: Thermal Conduction-Inspired Transformer for Infrared Small Target Detection
Tianxiang Chen
Zhentao Tan
Qi Chu
Yue-bo Wu
Bin Liu
Nenghai Yu
120
16
0
03 Feb 2024
LIR: A Lightweight Baseline for Image Restoration
LIR: A Lightweight Baseline for Image Restoration
Dongqi Fan
Ting Yue
Xin Zhao
Renjing Xu
Liang Chang
85
0
0
02 Feb 2024
Generating Multi-Center Classifier via Conditional Gaussian Distribution
Generating Multi-Center Classifier via Conditional Gaussian Distribution
Zhemin Zhang
Xun Gong
65
0
0
29 Jan 2024
Self-supervised Video Object Segmentation with Distillation Learning of
  Deformable Attention
Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention
Quang-Trung Truong
Duc Thanh Nguyen
Binh-Son Hua
Sai-Kit Yeung
VOS
65
2
0
25 Jan 2024
VIPTR: A Vision Permutable Extractor for Fast and Efficient Scene Text
  Recognition
VIPTR: A Vision Permutable Extractor for Fast and Efficient Scene Text Recognition
Xianfu Cheng
Weixiao Zhou
Xiang Li
Xiaoming Chen
Jian Yang
Tongliang Li
Zhoujun Li
112
3
0
18 Jan 2024
GPT4Ego: Unleashing the Potential of Pre-trained Models for Zero-Shot
  Egocentric Action Recognition
GPT4Ego: Unleashing the Potential of Pre-trained Models for Zero-Shot Egocentric Action Recognition
Guangzhao Dai
Xiangbo Shu
Wenhao Wu
Rui Yan
Jiachao Zhang
VLM
115
7
0
18 Jan 2024
SymTC: A Symbiotic Transformer-CNN Net for Instance Segmentation of
  Lumbar Spine MRI
SymTC: A Symbiotic Transformer-CNN Net for Instance Segmentation of Lumbar Spine MRI
Jiasong Chen
Linchen Qian
Linhai Ma
Timur Urakov
Weiyong Gu
Liang Liang
MedIm
81
8
0
17 Jan 2024
Vision Mamba: Efficient Visual Representation Learning with
  Bidirectional State Space Model
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Lianghui Zhu
Bencheng Liao
Qian Zhang
Xinlong Wang
Wenyu Liu
Xinggang Wang
Mamba
125
817
0
17 Jan 2024
Fully Attentional Networks with Self-emerging Token Labeling
Fully Attentional Networks with Self-emerging Token Labeling
Bingyin Zhao
Zhiding Yu
Shiyi Lan
Yutao Cheng
A. Anandkumar
Yingjie Lao
Jose M. Alvarez
1.0K
6
0
08 Jan 2024
SeTformer is What You Need for Vision and Language
SeTformer is What You Need for Vision and Language
Pourya Shamsolmoali
Masoumeh Zareapoor
Eric Granger
Michael Felsberg
76
5
0
07 Jan 2024
BRAU-Net++: U-Shaped Hybrid CNN-Transformer Network for Medical Image
  Segmentation
BRAU-Net++: U-Shaped Hybrid CNN-Transformer Network for Medical Image Segmentation
Libin Lan
Pengzhou Cai
Lu Jiang
Xiaojuan Liu
Yongmei Li
Yudong Zhang
ViTMedIm
79
10
0
01 Jan 2024
PanGu-$π$: Enhancing Language Model Architectures via Nonlinearity
  Compensation
PanGu-πππ: Enhancing Language Model Architectures via Nonlinearity Compensation
Yunhe Wang
Hanting Chen
Yehui Tang
Tianyu Guo
Kai Han
...
Qinghua Xu
Qun Liu
Jun Yao
Chao Xu
Dacheng Tao
128
20
0
27 Dec 2023
Deformable Audio Transformer for Audio Event Detection
Deformable Audio Transformer for Audio Event Detection
Wentao Zhu
78
0
0
24 Dec 2023
ConDaFormer: Disassembled Transformer with Local Structure Enhancement
  for 3D Point Cloud Understanding
ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding
Lunhao Duan
Shanshan Zhao
Nan Xue
Biwei Huang
Gui-Song Xia
Dacheng Tao
ViT
132
20
0
18 Dec 2023
Agent Attention: On the Integration of Softmax and Linear Attention
Agent Attention: On the Integration of Softmax and Linear Attention
Dongchen Han
Tianzhu Ye
Yizeng Han
Zhuofan Xia
Siyuan Pan
Pengfei Wan
Shiji Song
Gao Huang
103
88
0
14 Dec 2023
Auto-Prox: Training-Free Vision Transformer Architecture Search via
  Automatic Proxy Discovery
Auto-Prox: Training-Free Vision Transformer Architecture Search via Automatic Proxy Discovery
Zimian Wei
Lujun Li
Peijie Dong
Zheng Hui
Anggeng Li
Menglong Lu
H. Pan
Zhiliang Tian
Dongsheng Li
ViT
73
17
0
14 Dec 2023
MaskConver: Revisiting Pure Convolution Model for Panoptic Segmentation
MaskConver: Revisiting Pure Convolution Model for Panoptic Segmentation
Abdullah Rashwan
Jiageng Zhang
A. Taalimi
Fan Yang
Xingyi Zhou
Chaochao Yan
Liang-Chieh Chen
Yeqing Li
ViT
117
5
0
11 Dec 2023
Transformer-based Selective Super-Resolution for Efficient Image
  Refinement
Transformer-based Selective Super-Resolution for Efficient Image Refinement
Tianyi Zhang
Kishore Kasichainula
Yaoxin Zhuo
Baoxin Li
Jae-sun Seo
Yu Cao
48
7
0
10 Dec 2023
The Counterattack of CNNs in Self-Supervised Learning: Larger Kernel
  Size might be All You Need
The Counterattack of CNNs in Self-Supervised Learning: Larger Kernel Size might be All You Need
Tianjin Huang
Tianlong Chen
Zhangyang Wang
Shiwei Liu
80
1
0
09 Dec 2023
SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference
SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference
Feng Wang
Jieru Mei
Alan Yuille
VLM
146
66
0
04 Dec 2023
Token Fusion: Bridging the Gap between Token Pruning and Token Merging
Token Fusion: Bridging the Gap between Token Pruning and Token Merging
Minchul Kim
Shangqian Gao
Yen-Chang Hsu
Yilin Shen
Hongxia Jin
98
42
0
02 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
170
0
0
01 Dec 2023
GeoDeformer: Geometric Deformable Transformer for Action Recognition
GeoDeformer: Geometric Deformable Transformer for Action Recognition
Jinhui Ye
Jiaming Zhou
Hui Xiong
Junwei Liang
ViT
48
1
0
29 Nov 2023
PEAN: A Diffusion-Based Prior-Enhanced Attention Network for Scene Text
  Image Super-Resolution
PEAN: A Diffusion-Based Prior-Enhanced Attention Network for Scene Text Image Super-Resolution
Zuoyan Zhao
Hui Xue
Pengfei Fang
Shipeng Zhu
DiffM
64
4
0
29 Nov 2023
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
Dai Shi
ViT
93
98
0
28 Nov 2023
Cross-level Attention with Overlapped Windows for Camouflaged Object
  Detection
Cross-level Attention with Overlapped Windows for Camouflaged Object Detection
Jiepan Li
Fangxiao Lu
Nan Xue
Zhuo Li
Hongyan Zhang
Wei He
88
2
0
28 Nov 2023
Advancing Vision Transformers with Group-Mix Attention
Advancing Vision Transformers with Group-Mix Attention
Chongjian Ge
Xiaohan Ding
Zhan Tong
Lichao Sun
Jiangliu Wang
Yibing Song
Ping Luo
180
18
0
26 Nov 2023
Bitformer: An efficient Transformer with bitwise operation-based
  attention for Big Data Analytics at low-cost low-precision devices
Bitformer: An efficient Transformer with bitwise operation-based attention for Big Data Analytics at low-cost low-precision devices
Gaoxiang Duan
Junkai Zhang
Xiaoying Zheng
Yongxin Zhu
63
2
0
22 Nov 2023
Deep Tensor Network
Deep Tensor Network
Yifan Zhang
120
0
0
18 Nov 2023
Vision Big Bird: Random Sparsification for Full Attention
Vision Big Bird: Random Sparsification for Full Attention
Zhemin Zhang
Xun Gong
ViT
70
1
0
10 Nov 2023
Previous
123456789
Next