ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2102.10882
  4. Cited By
Conditional Positional Encodings for Vision Transformers

Conditional Positional Encodings for Vision Transformers

22 February 2021
Xiangxiang Chu
Zhi Tian
Bo-Wen Zhang
Xinlong Wang
Chunhua Shen
    ViT
ArXivPDFHTML

Papers citing "Conditional Positional Encodings for Vision Transformers"

24 / 124 papers shown
Title
Global Vision Transformer Pruning with Hessian-Aware Saliency
Global Vision Transformer Pruning with Hessian-Aware Saliency
Huanrui Yang
Hongxu Yin
Maying Shen
Pavlo Molchanov
Hai Helen Li
Jan Kautz
ViT
30
39
0
10 Oct 2021
Poformer: A simple pooling transformer for speaker verification
Poformer: A simple pooling transformer for speaker verification
Yufeng Ma
Yiwei Ding
Miao Zhao
Yu Zheng
Min Liu
Minqiang Xu
ViT
21
2
0
10 Oct 2021
UniNet: Unified Architecture Search with Convolution, Transformer, and
  MLP
UniNet: Unified Architecture Search with Convolution, Transformer, and MLP
Jihao Liu
Hongsheng Li
Guanglu Song
Xin Huang
Yu Liu
ViT
37
35
0
08 Oct 2021
Adversarial Robustness Comparison of Vision Transformer and MLP-Mixer to
  CNNs
Adversarial Robustness Comparison of Vision Transformer and MLP-Mixer to CNNs
Philipp Benz
Soomin Ham
Chaoning Zhang
Adil Karjauv
In So Kweon
AAML
ViT
47
78
0
06 Oct 2021
CCTrans: Simplifying and Improving Crowd Counting with Transformer
CCTrans: Simplifying and Improving Crowd Counting with Transformer
Ye Tian
Xiangxiang Chu
Hongpeng Wang
ViT
21
75
0
29 Sep 2021
Hire-MLP: Vision MLP via Hierarchical Rearrangement
Hire-MLP: Vision MLP via Hierarchical Rearrangement
Jianyuan Guo
Yehui Tang
Kai Han
Xinghao Chen
Han Wu
Chao Xu
Chang Xu
Yunhe Wang
46
105
0
30 Aug 2021
A Battle of Network Structures: An Empirical Study of CNN, Transformer,
  and MLP
A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP
Yucheng Zhao
Guangting Wang
Chuanxin Tang
Chong Luo
Wenjun Zeng
Zhengjun Zha
35
69
0
30 Aug 2021
Focal Self-attention for Local-Global Interactions in Vision
  Transformers
Focal Self-attention for Local-Global Interactions in Vision Transformers
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
42
428
0
01 Jul 2021
Rethinking Token-Mixing MLP for MLP-based Vision Backbone
Rethinking Token-Mixing MLP for MLP-based Vision Backbone
Tan Yu
Xu Li
Yunfeng Cai
Mingming Sun
Ping Li
45
26
0
28 Jun 2021
PVT v2: Improved Baselines with Pyramid Vision Transformer
PVT v2: Improved Baselines with Pyramid Vision Transformer
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
AI4TS
50
1,607
0
25 Jun 2021
P2T: Pyramid Pooling Transformer for Scene Understanding
P2T: Pyramid Pooling Transformer for Scene Understanding
Yu-Huan Wu
Yun-Hai Liu
Xin Zhan
Mingg-Ming Cheng
ViT
29
219
0
22 Jun 2021
A Survey of Transformers
A Survey of Transformers
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
50
1,088
0
08 Jun 2021
Fully Transformer Networks for Semantic Image Segmentation
Fully Transformer Networks for Semantic Image Segmentation
Sitong Wu
Tianyi Wu
Fangjian Lin
Sheng Tian
Guodong Guo
ViT
34
39
0
08 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
65
329
0
07 Jun 2021
TransMIL: Transformer based Correlated Multiple Instance Learning for
  Whole Slide Image Classification
TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification
Zhucheng Shao
Hao Bian
Yang Chen
Yifeng Wang
Jian Zhang
Xiangyang Ji
Yongbing Zhang
ViT
MedIm
37
635
0
02 Jun 2021
SegFormer: Simple and Efficient Design for Semantic Segmentation with
  Transformers
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Enze Xie
Wenhai Wang
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
ViT
44
4,836
0
31 May 2021
Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model
Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model
Jiangning Zhang
Chao Xu
Jian Li
Wenzhou Chen
Yabiao Wang
Ying Tai
Shuo Chen
Chengjie Wang
Feiyue Huang
Yong Liu
32
22
0
31 May 2021
Dual-stream Network for Visual Recognition
Dual-stream Network for Visual Recognition
Mingyuan Mao
Renrui Zhang
Honghui Zheng
Peng Gao
Teli Ma
Yan Peng
Errui Ding
Baochang Zhang
Shumin Han
ViT
25
63
0
31 May 2021
BossNAS: Exploring Hybrid CNN-transformers with Block-wisely
  Self-supervised Neural Architecture Search
BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search
Changlin Li
Tao Tang
Guangrun Wang
Jiefeng Peng
Bing Wang
Xiaodan Liang
Xiaojun Chang
ViT
46
105
0
23 Mar 2021
Scalable Vision Transformers with Hierarchical Pooling
Scalable Vision Transformers with Hierarchical Pooling
Zizheng Pan
Bohan Zhuang
Jing Liu
Haoyu He
Jianfei Cai
ViT
27
126
0
19 Mar 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction
  without Convolutions
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
289
3,623
0
24 Feb 2021
LambdaNetworks: Modeling Long-Range Interactions Without Attention
LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello
278
179
0
17 Feb 2021
Transformers in Vision: A Survey
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
227
2,430
0
04 Jan 2021
How Much Position Information Do Convolutional Neural Networks Encode?
How Much Position Information Do Convolutional Neural Networks Encode?
Md. Amirul Islam
Sen Jia
Neil D. B. Bruce
SSL
205
344
0
22 Jan 2020
Previous
123