Rethinking and Improving Relative Position Encoding for Vision Transformer

29 July 2021
Kan Wu
Houwen Peng
Minghao Chen
Jianlong Fu
Hongyang Chao
    ViT

Papers citing "Rethinking and Improving Relative Position Encoding for Vision Transformer"

50 / 164 papers shown
Title
Position Embedding Needs an Independent Layer Normalization
Runyi Yu
Zhennan Wang
Yinhuai Wang
Kehan Li
Yian Zhao
Jian Zhang
Guoli Song
Jie Chen
31
1
0
10 Dec 2022
Gaussian Radar Transformer for Semantic Segmentation in Noisy Radar Data
Matthias Zeller
Jens Behley
Michael Heidingsfeld
C. Stachniss
32
23
0
07 Dec 2022
Relation-Aware Language-Graph Transformer for Question Answering
Jinyoung Park
Hyeong Kyu Choi
Juyeon Ko
Hyeon-ju Park
Ji-Hoon Kim
Jisu Jeong
Kyungmin Kim
Hyunwoo J. Kim
KELM
LMTD
ViT
15
10
0
02 Dec 2022
ResFormer: Scaling ViTs with Multi-Resolution Training
Rui Tian
Zuxuan Wu
Qiuju Dai
Hang-Rui Hu
Yu Qiao
Yu-Gang Jiang
ViT
21
33
0
01 Dec 2022
AirFormer: Predicting Nationwide Air Quality in China with Transformers
Keli Zhang
Yutong Xia
Songyu Ke
Yiwei Wang
Qingsong Wen
Junbo Zhang
Yu Zheng
R. Zimmermann
AI4TS
AI4CE
24
106
0
29 Nov 2022
Beyond Ensemble Averages: Leveraging Climate Model Ensembles for Subseasonal Forecasting
Elena Orlova
Haokun Liu
Raphael Rossellini
B. Cash
Rebecca Willett
26
3
0
29 Nov 2022
Meta Architecture for Point Cloud Analysis
Haojia Lin
Xiawu Zheng
Lijiang Li
Rongrong Ji
Sha Wang
Yan Wang
Yonghong Tian
Rongrong Ji
3DPC
33
45
0
26 Nov 2022
Language Conditioned Spatial Relation Reasoning for 3D Object Grounding
Shizhe Chen
Pierre-Louis Guhur
Makarand Tapaswi
Cordelia Schmid
Ivan Laptev
51
75
0
17 Nov 2022
Hypergraph Transformer for Skeleton-based Action Recognition
Yuxuan Zhou
Zhi-Qi Cheng
Chong Li
Yanwen Fang
Yifeng Geng
Xuansong Xie
M. Keuper
ViT
32
52
0
17 Nov 2022
Parameter-Efficient Transformer with Hybrid Axial-Attention for Medical Image Segmentation
Yiyue Hu
Lei Zhang
Nan Mu
Leijun Liu
ViT
MedIm
22
1
0
17 Nov 2022
MogaNet: Multi-order Gated Aggregation Network
Siyuan Li
Zedong Wang
Zicheng Liu
Cheng Tan
Haitao Lin
Di Wu
Zhiyuan Chen
Jiangbin Zheng
Stan Z. Li
26
55
0
07 Nov 2022
Data Level Lottery Ticket Hypothesis for Vision Transformers
Xuan Shen
Zhenglun Kong
Minghai Qin
Peiyan Dong
Geng Yuan
Xin Meng
Hao Tang
Xiaolong Ma
Yanzhi Wang
30
6
0
02 Nov 2022
Adversarial Pretraining of Self-Supervised Deep Networks: Past, Present and Future
Guo-Jun Qi
M. Shah
SSL
23
8
0
23 Oct 2022
Sequence and Circle: Exploring the Relationship Between Patches
Zhengyang Yu
Jochen Triesch
ViT
31
0
0
18 Oct 2022
Dense-TNT: Efficient Vehicle Type Classification Neural Network Using Satellite Imagery
Ruikang Luo
Yaofeng Song
Haiying Zhao
Yicheng Zhang
Yi Zhang
Nanbin Zhao
Liping Huang
Rong Su
ViT
16
11
0
27 Sep 2022
PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision Transformers
Zhikai Li
Mengjuan Chen
Junrui Xiao
Qingyi Gu
ViT
MQ
43
33
0
13 Sep 2022
FocusFormer: Focusing on What We Need via Architecture Sampler
Jing Liu
Jianfei Cai
Bohan Zhuang
35
7
0
23 Aug 2022
SoMoFormer: Social-Aware Motion Transformer for Multi-Person Motion Prediction
Xiaogang Peng
Yaodi Shen
Haoran Wang
Binling Nie
Yigang Wang
Zizhao Wu
ViT
25
6
0
19 Aug 2022
giMLPs: Gate with Inhibition Mechanism in MLPs
Cheng Kang
Jindřich Prokop
Lei Tong
Huiyu Zhou
Yong Hu
Daniel Novak
29
0
0
01 Aug 2022
TinyViT: Fast Pretraining Distillation for Small Vision Transformers
Kan Wu
Jinnian Zhang
Houwen Peng
Mengchen Liu
Bin Xiao
Jianlong Fu
Lu Yuan
ViT
21
246
0
21 Jul 2022
Parameterization of Cross-Token Relations with Relative Positional Encoding for Vision MLP
Zhicai Wang
Y. Hao
Xingyu Gao
Hao Zhang
Shuo Wang
Tingting Mu
Xiangnan He
21
8
0
15 Jul 2022
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
Zhikai Li
Qingyi Gu
MQ
57
95
0
04 Jul 2022
LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs
Yukang Chen
Jianhui Liu
Xinming Zhang
Xiaojuan Qi
Jiaya Jia
53
85
0
21 Jun 2022
Online Segmentation of LiDAR Sequences: Dataset and Algorithm
Romain Loiseau
Mathieu Aubry
Loïc Landrieu
3DPC
24
15
0
16 Jun 2022
SP-ViT: Learning 2D Spatial Priors for Vision Transformers
Yuxuan Zhou
Wangmeng Xiang
Chong Li
Biao Wang
Xihan Wei
Lei Zhang
M. Keuper
Xia Hua
ViT
34
15
0
15 Jun 2022
Peripheral Vision Transformer
Juhong Min
Yucheng Zhao
Chong Luo
Minsu Cho
ViT
MDE
32
30
0
14 Jun 2022
Positional Label for Self-Supervised Vision Transformer
Zhemin Zhang
Xun Gong
ViT
MDE
25
6
0
10 Jun 2022
Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives
Jun Li
Junyu Chen
Yucheng Tang
Ce Wang
Bennett A. Landman
S. K. Zhou
ViT
OOD
MedIm
23
21
0
02 Jun 2022
Modeling Image Composition for Complex Scene Generation
Zuopeng Yang
Daqing Liu
Chaoyue Wang
J. Yang
Dacheng Tao
ViT
36
50
0
02 Jun 2022
Vision GNN: An Image is Worth Graph of Nodes
Kai Han
Yunhe Wang
Jianyuan Guo
Yehui Tang
Enhua Wu
GNN
3DH
17
352
0
01 Jun 2022
Efficient Self-supervised Vision Pretraining with Local Masked Reconstruction
Jun Chen
Ming Hu
Boyang Albert Li
Mohamed Elhoseiny
47
36
0
01 Jun 2022
Flexible Diffusion Modeling of Long Videos
William Harvey
Saeid Naderiparizi
Vaden Masrani
Christian D. Weilbach
Frank Wood
DiffM
BDL
VGen
176
285
0
23 May 2022
KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation
Ta-Chung Chi
Ting-Han Fan
Peter J. Ramadge
Alexander I. Rudnicky
47
65
0
20 May 2022
BabyNet: Residual Transformer Module for Birth Weight Prediction on Fetal Ultrasound Video
Szymon Płotka
Michal K. Grzeszczyk
R. Brawura-Biskupski-Samaha
P. Gutaj
M. Lipa
Tomasz Trzciński
Arkadiusz Sitek
3DH
MedIm
11
17
0
19 May 2022
MiniViT: Compressing Vision Transformers with Weight Multiplexing
Jinnian Zhang
Houwen Peng
Kan Wu
Mengchen Liu
Bin Xiao
Jianlong Fu
Lu Yuan
ViT
28
124
0
14 Apr 2022
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
51
242
0
07 Apr 2022
Stratified Transformer for 3D Point Cloud Segmentation
Xin Lai
Jianhui Liu
Li Jiang
Liwei Wang
Hengshuang Zhao
Shu Liu
Xiaojuan Qi
Jiaya Jia
3DPC
ViT
29
262
0
28 Mar 2022
Visual Abductive Reasoning
Chen Liang
Wenguan Wang
Tianfei Zhou
Yi Yang
LRM
26
38
0
26 Mar 2022
Transformers Meet Visual Learning Understanding: A Comprehensive Review
Yuting Yang
Licheng Jiao
Xuantong Liu
F. Liu
Shuyuan Yang
Zhixi Feng
Xu Tang
ViT
MedIm
27
28
0
24 Mar 2022
PETR: Position Embedding Transformation for Multi-View 3D Object Detection
Yingfei Liu
Tiancai Wang
Xinming Zhang
Jian Sun
3DPC
43
526
0
10 Mar 2022
Patch Similarity Aware Data-Free Quantization for Vision Transformers
Zhikai Li
Liping Ma
Mengjuan Chen
Junrui Xiao
Qingyi Gu
MQ
ViT
19
44
0
04 Mar 2022
Multi-Tailed Vision Transformer for Efficient Inference
Yunke Wang
Bo Du
Wenyuan Wang
Chang Xu
ViT
213
6
0
03 Mar 2022
Recent Advances in Vision Transformer: A Survey and Outlook of Recent Work
Khawar Islam
ViT
28
45
0
03 Mar 2022
A Unified Query-based Paradigm for Point Cloud Understanding
Zetong Yang
Li Jiang
Yanan Sun
Bernt Schiele
Jiaya Jia
3DPC
25
38
0
02 Mar 2022
Hilbert Flattening: a Locality-Preserving Matrix Unfolding Method for Visual Discrimination
Qingsong Zhao
Shuguang Dou
Zhipeng Zhou
Yangguang Li
Yin Wang
Yu Qiao
Cairong Zhao
22
3
0
21 Feb 2022
Mixing and Shifting: Exploiting Global and Local Dependencies in Vision MLPs
Huangjie Zheng
Pengcheng He
Weizhu Chen
Mingyuan Zhou
22
14
0
14 Feb 2022
VAQF: Fully Automatic Software-Hardware Co-Design Framework for Low-Bit Vision Transformer
Mengshu Sun
Haoyu Ma
Guoliang Kang
Yi Ding
Tianlong Chen
Xiaolong Ma
Zhangyang Wang
Yanzhi Wang
ViT
33
45
0
17 Jan 2022
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
22
103
0
16 Jan 2022
Scene-Adaptive Attention Network for Crowd Counting
Xing Wei
Yuanrui Kang
Jihao Yang
Yunfeng Qiu
Dahu Shi
Wenming Tan
Yihong Gong
ViT
27
18
0
31 Dec 2021
APRIL: Finding the Achilles' Heel on Privacy for Vision Transformers
Jiahao Lu
Xi Sheryl Zhang
Tianli Zhao
Xiangyu He
Jian Cheng
ViT
PILM
25
22
0
28 Dec 2021