2304.03282
Cited By
Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention
6 April 2023
Mingyu Ding
Yikang Shen
Lijie Fan
Zhenfang Chen
Z. Chen
Ping Luo
J. Tenenbaum
Chuang Gan
ViT
Papers citing
"Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention"
50 / 54 papers shown
Title
HGFormer: Topology-Aware Vision Transformer with HyperGraph Learning
Hao Wang
Shuo Zhang
Biao Leng
ViT
260
1
0
03 Apr 2025
Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer
Wang Zeng
Sheng Jin
Wentao Liu
Chen Qian
Ping Luo
Wanli Ouyang
Xiaogang Wang
ViT
75
127
0
19 Apr 2022
Neighborhood Attention Transformer
Ali Hassani
Steven Walton
Jiachen Li
Shen Li
Humphrey Shi
ViT
AI4TS
92
273
0
14 Apr 2022
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
127
250
0
07 Apr 2022
Deep Hierarchical Semantic Segmentation
Liulei Li
Tianfei Zhou
Wenguan Wang
Jianwu Li
Yi Yang
SSeg
567
136
0
27 Mar 2022
Context Autoencoder for Self-Supervised Representation Learning
Xiaokang Chen
Mingyu Ding
Xiaodi Wang
Ying Xin
Shentong Mo
Yunhao Wang
Shumin Han
Ping Luo
Gang Zeng
Jingdong Wang
SSL
119
396
0
07 Feb 2022
BOAT: Bilateral Local Attention Vision Transformer
Tan Yu
Gangming Zhao
Ping Li
Yizhou Yu
ViT
70
27
0
31 Jan 2022
Patches Are All You Need?
Asher Trockman
J. Zico Kolter
ViT
260
410
0
24 Jan 2022
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li
Yali Wang
Junhao Zhang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
191
379
0
24 Jan 2022
Pale Transformer: A General Vision Transformer Backbone with Pale-Shaped Attention
Sitong Wu
Tianyi Wu
Haoru Tan
G. Guo
ViT
58
71
0
28 Dec 2021
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chao-Yuan Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
153
693
0
02 Dec 2021
iBOT: Image BERT Pre-Training with Online Tokenizer
Jinghao Zhou
Chen Wei
Huiyu Wang
Wei Shen
Cihang Xie
Alan Yuille
Tao Kong
88
740
0
15 Nov 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
473
7,819
0
11 Nov 2021
Unsupervised Part Discovery from Contrastive Reconstruction
Subhabrata Choudhury
Iro Laina
Christian Rupprecht
Andrea Vedaldi
OCL
SSL
235
62
0
11 Nov 2021
Rethinking and Improving Relative Position Encoding for Vision Transformer
Kan Wu
Houwen Peng
Minghao Chen
Jianlong Fu
Hongyang Chao
ViT
88
338
0
29 Jul 2021
PVT v2: Improved Baselines with Pyramid Vision Transformer
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
AI4TS
109
1,676
0
25 Jun 2021
Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training
Hongwei Xue
Yupan Huang
Bei Liu
Houwen Peng
Jianlong Fu
Houqiang Li
Jiebo Luo
77
89
0
25 Jun 2021
VOLO: Vision Outlooker for Visual Recognition
Li Yuan
Qibin Hou
Zihang Jiang
Jiashi Feng
Shuicheng Yan
ViT
116
327
0
24 Jun 2021
XCiT: Cross-Covariance Image Transformers
Alaaeldin El-Nouby
Hugo Touvron
Mathilde Caron
Piotr Bojanowski
Matthijs Douze
...
Ivan Laptev
Natalia Neverova
Gabriel Synnaeve
Jakob Verbeek
Hervé Jégou
ViT
146
513
0
17 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
289
2,841
0
15 Jun 2021
HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers
Mingyu Ding
Xiaochen Lian
Linjie Yang
Peng Wang
Xiaojie Jin
Zhiwu Lu
Ping Luo
ViT
74
61
0
11 Jun 2021
Scaling Vision with Sparse Mixture of Experts
C. Riquelme
J. Puigcerver
Basil Mustafa
Maxim Neumann
Rodolphe Jenatton
André Susano Pinto
Daniel Keysers
N. Houlsby
MoE
112
609
0
10 Jun 2021
Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer
Zilong Huang
Youcheng Ben
Guozhong Luo
Pei Cheng
Gang Yu
Bin Fu
ViT
79
183
0
07 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
154
338
0
07 Jun 2021
Glance-and-Gaze Vision Transformer
Qihang Yu
Yingda Xia
Yutong Bai
Yongyi Lu
Alan Yuille
Wei Shen
ViT
65
76
0
04 Jun 2021
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
Yongming Rao
Wenliang Zhao
Benlin Liu
Jiwen Lu
Jie Zhou
Cho-Jui Hsieh
ViT
98
707
0
03 Jun 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
708
6,127
0
29 Apr 2021
HOTR: End-to-End Human-Object Interaction Detection with Transformers
Bumsoo Kim
Junhyun Lee
Jaewoo Kang
Eun-Sol Kim
Hyunwoo J. Kim
ViT
86
256
0
28 Apr 2021
All Tokens Matter: Token Labeling for Training Better Vision Transformers
Zihang Jiang
Qibin Hou
Li Yuan
Daquan Zhou
Yujun Shi
Xiaojie Jin
Anran Wang
Jiashi Feng
ViT
97
209
0
22 Apr 2021
An Empirical Study of Training Self-Supervised Vision Transformers
Xinlei Chen
Saining Xie
Kaiming He
ViT
159
1,869
0
05 Apr 2021
Rethinking Spatial Dimensions of Vision Transformers
Byeongho Heo
Sangdoo Yun
Dongyoon Han
Sanghyuk Chun
Junsuk Choe
Seong Joon Oh
ViT
512
582
0
30 Mar 2021
CvT: Introducing Convolutions to Vision Transformers
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
154
1,917
0
29 Mar 2021
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Chun-Fu Chen
Quanfu Fan
Rameswar Panda
ViT
71
1,483
0
27 Mar 2021
Scaling Local Self-Attention for Parameter Efficient Visual Backbones
Ashish Vaswani
Prajit Ramachandran
A. Srinivas
Niki Parmar
Blake A. Hechtman
Jonathon Shlens
92
403
0
23 Mar 2021
BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search
Changlin Li
Tao Tang
Guangrun Wang
Jiefeng Peng
Bing Wang
Xiaodan Liang
Xiaojun Chang
ViT
110
107
0
23 Mar 2021
DeepViT: Towards Deeper Vision Transformer
Daquan Zhou
Bingyi Kang
Xiaojie Jin
Linjie Yang
Xiaochen Lian
Zihang Jiang
Qibin Hou
Jiashi Feng
ViT
102
523
0
22 Mar 2021
Scalable Vision Transformers with Hierarchical Pooling
Zizheng Pan
Bohan Zhuang
Jing Liu
Haoyu He
Jianfei Cai
ViT
73
129
0
19 Mar 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
533
3,734
0
24 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
389
2,063
0
09 Feb 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Tsung-Yi Lin
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
360
994
0
27 Jan 2021
Unsupervised Part Discovery by Unsupervised Disentanglement
Sandro Braun
Patrick Esser
Bjorn Ommer
OCL
42
4
0
09 Sep 2020
DeepUSPS: Deep Robust Unsupervised Saliency Prediction With Self-Supervision
D. Nguyen
Maximilian Dax
Chaithanya Kumar Mummadi
Thi Phuong Nhung Ngo
T. Nguyen
Zhongyu Lou
Thomas Brox
66
70
0
28 Sep 2019
Stacked Capsule Autoencoders
Adam R. Kosiorek
S. Sabour
Yee Whye Teh
Geoffrey E. Hinton
OCL
51
262
0
17 Jun 2019
SCOPS: Self-Supervised Co-Part Segmentation
Wei-Chih Hung
Varun Jampani
Sifei Liu
Pavlo Molchanov
Ming-Hsuan Yang
Jan Kautz
77
141
0
03 May 2019
MultiGrain: a unified image embedding for classes and instances
Maxim Berman
Hervé Jégou
Andrea Vedaldi
Iasonas Kokkinos
Matthijs Douze
63
111
0
14 Feb 2019
Unsupervised Learning of Syntactic Structure with Invertible Neural Projections
Junxian He
Graham Neubig
Taylor Berg-Kirkpatrick
BDL
50
41
0
28 Aug 2018
CapsuleGAN: Generative Adversarial Capsule Network
Ayush Jaiswal
Wael AbdAlmageed
Yue Wu
Premkumar Natarajan
GAN
MedIm
51
159
0
17 Feb 2018
Dynamic Routing Between Capsules
S. Sabour
Nicholas Frosst
Geoffrey E. Hinton
180
4,604
0
26 Oct 2017
Focal Loss for Dense Object Detection
Tsung-Yi Lin
Priya Goyal
Ross B. Girshick
Kaiming He
Piotr Dollár
ObjD
127
2,998
0
07 Aug 2017
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
360
27,244
0
20 Mar 2017