ResearchTrend.AI

Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention

6 April 2023
Mingyu Ding
Yikang Shen
Lijie Fan
Zhenfang Chen
Z. Chen
Ping Luo
J. Tenenbaum
Chuang Gan
    ViT

Papers citing "Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention"

50 / 54 papers shown
Title
HGFormer: Topology-Aware Vision Transformer with HyperGraph Learning
Hao Wang
Shuo Zhang
Biao Leng
ViT
260
1
0
03 Apr 2025
Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer
Wang Zeng
Sheng Jin
Wentao Liu
Chao Qian
Ping Luo
Ouyang Wanli
Xiaogang Wang
ViT
75
127
0
19 Apr 2022
Neighborhood Attention Transformer
Ali Hassani
Steven Walton
Jiacheng Li
Shengjia Li
Humphrey Shi
ViT, AI4TS
92
273
0
14 Apr 2022
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
127
250
0
07 Apr 2022
Deep Hierarchical Semantic Segmentation
Liulei Li
Tianfei Zhou
Wenguan Wang
Jianwu Li
Yi Yang
SSeg
567
136
0
27 Mar 2022
Context Autoencoder for Self-Supervised Representation Learning
Xiaokang Chen
Mingyu Ding
Xiaodi Wang
Ying Xin
Shentong Mo
Yunhao Wang
Shumin Han
Ping Luo
Gang Zeng
Jingdong Wang
SSL
119
396
0
07 Feb 2022
BOAT: Bilateral Local Attention Vision Transformer
Tan Yu
Gangming Zhao
Ping Li
Yizhou Yu
ViT
70
27
0
31 Jan 2022
Patches Are All You Need?
Asher Trockman
J. Zico Kolter
ViT
260
410
0
24 Jan 2022
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li
Yali Wang
Junhao Zhang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
191
379
0
24 Jan 2022
Pale Transformer: A General Vision Transformer Backbone with Pale-Shaped Attention
Sitong Wu
Tianyi Wu
Hao Hao Tan
G. Guo
ViT
58
71
0
28 Dec 2021
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chaoxia Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
153
693
0
02 Dec 2021
iBOT: Image BERT Pre-Training with Online Tokenizer
Jinghao Zhou
Chen Wei
Huiyu Wang
Wei Shen
Cihang Xie
Alan Yuille
Tao Kong
88
740
0
15 Nov 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT, TPM
473
7,819
0
11 Nov 2021
Unsupervised Part Discovery from Contrastive Reconstruction
Subhabrata Choudhury
Iro Laina
Christian Rupprecht
Andrea Vedaldi
OCL, SSL
235
62
0
11 Nov 2021
Rethinking and Improving Relative Position Encoding for Vision Transformer
Kan Wu
Houwen Peng
Minghao Chen
Jianlong Fu
Hongyang Chao
ViT
88
338
0
29 Jul 2021
PVT v2: Improved Baselines with Pyramid Vision Transformer
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT, AI4TS
109
1,676
0
25 Jun 2021
Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training
Hongwei Xue
Yupan Huang
Bei Liu
Houwen Peng
Jianlong Fu
Houqiang Li
Jiebo Luo
77
89
0
25 Jun 2021
VOLO: Vision Outlooker for Visual Recognition
Li-xin Yuan
Qibin Hou
Zihang Jiang
Jiashi Feng
Shuicheng Yan
ViT
116
327
0
24 Jun 2021
XCiT: Cross-Covariance Image Transformers
Alaaeldin El-Nouby
Hugo Touvron
Mathilde Caron
Piotr Bojanowski
Matthijs Douze
...
Ivan Laptev
Natalia Neverova
Gabriel Synnaeve
Jakob Verbeek
Hervé Jégou
ViT
146
513
0
17 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
289
2,841
0
15 Jun 2021
HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers
Mingyu Ding
Xiaochen Lian
Linjie Yang
Peng Wang
Xiaojie Jin
Zhiwu Lu
Ping Luo
ViT
74
61
0
11 Jun 2021
Scaling Vision with Sparse Mixture of Experts
C. Riquelme
J. Puigcerver
Basil Mustafa
Maxim Neumann
Rodolphe Jenatton
André Susano Pinto
Daniel Keysers
N. Houlsby
MoE
112
609
0
10 Jun 2021
Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer
Zilong Huang
Youcheng Ben
Guozhong Luo
Pei Cheng
Gang Yu
Bin-Bin Fu
ViT
79
183
0
07 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
154
338
0
07 Jun 2021
Glance-and-Gaze Vision Transformer
Qihang Yu
Yingda Xia
Yutong Bai
Yongyi Lu
Alan Yuille
Wei Shen
ViT
65
76
0
04 Jun 2021
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
Yongming Rao
Wenliang Zhao
Benlin Liu
Jiwen Lu
Jie Zhou
Cho-Jui Hsieh
ViT
98
707
0
03 Jun 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
708
6,127
0
29 Apr 2021
HOTR: End-to-End Human-Object Interaction Detection with Transformers
Bumsoo Kim
Junhyun Lee
Jaewoo Kang
Eun-Sol Kim
Hyunwoo J. Kim
ViT
86
256
0
28 Apr 2021
All Tokens Matter: Token Labeling for Training Better Vision Transformers
Zihang Jiang
Qibin Hou
Li-xin Yuan
Daquan Zhou
Yujun Shi
Xiaojie Jin
Anran Wang
Jiashi Feng
ViT
97
209
0
22 Apr 2021
An Empirical Study of Training Self-Supervised Vision Transformers
Xinlei Chen
Saining Xie
Kaiming He
ViT
159
1,869
0
05 Apr 2021
Rethinking Spatial Dimensions of Vision Transformers
Byeongho Heo
Sangdoo Yun
Dongyoon Han
Sanghyuk Chun
Junsuk Choe
Seong Joon Oh
ViT
512
582
0
30 Mar 2021
CvT: Introducing Convolutions to Vision Transformers
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
154
1,917
0
29 Mar 2021
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Chun-Fu Chen
Quanfu Fan
Rameswar Panda
ViT
71
1,483
0
27 Mar 2021
Scaling Local Self-Attention for Parameter Efficient Visual Backbones
Ashish Vaswani
Prajit Ramachandran
A. Srinivas
Niki Parmar
Blake A. Hechtman
Jonathon Shlens
92
403
0
23 Mar 2021
BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search
Changlin Li
Tao Tang
Guangrun Wang
Jiefeng Peng
Bing Wang
Xiaodan Liang
Xiaojun Chang
ViT
110
107
0
23 Mar 2021
DeepViT: Towards Deeper Vision Transformer
Daquan Zhou
Bingyi Kang
Xiaojie Jin
Linjie Yang
Xiaochen Lian
Zihang Jiang
Qibin Hou
Jiashi Feng
ViT
102
523
0
22 Mar 2021
Scalable Vision Transformers with Hierarchical Pooling
Zizheng Pan
Bohan Zhuang
Jing Liu
Haoyu He
Jianfei Cai
ViT
73
129
0
19 Mar 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
533
3,734
0
24 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
389
2,063
0
09 Feb 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Tsung-Yi Lin
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
360
994
0
27 Jan 2021
Unsupervised Part Discovery by Unsupervised Disentanglement
Sandro Braun
Patrick Esser
Bjorn Ommer
OCL
42
4
0
09 Sep 2020
DeepUSPS: Deep Robust Unsupervised Saliency Prediction With Self-Supervision
D. Nguyen
Maximilian Dax
Chaithanya Kumar Mummadi
Thi Phuong Nhung Ngo
T. Nguyen
Zhongyu Lou
Thomas Brox
66
70
0
28 Sep 2019
Stacked Capsule Autoencoders
Adam R. Kosiorek
S. Sabour
Yee Whye Teh
Geoffrey E. Hinton
OCL
51
262
0
17 Jun 2019
SCOPS: Self-Supervised Co-Part Segmentation
Wei-Chih Hung
Varun Jampani
Sifei Liu
Pavlo Molchanov
Ming-Hsuan Yang
Jan Kautz
77
141
0
03 May 2019
MultiGrain: a unified image embedding for classes and instances
Maxim Berman
Hervé Jégou
Andrea Vedaldi
Iasonas Kokkinos
Matthijs Douze
63
111
0
14 Feb 2019
Unsupervised Learning of Syntactic Structure with Invertible Neural Projections
Junxian He
Graham Neubig
Taylor Berg-Kirkpatrick
BDL
50
41
0
28 Aug 2018
CapsuleGAN: Generative Adversarial Capsule Network
Ayush Jaiswal
Wael AbdAlmageed
Yue Wu
Premkumar Natarajan
GAN, MedIm
51
159
0
17 Feb 2018
Dynamic Routing Between Capsules
S. Sabour
Nicholas Frosst
Geoffrey E. Hinton
180
4,604
0
26 Oct 2017
Focal Loss for Dense Object Detection
Tsung-Yi Lin
Priya Goyal
Ross B. Girshick
Kaiming He
Piotr Dollár
ObjD
127
2,998
0
07 Aug 2017
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
360
27,244
0
20 Mar 2017