ResearchTrend.AI
Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding

26 May 2021
Zizhao Zhang
Han Zhang
Long Zhao
Ting Chen
Sercan Ö. Arik
Tomas Pfister
    ViT

Papers citing "Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding"

43 / 43 papers shown
Buffer-based Gradient Projection for Continual Federated Learning
Shenghong Dai
Jy-yong Sohn
Yicong Chen
S. Alam
Ravikumar Balakrishnan
Suman Banerjee
N. Himayat
Kangwook Lee
FedML
75
2
0
03 Sep 2024
Federated Class-Incremental Learning with Prompting
Jiale Liu
Yu-Wei Zhan
Chong-Yu Zhang
Xin Luo
Zhen-Duo Chen
Yinwei Wei
CLL
FedML
29
2
0
13 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
34
4
0
10 Oct 2023
Distributionally Robust Classification on a Data Budget
Ben Feuer
Ameya Joshi
Minh Pham
C. Hegde
OOD
37
2
0
07 Aug 2023
FIT: Far-reaching Interleaved Transformers
Ting-Li Chen
Lala Li
29
12
0
22 May 2023
CageViT: Convolutional Activation Guided Efficient Vision Transformer
Hao Zheng
Jinbao Wang
Xiantong Zhen
Hao Chen
Jingkuan Song
Feng Zheng
ViT
20
0
0
17 May 2023
DeDA: Deep Directed Accumulator
Hang Zhang
Rongguang Wang
Renjiu Hu
Jinwei Zhang
Jiahao Nick Li
MedIm
27
4
0
15 Mar 2023
Out of Distribution Performance of State of Art Vision Model
Salman Rahman
W. Lee
40
2
0
25 Jan 2023
Semi-Structured Object Sequence Encoders
V. Rudramurthy
Riyaz Ahmad Bhat
Chulaka Gunasekara
Siva Sankalp Patel
H. Wan
Tejas I. Dhamecha
Danish Contractor
Marina Danilevsky
61
0
0
03 Jan 2023
Exploring Vision Transformers as Diffusion Learners
He Cao
Jianan Wang
Tianhe Ren
Xianbiao Qi
Yihao Chen
Yuan Yao
Lefei Zhang
44
10
0
28 Dec 2022
Rethinking Vision Transformers for MobileNet Size and Speed
Yanyu Li
Ju Hu
Yang Wen
Georgios Evangelidis
Kamyar Salahi
Yanzhi Wang
Sergey Tulyakov
Jian Ren
ViT
35
159
0
15 Dec 2022
Comparing the Decision-Making Mechanisms by Transformers and CNNs via Explanation Methods
Ming-Xiu Jiang
Saeed Khorram
Li Fuxin
FAtt
22
9
0
13 Dec 2022
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
Chenhongyi Yang
Jiarui Xu
Shalini De Mello
Elliot J. Crowley
Xinyu Wang
ViT
38
21
0
13 Dec 2022
Position Embedding Needs an Independent Layer Normalization
Runyi Yu
Zhennan Wang
Yinhuai Wang
Kehan Li
Yian Zhao
Jian Zhang
Guoli Song
Jie Chen
31
1
0
10 Dec 2022
VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning
Kashu Yamazaki
Khoa T. Vo
Sang Truong
Bhiksha Raj
Ngan Le
29
35
0
28 Nov 2022
Degenerate Swin to Win: Plain Window-based Transformer without Sophisticated Operations
Tan Yu
Ping Li
ViT
46
5
0
25 Nov 2022
NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction
Yun Yi
Haokui Zhang
Wenze Hu
Nannan Wang
Xiaoyu Wang
AI4TS
AI4CE
32
8
0
15 Nov 2022
BiViT: Extremely Compressed Binary Vision Transformer
Yefei He
Zhenyu Lou
Luoming Zhang
Jing Liu
Weijia Wu
Hong Zhou
Bohan Zhuang
ViT
MQ
20
28
0
14 Nov 2022
ViT-CX: Causal Explanation of Vision Transformers
Weiyan Xie
Xiao-hui Li
Caleb Chen Cao
Nevin L. Zhang
ViT
29
17
0
06 Nov 2022
Explicitly Increasing Input Information Density for Vision Transformers on Small Datasets
Xiangyu Chen
Ying Qin
Wenju Xu
A. Bur
Cuncong Zhong
Guanghui Wang
ViT
46
3
0
25 Oct 2022
TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers
Hyeong Kyu Choi
Joonmyung Choi
Hyunwoo J. Kim
ViT
31
35
0
14 Oct 2022
A Generalist Framework for Panoptic Segmentation of Images and Videos
Ting-Li Chen
Lala Li
Saurabh Saxena
Geoffrey E. Hinton
David J. Fleet
VGen
MLLM
43
102
0
12 Oct 2022
Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets
Zhiying Lu
Hongtao Xie
Chuanbin Liu
Yongdong Zhang
ViT
28
57
0
12 Oct 2022
The Lie Derivative for Measuring Learned Equivariance
Nate Gruver
Marc Finzi
Micah Goldblum
A. Wilson
18
34
0
06 Oct 2022
Axially Expanded Windows for Local-Global Interaction in Vision Transformers
Zhemin Zhang
Xun Gong
ViT
18
1
0
19 Sep 2022
Locality Guidance for Improving Vision Transformers on Tiny Datasets
Kehan Li
Runyi Yu
Zhennan Wang
Li-ming Yuan
Guoli Song
Jie Chen
ViT
32
44
0
20 Jul 2022
EfficientFormer: Vision Transformers at MobileNet Speed
Yanyu Li
Geng Yuan
Yang Wen
Eric Hu
Georgios Evangelidis
Sergey Tulyakov
Yanzhi Wang
Jian Ren
ViT
23
347
0
02 Jun 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li
Wenhai Wang
Lingfeng Yang
Jian Yang
116
73
0
20 May 2022
Deeper Insights into the Robustness of ViTs towards Common Corruptions
Rui Tian
Zuxuan Wu
Qi Dai
Han Hu
Yu-Gang Jiang
ViT
AAML
21
4
0
26 Apr 2022
Searching Intrinsic Dimensions of Vision Transformers
Fanghui Xue
Biao Yang
Y. Qi
Jack Xin
ViT
38
2
0
16 Apr 2022
DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning
Zifeng Wang
Zizhao Zhang
Sayna Ebrahimi
Ruoxi Sun
Han Zhang
...
Xiaoqi Ren
Guolong Su
Vincent Perot
Jennifer Dy
Tomas Pfister
CLL
VLM
VPVLM
36
460
0
10 Apr 2022
Characterizing Renal Structures with 3D Block Aggregate Transformers
Xin Yu
Yucheng Tang
Yinchi Zhou
Riqiang Gao
Qi Yang
...
Yuankai Huo
Zhoubing Xu
Thomas A. Lasko
R. Abramson
Bennett A. Landman
MedIm
ViT
32
3
0
04 Mar 2022
BOAT: Bilateral Local Attention Vision Transformer
Tan Yu
Gangming Zhao
Ping Li
Yizhou Yu
ViT
33
27
0
31 Jan 2022
Aggregating Global Features into Local Vision Transformer
Krushi Patel
A. Bur
Fengju Li
Guanghui Wang
ViT
33
34
0
30 Jan 2022
Learning to Prompt for Continual Learning
Zifeng Wang
Zizhao Zhang
Chen-Yu Lee
Han Zhang
Ruoxi Sun
Xiaoqi Ren
Guolong Su
Vincent Perot
Jennifer Dy
Tomas Pfister
CLL
VPVLM
KELM
VLM
22
738
0
16 Dec 2021
Visformer: The Vision-friendly Transformer
Zhengsu Chen
Lingxi Xie
Jianwei Niu
Xuefeng Liu
Longhui Wei
Qi Tian
ViT
120
209
0
26 Apr 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Huayu Chen
Boqing Gong
ViT
251
577
0
22 Apr 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
289
1,524
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
307
3,625
0
24 Feb 2021
LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello
281
179
0
17 Feb 2021
Video Transformer Network
Daniel Neimark
Omri Bar
Maya Zohar
Dotan Asselmann
ViT
204
422
0
01 Feb 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Nayeon Lee
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
290
980
0
27 Jan 2021
Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network
Wenzhe Shi
Jose Caballero
Ferenc Huszár
J. Totz
Andrew P. Aitken
Rob Bishop
Daniel Rueckert
Zehan Wang
SupR
198
5,176
0
16 Sep 2016