Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

International Conference on Machine Learning (ICML), 2023
1 June 2023
Chaitanya K. Ryali
Yuan-Ting Hu
Daniel Bolya
Chen Wei
Haoqi Fan
Po-Yao (Bernie) Huang
Vaibhav Aggarwal
Arkabandhu Chowdhury
Omid Poursaeed
Judy Hoffman
Jitendra Malik
Yanghao Li
Christoph Feichtenhofer
    3DH

Papers citing "Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles"

50 / 171 papers shown
SatSAM2: Motion-Constrained Video Object Tracking in Satellite Imagery using Promptable SAM2 and Kalman Priors
Ruijie Fan
Junyan Ye
Huan Chen
Z. Huang
Xiaolei Wang
Weijia Li
56
0
0
23 Nov 2025
Tracking and Segmenting Anything in Any Modality
Tianlu Zhang
Qiang Zhang
Guiguang Ding
Jungong Han
VOS
145
0
0
22 Nov 2025
Segment Anything Across Shots: A Method and Benchmark
Hengrui Hu
Kaining Ying
Henghui Ding
VOS
169
0
0
17 Nov 2025
Automatic Music Mixing using a Generative Model of Effect Embeddings
Eloi Moliner
Marco A. Martínez-Ramírez
Junghyun Koo
Wei-Hsiang Liao
K. Cheuk
Joan Serrà
Vesa Valimaki
Yuki Mitsufuji
DiffM
106
0
0
11 Nov 2025
CoMA: Complementary Masking and Hierarchical Dynamic Multi-Window Self-Attention in a Unified Pre-training Framework
Jiaxuan Li
Qing Xu
Xiangjian He
Ziyu Liu
Chang Xing
Zhen Chen
Daokun Zhang
Rong Qu
Chang Wen Chen
56
0
0
08 Nov 2025
Towards Better Ultrasound Video Segmentation Foundation Model: An Empirical study on SAM2 Finetuning from Data Perspective
Xing Yao
Ahana Gangopadhyay
Hsi-Ming Chang
Ravi Soni
92
0
0
07 Nov 2025
Alias-Free ViT: Fractional Shift Invariance via Linear Attention
H. Michaeli
Daniel Soudry
72
0
0
26 Oct 2025
EMA-SAM: Exponential Moving-average for SAM-based PTMC Segmentation
Maryam Dialameh
Hossein Rajabzadeh
Jung Suk Sim
Hyock Ju Kwon
84
0
0
21 Oct 2025
Automated urban waterlogging assessment and early warning through a mixture of foundation models
Chenxu Zhang
Fuxiang Huang
Lei Zhang
100
0
0
21 Oct 2025
How Universal Are SAM2 Features?
Masoud Khairi Atani
Alon Harell
Hyomin Choi
Runyu Yang
Fabien Racapé
Ivan V. Bajić
VLM
52
0
0
19 Oct 2025
StretchySnake: Flexible SSM Training Unlocks Action Recognition Across Spatio-Temporal Scales
Nyle Siddiqui
Rohit Gupta
S. Swetha
Mubarak Shah
108
0
0
17 Oct 2025
LSVOS 2025 Challenge Report: Recent Advances in Complex Video Object Segmentation
Chang Liu
Henghui Ding
Kaining Ying
Lingyi Hong
N. Xu
...
Alexey Nekrasov
Ali Athar
Daan de Geus
Alexander Hermans
Bastian Leibe
VOS
70
1
0
13 Oct 2025
MSM-Seg: A Modality-and-Slice Memory Framework with Category-Agnostic Prompting for Multi-Modal Brain Tumor Segmentation
Yuxiang Luo
Qing Xu
Hai Huang
Yuqi Ouyang
Zhen Chen
Wenting Duan
92
0
0
12 Oct 2025
SPEGNet: Synergistic Perception-Guided Network for Camouflaged Object Detection
Baber Jan
Saeed Anwar
Aiman El-Maleh
Abdul Jabbar Siddiqui
Abdul Bais
60
0
0
06 Oct 2025
EmbodiSwap for Zero-Shot Robot Imitation Learning
Eadom Dessalene
P. Mantripragada
Michael Maynord
Yiannis Aloimonos
LM&Ro
80
0
0
04 Oct 2025
CubistMerge: Spatial-Preserving Token Merging For Diverse ViT Backbones
Wenyi Gong
Mieszko Lis
83
0
0
26 Sep 2025
SAMSON: 3rd Place Solution of LSVOS 2025 VOS Challenge
Yujie Xie
Hongyang Zhang
Zhihui Liu
Shihai Ruan
60
0
0
22 Sep 2025
UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning
Ye Liu
Zongyang Ma
Junfu Pu
Zhongang Qi
Yang Wu
Mingyu Ding
Chang Wen Chen
MLLM
ObjD
LRM
231
1
0
22 Sep 2025
RangeSAM: On the Potential of Visual Foundation Models for Range-View represented LiDAR segmentation
Paul Julius Kühn
Duc Anh Nguyen
Arjan Kuijper
Holger Graf
Dieter W. Fellner
3DPC
169
0
0
19 Sep 2025
Enriched Feature Representation and Motion Prediction Module for MOSEv2 Track of 7th LSVOS Challenge: 3rd Place Solution
Chang Soo Lim
Joonyoung Moon
Donghyeon Cho
56
0
0
19 Sep 2025
Road Obstacle Video Segmentation
Shyam Nandan Rai
Shyamgopal Karthik
Mariana-Iuliana Georgescu
Barbara Caputo
Carlo Masone
Zeynep Akata
VOS
145
0
0
16 Sep 2025
FS-SAM2: Adapting Segment Anything Model 2 for Few-Shot Semantic Segmentation via Low-Rank Adaptation
Bernardo Forni
Gabriele Lombardi
Federico Pozzi
Mirco Planamente
VLM
72
0
0
15 Sep 2025
Improving Video Diffusion Transformer Training by Multi-Feature Fusion and Alignment from Self-Supervised Vision Encoders
Dohun Lee
Hyeonho Jeong
Jiwook Kim
Duygu Ceylan
J. C. Ye
VGen
81
0
0
11 Sep 2025
Co-Seg: Mutual Prompt-Guided Collaborative Learning for Tissue and Nuclei Segmentation
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Qing Xu
Wenting Duan
Zhen Chen
60
2
0
08 Sep 2025
FreeVPS: Repurposing Training-Free SAM2 for Generalizable Video Polyp Segmentation
Qiang Hu
Ying Zhou
Gepeng Ji
Nick Barnes
Qiang Li
Zhiwei Wang
60
0
0
27 Aug 2025
NAT: Learning to Attack Neurons for Enhanced Adversarial Transferability
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025
Krishna Kanth Nakka
Alexandre Alahi
AAML
86
2
0
23 Aug 2025
Temporal Grounding as a Learning Signal for Referring Video Object Segmentation
Seunghun Lee
Jiwan Seo
Jeonghoon Kim
S. Kim
Siwon Kim
...
Wonhyeok Choi
Jaehoon Jeong
Zane Durante
Sang Hyun Park
Sunghoon Im
VOS
128
0
0
16 Aug 2025
Privacy-enhancing Sclera Segmentation Benchmarking Competition: SSBC 2025
Matej Vitek
Darian Tomašević
Abhijit Das
Sabari Nathan
Gökhan Özbulak
...
Raghavendra Ramachandra
Aditya Nigam
Umapada Pal
Peter Peer
Vitomir Štruc
80
0
0
14 Aug 2025
SAM2-UNeXT: An Improved High-Resolution Baseline for Adapting Foundation Models to Downstream Segmentation Tasks
Xinyu Xiong
Zihuang Wu
L. Zhang
Lei Lu
Ming-hui Li
Guanbin Li
84
1
0
05 Aug 2025
H3R: Hybrid Multi-view Correspondence for Generalizable 3D Reconstruction
Heng Jia
Linchao Zhu
Na Zhao
3DGS
114
0
0
05 Aug 2025
COSTARR: Consolidated Open Set Technique with Attenuation for Robust Recognition
Ryan Rabinowitz
Steve Cruz
Walter J. Scheirer
Terrance Boult
BDL
94
0
0
01 Aug 2025
IAMAP: Unlocking Deep Learning in QGIS for non-coders and limited computing resources
Paul Tresson
Pierre Le Coz
Hadrien Tulet
Anthony Malkassian
Maxime Réjou Méchain
VLM
44
0
0
01 Aug 2025
Towards Blind Bitstream-corrupted Video Recovery via a Visual Foundation Model-driven Framework
Tianyi Liu
Kejun Wu
Chen Cai
Yi Wang
Kim-Hui Yap
Lap-Pui Chau
82
0
0
30 Jul 2025
Object-centric Video Question Answering with Visual Grounding and Referring
Haochen Wang
Qirui Chen
Cilin Yan
Jiayin Cai
Xiaolong Jiang
Yao Hu
Weidi Xie
Stratis Gavves
MLLM
VOS
164
3
0
25 Jul 2025
Benchmarking pig detection and tracking under diverse and challenging conditions
Jonathan Henrich
Christian Post
Maximilian Zilke
Parth Shiroya
Emma Chanut
Amir Mollazadeh Yamchi
Ramin Yahyapour
Thomas Kneib
Imke Traulsen
144
0
0
22 Jul 2025
HiM2SAM: Enhancing SAM2 with Hierarchical Motion Estimation and Memory Optimization towards Long-term Tracking
Ruixiang Chen
Guolei Sun
Yawei Li
Jie Qin
Luca Benini
233
1
0
10 Jul 2025
OpenWorldSAM: Extending SAM2 for Universal Image Segmentation with Language Prompts
Shiting Xiao
Rishabh Kabra
Yuhang Li
Donghyun Lee
João Carreira
Priyadarshini Panda
VLM
147
0
0
07 Jul 2025
Visual Anagrams Reveal Hidden Differences in Holistic Shape Processing Across Vision Models
Fenil R. Doshi
Thomas Fel
Talia Konkle
George A. Alvarez
130
0
0
01 Jul 2025
PicoSAM2: Low-Latency Segmentation In-Sensor for Edge Vision Applications
Pietro Bonazzi
Nicola Farronato
Stefan Zihlmann
Haotong Qin
Michele Magno
VLM
143
2
0
23 Jun 2025
Seeing What Matters: Generalizable AI-generated Video Detection with Forensic-Oriented Augmentation
Riccardo Corvi
D. Cozzolino
Ekta Prashnani
Shalini De Mello
Koki Nagano
L. Verdoliva
ViT
276
1
0
20 Jun 2025
Co-Seg++: Mutual Prompt-Guided Collaborative Learning for Versatile Medical Segmentation
Qing Xu
Yuxiang Luo
Wenting Duan
Zhen Chen
170
3
0
20 Jun 2025
Action Dubber: Timing Audible Actions via Inflectional Flow
Wenlong Wan
Weiying Zheng
Tianyi Xiang
Guiqing Li
Shengfeng He
129
0
0
16 Jun 2025
Fine-Grained Spatially Varying Material Selection in Images
Julia Guerrero-Viu
Michael Fischer
Iliyan Georgiev
Elena Garces
Diego F. F. Gutierrez
B. Masiá
Valentin Deschaintre
147
0
0
10 Jun 2025
ReStNet: A Reusable & Stitchable Network for Dynamic Adaptation on IoT Devices
Maoyu Wang
Yao Lu
Jiaqi Nie
Zeyu Wang
Yun Lin
Qi Xuan
Guan Gui
127
0
0
08 Jun 2025
THU-Warwick Submission for EPIC-KITCHEN Challenge 2025: Semi-Supervised Video Object Segmentation
Mingqi Gao
Haoran Duan
Tianlu Zhang
Jungong Han
90
0
0
07 Jun 2025
Zero-Shot Tree Detection and Segmentation from Aerial Forest Imagery
Michelle Chen
David Russell
Amritha Pallavoor
Derek Young
Jane Wu
VLM
158
2
0
03 Jun 2025
Go Beyond Earth: Understanding Human Actions and Scenes in Microgravity Environments
Di Wen
Lei Qi
Kunyu Peng
Kailun Yang
Fei Teng
...
Yufan Chen
R. Liu
Yitian Shi
M. Sarfraz
Rainer Stiefelhagen
314
0
0
03 Jun 2025
AuralSAM2: Enabling SAM2 Hear Through Pyramid Audio-Visual Feature Prompting
Yuyuan Liu
Yuanhong Chen
Chong Wang
Junlin Han
Junde Wu
Can Peng
Jingkun Chen
Yu Tian
Gustavo Carneiro
VLM
239
0
0
01 Jun 2025
Towards Scalable Language-Image Pre-training for 3D Medical Imaging
Chenhui Zhao
Yiwei Lyu
Asadur Chowdury
Edward Harake
A. Kondepudi
Akshay Rao
X. Hou
Honglak Lee
Rui Feng
MedIm
LM&MA
162
0
0
28 May 2025
SANSA: Unleashing the Hidden Semantics in SAM2 for Few-Shot Segmentation
Claudia Cuttano
Gabriele Trivigno
Giuseppe Averta
Carlo Masone
VLM
183
0
0
27 May 2025