arXiv:2306.00989
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
International Conference on Machine Learning (ICML), 2023
1 June 2023
Chaitanya K. Ryali
Yuan-Ting Hu
Daniel Bolya
Chen Wei
Haoqi Fan
Po-Yao (Bernie) Huang
Vaibhav Aggarwal
Arkabandhu Chowdhury
Omid Poursaeed
Judy Hoffman
Jitendra Malik
Yanghao Li
Christoph Feichtenhofer
3DH
Papers citing "Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles"
50 / 171 papers shown
Title
Spurious Privacy Leakage in Neural Networks
Chenxiang Zhang
Jun Pang
S. Mauw
283
0
0
26 May 2025
C3R: Channel Conditioned Cell Representations for unified evaluation in microscopy imaging
Umar Marikkar
Syed Sameed Husain
Muhammad Awais
Sara Atito
167
0
0
24 May 2025
Auto-nnU-Net: Towards Automated Medical Image Segmentation
Jannis Becktepe
Leona Hennig
Steffen Oeltze-Jafra
Marius Lindauer
447
1
0
22 May 2025
SAMba-UNet: SAM2-Mamba UNet for Cardiac MRI in Medical Robotic Perception
Guohao Huo
Ruiting Dai
Hao Tang
Mamba
256
0
0
22 May 2025
Advancing Marine Research: UWSAM Framework and UIIS10K Dataset for Precise Underwater Instance Segmentation
Hua Li
Shijie Lian
Zhiyuan Li
Runmin Cong
Laurence Tianruo Yang
Weidong Zhang
Sam Kwong
VLM
285
1
0
21 May 2025
Towards Visuospatial Cognition via Hierarchical Fusion of Visual Experts
Qi Feng
LRM
286
5
0
18 May 2025
ReSurgSAM2: Referring Segment Anything in Surgical Video via Credible Long-term Tracking
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Haofeng Liu
Mingqi Gao
Xuxiao Luo
Ziyue Wang
Guanyi Qin
Jinlin Wu
Yueming Jin
207
9
0
13 May 2025
TiMo: Spatiotemporal Foundation Model for Satellite Image Time Series
Xiaolei Qin
Haiyan Zhao
Jing Zhang
Fengxiang Wang
Xin Su
Bo Du
Liangpei Zhang
AI4TS
264
0
0
13 May 2025
H^3DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning
Yiyang Lu
Yufeng Tian
Zhecheng Yuan
Xinyu Wang
Pu Hua
Zhengrong Xue
Huazhe Xu
297
4
0
12 May 2025
ABS-Mamba: SAM2-Driven Bidirectional Spiral Mamba Network for Medical Image Translation
Feng Yuan
Yifan Gao
Wenbin Wu
Keqing Wu
Xiaotong Guo
Jie Jiang
Xin Gao
Mamba
191
2
0
12 May 2025
ORXE: Orchestrating Experts for Dynamically Configurable Efficiency
Qingyuan Wang
Guoxin Wang
B. Cardiff
Deepu John
199
0
0
07 May 2025
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
Wenyuan Xu
Shibiao Xu
ViT
1.0K
2
0
06 May 2025
Corner Cases: How Size and Position of Objects Challenge ImageNet-Trained Models
Mishal Fatima
Steffen Jung
Margret Keuper
191
1
0
06 May 2025
Accelerating Volumetric Medical Image Annotation via Short-Long Memory SAM 2
IEEE Transactions on Medical Imaging (IEEE TMI), 2025
Yuwen Chen
Zafer Yildiz
Qihang Li
Yaqian Chen
Haoyu Dong
Hanxue Gu
Nicholas Konz
Maciej A. Mazurowski
MedIm
VLM
370
1
0
03 May 2025
UniBiomed: A Universal Foundation Model for Grounded Biomedical Image Interpretation
Linshan Wu
Yuxiang Nie
Sunan He
Jiaxin Zhuang
Hao Chen
...
Hao Chen
Ronald Cheong Kin Chan
Yifan Peng
Pranav Rajpurkar
Hao Chen
LM&MA
MedIm
521
2
0
30 Apr 2025
DAM-Net: Domain Adaptation Network with Micro-Labeled Fine-Tuning for Change Detection
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (IEEE J-STARS), 2025
Ningyu Zhang
Xin Xu
Fangling Pu
181
0
0
18 Apr 2025
HSACNet: Hierarchical Scale-Aware Consistency Regularized Semi-Supervised Change Detection
Qiáo Xu
Pengfei Wang
Yanjun Li
Tianwen Qian
Xiaoling Wang
87
0
0
18 Apr 2025
Efficient Masked Image Compression with Position-Indexed Self-Attention
Chengjie Dai
Tiantian Song
Hui Tang
Fangdong Chen
Bowei Yang
Guanghua Song
145
0
0
17 Apr 2025
FocusedAD: Character-centric Movie Audio Description
Xiaojun Ye
C. Wang
Yiren Song
Sheng Zhou
Liangcheng Li
Jiajun Bu
VGen
303
4
0
16 Apr 2025
Multi-scale Activation, Refinement, and Aggregation: Exploring Diverse Cues for Fine-Grained Bird Recognition
Zhenru Zhang
Hao Tang
Jinhui Tang
151
0
0
12 Apr 2025
A Comparison of Deep Learning Methods for Cell Detection in Digital Cytology
Scandinavian Conference on Image Analysis (SCIA), 2025
Marco Acerbis
Natasa Sladoje
Patrick Micke
111
0
0
09 Apr 2025
S^4M: Boosting Semi-Supervised Instance Segmentation with SAM
Heeji Yoon
Heeseong Shin
Eunbeen Hong
Hyunwook Choi
Hansang Cho
Daun Jeong
Seungryong Kim
177
1
0
07 Apr 2025
Agglomerating Large Vision Encoders via Distillation for VFSS Segmentation
Chengxi Zeng
Yuxuan Jiang
Fan Zhang
A. Gambaruto
T. Burghardt
MedIm
217
2
0
03 Apr 2025
IMPACT: A Generic Semantic Loss for Multimodal Medical Image Registration
Valentin Boussot
Cédric Hémon
Jean-Claude Nunes
Jason Dowling
Simon Rouzé
Caroline Lafond
Anaïs Barateau
Jean-Louis Dillenseger
326
2
0
31 Mar 2025
Multi-Task Learning for Extracting Menstrual Characteristics from Clinical Notes
Anna Shopova
Christoph Lippert
Leslee J. Shaw
Eugenia Alleva
226
0
0
31 Mar 2025
Vision-to-Music Generation: A Survey
Zhaokai Wang
Chenxi Bao
Le Zhuo
Jingrui Han
Yang Yue
Yihong Tang
Victor Shea-Jay Huang
Yue Liao
EGVM
VGen
298
3
0
27 Mar 2025
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields
Computer Vision and Pattern Recognition (CVPR), 2025
Shijie Zhou
Hui Ren
Yijia Weng
Shuwang Zhang
Zhen Wang
...
Zhiwen Fan
Suya You
Ziyi Wang
Leonidas Guibas
A. Kadambi
VGen
3DGS
309
5
0
26 Mar 2025
ChA-MAEViT: Unifying Channel-Aware Masked Autoencoders and Multi-Channel Vision Transformers for Improved Cross-Channel Learning
Chau Pham
Juan C. Caicedo
Bryan A. Plummer
197
2
0
25 Mar 2025
RP-SAM2: Refining Point Prompts for Stable Surgical Instrument Segmentation
Nuren Zhaksylyk
Ibrahim Almakky
Jay N. Paranjape
S. Vedula
S. Sikder
Vishal M. Patel
Mohammad Yaqub
246
1
0
25 Mar 2025
CamSAM2: Segment Anything Accurately in Camouflaged Videos
Yuli Zhou
Guolei Sun
Yawei Li
Yuqian Fu
Luca Benini
Ender Konukoglu
259
4
0
25 Mar 2025
Beyond Accuracy: What Matters in Designing Well-Behaved Models?
Robin Hesse
Doğukan Bağcı
Bernt Schiele
Simone Schaub-Meyer
Stefan Roth
VLM
237
0
0
21 Mar 2025
SALT: Parameter-Efficient Fine-Tuning via Singular Value Adaptation with Low-Rank Transformation
Abdelrahman Elsayed
Sarim Hashmi
Mohammed Elseiagy
Hu Wang
Mohammad Yaqub
Ibrahim Almakky
OOD
273
1
0
20 Mar 2025
High Temporal Consistency through Semantic Similarity Propagation in Semi-Supervised Video Semantic Segmentation for Autonomous Flight
Computer Vision and Pattern Recognition (CVPR), 2025
Cédric Vincent
Taehyoung Kim
Henri Meeß
187
2
0
19 Mar 2025
SAM2-ELNet: Label Enhancement and Automatic Annotation for Remote Sensing Segmentation
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (IEEE J-STARS), 2025
Jianhao Yang
Wenshuo Yu
Yuanchao Lv
Jiance Sun
Bokang Sun
Mingyang Liu
202
1
0
16 Mar 2025
Segment Any-Quality Images with Generative Latent Space Enhancement
Computer Vision and Pattern Recognition (CVPR), 2025
Guangqian Guo
Yoong Guo
Xuehui Yu
Wenbo Li
Yaoxing Wang
Shan Gao
VLM
473
0
0
16 Mar 2025
SpaceSeg: A High-Precision Intelligent Perception Segmentation Method for Multi-Spacecraft On-Orbit Targets
Hao Liu
Pengyu Guo
Siyuan Yang
Zeqing Jiang
Qinglei Hu
Dongyu Li
103
1
0
14 Mar 2025
SignRep: Enhancing Self-Supervised Sign Representations
Ryan Wong
Necati Cihan Camgöz
Richard Bowden
SLR
339
3
0
11 Mar 2025
OmniSAM: Omnidirectional Segment Anything Model for UDA in Panoramic Semantic Segmentation
Ding Zhong
Xu Zheng
Chenfei Liao
Yuanhuiyi Lyu
Jialei Chen
Shengyang Wu
Linfeng Zhang
Xuming Hu
VLM
352
17
0
10 Mar 2025
RS2-SAM2: Customized SAM2 for Referring Remote Sensing Image Segmentation
Fu Rong
Meng Lan
Qian Zhang
Guang Dai
397
1
0
10 Mar 2025
MemorySAM: Memorize Modalities and Semantics with Segment Anything Model 2 for Multi-modal Semantic Segmentation
Chenfei Liao
Xu Zheng
Yuanhuiyi Lyu
Haiwei Xue
Yihong Cao
Jiawen Wang
Kailun Yang
Xuming Hu
VLM
389
10
0
09 Mar 2025
Boltzmann Attention Sampling for Image Analysis with Small Objects
Computer Vision and Pattern Recognition (CVPR), 2025
Theodore Zhao
Sid Kiblawi
Naoto Usuyama
Ho Hin Lee
Sam Preston
Hoifung Poon
Mu-Hsin Wei
MedIm
372
1
0
04 Mar 2025
Unveiling the Potential of Segment Anything Model 2 for RGB-Thermal Semantic Segmentation with Language Guidance
Jiayi Zhao
Fei Teng
Kai Luo
Guoqiang Zhao
Hui Yuan
Xu Zheng
Kailun Yang
VLM
310
9
0
04 Mar 2025
Enhancing 3D Gaze Estimation in the Wild using Weak Supervision with Gaze Following Labels
Computer Vision and Pattern Recognition (CVPR), 2025
Pierre Vuillecard
J. Odobez
164
3
0
27 Feb 2025
SentiFormer: Metadata Enhanced Transformer for Image Sentiment Analysis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Bin Feng
Shulan Ruan
Mingzheng Yang
Dongxuan Han
Huijie Liu
Kai Zhang
Qi Liu
ViT
161
1
0
24 Feb 2025
Thicker and Quicker: A Jumbo Token for Fast Plain Vision Transformers
A. Fuller
Yousef Yassin
Daniel G. Kyrollos
Evan Shelhamer
James R. Green
375
1
0
20 Feb 2025
Spectral-factorized Positive-definite Curvature Learning for NN Training
Wu Lin
Felix Dangel
Runa Eschenhagen
Juhan Bae
Richard E. Turner
Roger B. Grosse
441
0
0
10 Feb 2025
Particle Trajectory Representation Learning with Masked Point Modeling
Sam Young
Yeon-jae Jwa
Kazuhiro Terao
3DPC
287
3
0
04 Feb 2025
Few-Shot Adaptation of Training-Free Foundation Model for 3D Medical Image Segmentation
Xingxin He
Yifan Hu
Zhaoye Zhou
Mohamed Jarraya
Fang Liu
VLM
MedIm
244
5
0
17 Jan 2025
EdgeTAM: On-Device Track Anything Model
Computer Vision and Pattern Recognition (CVPR), 2025
Chong Zhou
Chenchen Zhu
Yunyang Xiong
Saksham Suri
Fanyi Xiao
...
Raghuraman Krishnamoorthi
Bo Dai
Chen Change Loy
Vikas Chandra
Bilge Soran
VLM
268
6
0
13 Jan 2025
Finer: Investigating and Enhancing Fine-Grained Visual Concept Recognition in Large Vision Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Jeonghwan Kim
Heng Ji
MLLM
202
4
0
08 Jan 2025