2304.02643
Segment Anything
5 April 2023
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
Laura Gustafson
Tete Xiao
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
Papers citing
"Segment Anything"
50 / 1,373 papers shown
Title
Can Foundation Models Really Segment Tumors? A Benchmarking Odyssey in Lung CT Imaging
Elena Mulero Ayllón
Massimiliano Mantegna
Linlin Shen
Paolo Soda
V. Guarrasi
M. Tortora
87
0
0
02 May 2025
Visual Test-time Scaling for GUI Agent Grounding
Tiange Luo
Lajanugen Logeswaran
Justin Johnson
Honglak Lee
138
0
0
01 May 2025
Cues3D: Unleashing the Power of Sole NeRF for Consistent and Unique Instances in Open-Vocabulary 3D Panoptic Segmentation
Feng Xue
Wenzhuang Xu
Guofeng Zhong
Anlong Ming
N. Sebe
138
0
0
01 May 2025
SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models
Wufei Ma
Luoxin Ye
Nessa McWeeney
Celso M de Melo
Jieneng Chen
LRM
164
1
0
01 May 2025
InstructAttribute: Fine-grained Object Attributes editing with Instruction
Xingxi Yin
Jingfeng Zhang
Zhi Li
You Li
Yanzhe Zhang
Yin Zhang
DiffM
488
1
0
01 May 2025
Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models
Guanghao Zhou
Panjia Qiu
Chong Chen
Jiadong Wang
Zheming Yang
Jian Xu
Minghui Qiu
OffRL
LRM
222
8
0
30 Apr 2025
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction
Qihao Liu
Ju He
Qihang Yu
Liang-Chieh Chen
Alan Yuille
DiffM
VGen
189
1
0
30 Apr 2025
Detecting and Mitigating Hateful Content in Multimodal Memes with Vision-Language Models
Minh-Hao Van
Xintao Wu
VLM
170
0
0
30 Apr 2025
Mcity Data Engine: Iterative Model Improvement Through Open-Vocabulary Data Selection
Daniel Bogdoll
Rajanikant Ananta
Abeyankar Giridharan
Isabel Moore
Gregory Stevens
Henry X. Liu
VLM
123
0
0
30 Apr 2025
Adapting In-Domain Few-Shot Segmentation to New Domains without Retraining
Qi Fan
Kaiqi Liu
Nian Liu
Hisham Cholakkal
Rao Muhammad Anwer
Wenbin Li
Yang Gao
247
0
0
30 Apr 2025
UniBiomed: A Universal Foundation Model for Grounded Biomedical Image Interpretation
Linshan Wu
Yuxiang Nie
Sunan He
Jiaxin Zhuang
Hao Chen
...
V. Vardhanabhuti
R. Chan
Yifan Peng
Pranav Rajpurkar
Hao Chen
LM&MA
MedIm
215
0
0
30 Apr 2025
XPG-RL: Reinforcement Learning with Explainable Priority Guidance for Efficiency-Boosted Mechanical Search
Yiting Zhang
Shichen Li
Elena Shrestha
104
1
0
29 Apr 2025
Style-Adaptive Detection Transformer for Single-Source Domain Generalized Object Detection
Jianhong Han
Yupei Wang
Liang Chen
ViT
112
0
0
29 Apr 2025
Do You Know the Way? Human-in-the-Loop Understanding for Fast Traversability Estimation in Mobile Robotics
Andre Schreiber
Katherine Rose Driggs-Campbell
486
0
0
28 Apr 2025
QuickGrasp: Lightweight Antipodal Grasp Planning with Point Clouds
Navin Sriram Ravie
Keerthi Vasan M
Asokan Thondiyath
Bijo Sebastian
125
0
0
28 Apr 2025
Pixels2Points: Fusing 2D and 3D Features for Facial Skin Segmentation
Victoria Yue Chen
Daoye Wang
Stephan Garbin
Jan Bednarík
Sebastian Winberg
Timo Bolkart
Thabo Beeler
3DH
3DPC
143
0
0
28 Apr 2025
Lightweight Adapter Learning for More Generalized Remote Sensing Change Detection
Dou Quan
Rufan Zhou
Shuang Wang
Ning Huyan
Dong Zhao
Yunan Li
L. Jiao
143
0
0
28 Apr 2025
TransparentGS: Fast Inverse Rendering of Transparent Objects with Gaussians
Letian Huang
Dongwei Ye
Jialin Dan
Chengzhi Tao
Huiwen Liu
Kun Zhou
Bo Ren
You Li
Yanwen Guo
Jie Guo
158
1
0
26 Apr 2025
RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning
Haoran Geng
Feishi Wang
Songlin Wei
Yuchen Li
Bangjun Wang
...
Hao Dong
Siyuan Huang
Yue Wang
Jitendra Malik
Pieter Abbeel
232
8
0
26 Apr 2025
HRScene: How Far Are VLMs from Effective High-Resolution Image Understanding?
Yusen Zhang
Wenliang Zheng
Aashrith Madasu
Peng Shi
Ryo Kamoi
...
Ranran Haoran Zhang
Avitej Iyer
Renze Lou
Wenpeng Yin
Rui Zhang
339
0
0
25 Apr 2025
Unsupervised Visual Chain-of-Thought Reasoning via Preference Optimization
Kesen Zhao
B. Zhu
Qianru Sun
Hanwang Zhang
MLLM
LRM
175
1
0
25 Apr 2025
Object Pose Estimation by Camera Arm Control Based on the Next Viewpoint Estimation
Tomoki Mizuno
Kazuya Yabashi
Tsuyoshi Tasaki
81
0
0
24 Apr 2025
RGB-D Video Object Segmentation via Enhanced Multi-store Feature Memory
Boyue Xu
Ruichao Hou
Tongwei Ren
Gangshan Wu
VOS
159
1
0
23 Apr 2025
Prompt-Tuning SAM: From Generalist to Specialist with only 2048 Parameters and 16 Training Images
Tristan Piater
Björn Barz
Alexander Freytag
VLM
MedIm
146
0
0
23 Apr 2025
DINOv2-powered Few-Shot Semantic Segmentation: A Unified Framework via Cross-Model Distillation and 4D Correlation Mining
Wei Zhuo
Zhiyue Tang
Wufeng Xue
Hao Ding
Linlin Shen
128
0
0
22 Apr 2025
FreeGraftor: Training-Free Cross-Image Feature Grafting for Subject-Driven Text-to-Image Generation
Zebin Yao
Lujie Niu
Huixing Jiang
Chen Wei
Fangkun Zhao
Ruifan Li
Fangxiang Feng
DiffM
195
0
0
22 Apr 2025
Context Aware Grounded Teacher for Source Free Object Detection
Tajamul Ashraf
Rajes Manna
Partha Sarathi Purkayastha
Tavaheed Tariq
Janibul Bashir
124
0
0
21 Apr 2025
DSPO: Direct Semantic Preference Optimization for Real-World Image Super-Resolution
Miaomiao Cai
Simiao Li
Wei Li
X. Y. Huang
Hanting Chen
Jie Hu
Yunhe Wang
87
1
0
21 Apr 2025
DRAWER: Digital Reconstruction and Articulation With Environment Realism
Hongchi Xia
Entong Su
Marius Memmel
Arhan Jain
Raymond Yu
Numfor Mbiziwo-Tiapo
Ali Farhadi
Abhishek Gupta
Shenlong Wang
Wei-Chiu Ma
VGen
126
1
0
21 Apr 2025
Object-Level Verbalized Confidence Calibration in Vision-Language Models via Semantic Perturbation
Yunpu Zhao
Rui Zhang
Junbin Xiao
Ruibo Hou
Jiaming Guo
Zihao Zhang
Yifan Hao
Yunji Chen
91
1
0
21 Apr 2025
AGI-Driven Generative Semantic Communications: Principles and Practices
Xiaojun Yuan
Haoming Ma
Yinuo Huang
Zhoufan Hua
Yong Zuo
Z. Ding
AI4CE
93
0
0
21 Apr 2025
SG-Reg: Generalizable and Efficient Scene Graph Registration
Chuhao Liu
Zhijian Qiao
Jieqi Shi
Ke Wang
Peize Liu
Shaojie Shen
140
0
0
20 Apr 2025
LGD: Leveraging Generative Descriptions for Zero-Shot Referring Image Segmentation
Jiachen Li
Qing Xie
Xiaohan Yu
Hongyun Wang
Jinyu Xu
Yongjian Liu
ObjD
178
0
0
20 Apr 2025
Exploring Modality Guidance to Enhance VFM-based Feature Fusion for UDA in 3D Semantic Segmentation
Johannes Spoecklberger
W. Lin
Pedro Hermosilla
Sivan Doveh
Horst Possegger
M. Jehanzeb Mirza
92
0
0
19 Apr 2025
Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D
Sergio Arnaud
Paul Mcvay
Ada Martin
Arjun Majumdar
Krishna Murthy Jatavallabhula
...
Nicolas Ballas
Mido Assran
Oleksandr Maksymets
Aravind Rajeswaran
Franziska Meier
3DPC
96
2
0
19 Apr 2025
LMPOcc: 3D Semantic Occupancy Prediction Utilizing Long-Term Memory Prior from Historical Traversals
Shanshuai Yuan
Julong Wei
Muer Tie
Xiangyun Ren
Zhongxue Gan
Wenchao Ding
103
0
0
18 Apr 2025
Beyond One-Hot Labels: Semantic Mixing for Model Calibration
Haoyang Luo
Linwei Tao
Minjing Dong
Chang Xu
160
0
0
18 Apr 2025
CM3AE: A Unified RGB Frame and Event-Voxel/-Frame Pre-training Framework
Wentao Wu
Xinyu Wang
Chenglong Li
Bo Jiang
Jin Tang
Bin Luo
Qi Liu
117
0
0
17 Apr 2025
Putting the Segment Anything Model to the Test with 3D Knee MRI - A Comparison with State-of-the-Art Performance
Oliver Mills
Philip G. Conaghan
Nishant Ravikumar
Samuel D. Relton
MedIm
136
0
0
17 Apr 2025
Perception Encoder: The best visual embeddings are not at the output of the network
Daniel Bolya
Po-Yao (Bernie) Huang
Peize Sun
Jang Hyun Cho
Andrea Madotto
...
Shiyu Dong
Nikhila Ravi
Daniel Li
Piotr Dollár
Christoph Feichtenhofer
ObjD
VOS
340
9
0
17 Apr 2025
Stronger, Steadier & Superior: Geometric Consistency in Depth VFM Forges Domain Generalized Semantic Segmentation
Siyu Chen
Ting Han
Changshe Zhang
Xin Luo
Meiliu Wu
Guorong Cai
Jinhe Su
MDE
135
1
0
17 Apr 2025
Privacy-Preserving Operating Room Workflow Analysis using Digital Twins
Alejandra Perez
Han-shen Zhang
Yu-Chun Ku
Lalithkumar Seenivasan
Roger Soberanis
Jose L. Porras
Richard Day
Jeff Jopling
Peter Najjar
Mathias Unberath
80
0
0
17 Apr 2025
Mask Image Watermarking
Runyi Hu
Jie Zhang
Shiqian Zhao
Nils Lukas
Jiwei Li
Qing Guo
Han Qiu
Tianwei Zhang
141
1
0
17 Apr 2025
GeoSense: Evaluating Identification and Application of Geometric Principles in Multimodal Reasoning
Liangyu Xu
Yingxiu Zhao
Jiadong Wang
Yingyao Wang
Bu Pi
...
Jihao Gu
Xinfeng Li
Xiaoyong Zhu
Jun Song
Jian Xu
LRM
525
6
0
17 Apr 2025
Post-Hurricane Debris Segmentation Using Fine-Tuned Foundational Vision Models
Kooshan Amini
Yuhao Liu
Jamie Ellen Padgett
Guha Balakrishnan
Ashok Veeraraghavan
91
0
0
17 Apr 2025
Zooming In on Fakes: A Novel Dataset for Localized AI-Generated Image Detection with Forgery Amplification Approach
Lvpan Cai
Haowei Wang
Jiayi Ji
YanShu ZhouMen
Yiwei Ma
Xiaoshuai Sun
Liujuan Cao
Rongrong Ji
ViT
100
1
0
16 Apr 2025
Real-World Depth Recovery via Structure Uncertainty Modeling and Inaccurate GT Depth Fitting
Delong Suzhang
Meng Yang
58
0
0
16 Apr 2025
MediSee: Reasoning-based Pixel-level Perception in Medical Images
Qinyue Tong
Ziqian Lu
Jun Liu
Yangming Zheng
Zheming Lu
LRM
151
0
0
15 Apr 2025
FACT: Foundation Model for Assessing Cancer Tissue Margins with Mass Spectrometry
Mohammad Farahmand
A. Jamzad
Fahimeh Fooladgar
Laura Connolly
Martin Kaufmann
Kevin Yi Mi Ren
John Rudan
Doug McKay
Gabor Fichtinger
P. Mousavi
113
0
0
15 Apr 2025
Reimagining Urban Science: Scaling Causal Inference with Large Language Models
Yutong Xia
Ao Qu
Yunhan Zheng
Yihong Tang
Dingyi Zhuang
...
Cathy Wu
Roger Zimmermann
Lijun Sun
Jinhua Zhao
AI4CE
415
2
0
15 Apr 2025