2304.02643
Cited By
Segment Anything
5 April 2023
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
Laura Gustafson
Tete Xiao
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
Papers citing
"Segment Anything"
50 / 1,373 papers shown
Title
Shape and Texture Recognition in Large Vision-Language Models
Sagi Eppel
Mor Bismut
Alona Faktor
3DV
VLM
114
2
0
29 Mar 2025
ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation
Yunhong Min
Daehyeon Choi
Kyeongmin Yeo
Jihyun Lee
Minhyuk Sung
118
0
0
28 Mar 2025
TranSplat: Lighting-Consistent Cross-Scene Object Transfer with 3D Gaussian Splatting
Boyang
Yanlin Jin
Ashok Veeraraghavan
Akshat Dave
Guha Balakrishnan
3DGS
135
0
0
28 Mar 2025
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis
J. Huang
Baoxiong Jia
Yansen Wang
Ziyu Zhu
Xiongkun Linghu
Qing Li
Song-Chun Zhu
Siyuan Huang
191
5
0
28 Mar 2025
Deep Depth Estimation from Thermal Image: Dataset, Benchmark, and Challenges
Ukcheol Shin
Jinsun Park
3DV
MDE
88
0
0
28 Mar 2025
A Unified Image-Dense Annotation Generation Model for Underwater Scenes
Hongkai Lin
Dingkang Liang
Zhenghao Qi
X. Bai
DiffM
92
0
0
27 Mar 2025
Foveated Instance Segmentation
Hongyi Zeng
Wenxuan Liu
Tianhua Xia
Jintai Chen
Ziyun Li
Sai Qian Zhang
ISeg
137
0
0
27 Mar 2025
Context-Aware Weakly Supervised Image Manipulation Localization with SAM Refinement
Xinghao Wang
Changtao Miao
Dianmo Sheng
Tao Gong
Qi Chu
135
0
0
26 Mar 2025
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields
Shijie Zhou
Hui Ren
Yijia Weng
Shuwang Zhang
Zhen Wang
...
Zhiwen Fan
Suya You
Ziyi Wang
Leonidas Guibas
A. Kadambi
VGen
3DGS
160
2
0
26 Mar 2025
Zero-Shot Human-Object Interaction Synthesis with Multimodal Priors
Yuke Lou
Yiming Wang
Zhen Wu
Rui Zhao
Wenjia Wang
Mingyi Shi
Taku Komura
99
2
0
25 Mar 2025
Optimization of MedSAM model based on bounding box adaptive perturbation algorithm
Boyi Li
Ye Yuan
Wenjun Tan
AAML
MedIm
88
0
0
25 Mar 2025
DeClotH: Decomposable 3D Cloth and Human Body Reconstruction from a Single Image
Hyeongjin Nam
Donghwan Kim
Jeongtaek Oh
Kyoung Mu Lee
DiffM
3DH
93
1
0
25 Mar 2025
CamSAM2: Segment Anything Accurately in Camouflaged Videos
Yuli Zhou
Guolei Sun
Yawei Li
Yuqian Fu
Luca Benini
Ender Konukoglu
90
1
0
25 Mar 2025
BiPrompt-SAM: Enhancing Image Segmentation via Explicit Selection between Point and Text Prompts
Suzhe Xu
Jialin Peng
Chengyuan Zhang
VLM
160
0
0
25 Mar 2025
Tiling artifacts and trade-offs of feature normalization in the segmentation of large biological images
Elena Buglakova
Anwai Archit
Edoardo D'Imprima
Julia Mahamid
Constantin Pape
Anna Kreshuk
113
0
0
25 Mar 2025
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing
Jaihoon Kim
Taehoon Yoon
Jisung Hwang
Minhyuk Sung
DiffM
181
3
0
25 Mar 2025
Show and Segment: Universal Medical Image Segmentation via In-Context Learning
Yunhe Gao
Di Liu
Zhuowei Li
You Li
DongDong Chen
Mu Zhou
Dimitris N. Metaxas
VLM
88
0
0
25 Mar 2025
MaSS13K: A Matting-level Semantic Segmentation Benchmark
C. Xie
Minghan Li
Hui Zeng
Jun Luo
Lei Zhang
VLM
176
0
0
24 Mar 2025
EgoSurgery-HTS: A Dataset for Egocentric Hand-Tool Segmentation in Open Surgery Videos
Nathan Darjana
Ryo Fujii
Hideo Saito
Hiroki Kajita
121
0
0
24 Mar 2025
FisherTune: Fisher-Guided Robust Tuning of Vision Foundation Models for Domain Generalized Segmentation
Dong Zhao
Jinlong Li
Shuang Wang
Mengyao Wu
Qi Zang
N. Sebe
Zhun Zhong
495
1
0
23 Mar 2025
SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining
Yue Li
Qi Ma
Runyi Yang
Huapeng Li
Mengjiao Ma
...
E. Konukoglu
Theo Gevers
Luc Van Gool
Martin R. Oswald
Danda Pani Paudel
3DGS
VLM
237
2
0
23 Mar 2025
PG-SAM: Prior-Guided SAM with Medical for Multi-organ Segmentation
Yiheng Zhong
Zihong Luo
Chengzhi Liu
Feilong Tang
Zelin Peng
Ming Hu
Yitao Hu
Jionglong Su
Zongyuan Ge
Imran Razzak
MedIm
113
0
0
23 Mar 2025
GS-LTS: 3D Gaussian Splatting-Based Adaptive Modeling for Long-Term Service Robots
Bin Fu
Jiajian Li
Bin Zhang
Ruiping Wang
Xilin Chen
3DGS
100
0
0
22 Mar 2025
RefCut: Interactive Segmentation with Reference Guidance
Zheng Lin
Nan Zhou
Chen-Xi Du
Deng-Ping Fan
Shi-Min Hu
125
0
0
22 Mar 2025
RAIDER: Tool-Equipped Large Language Model Agent for Robotic Action Issue Detection, Explanation and Recovery
Silvia Izquierdo-Badiola
Carlos Rizzo
Guillem Alenyà
LLMAG
LM&Ro
169
0
0
22 Mar 2025
Towards Automated Semantic Interpretability in Reinforcement Learning via Vision-Language Models
Zhaoxin Li
Zhang Xi-Jia
Batuhan Altundas
Letian Chen
Rohan R. Paleja
Matthew C. Gombolay
OffRL
80
0
0
20 Mar 2025
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li
Cristiano Saltori
Fabio Poiesi
N. Sebe
513
2
0
20 Mar 2025
QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge
Xuan Shen
Weize Ma
Jing Liu
Changdi Yang
Rui Ding
...
Wei Niu
Yanzhi Wang
Pu Zhao
Jun Lin
Jiuxiang Gu
MQ
99
0
0
20 Mar 2025
Transport-Related Surface Detection with Machine Learning: Analyzing Temporal Trends in Madrid and Vienna
Miguel Ureña Pliego
Rubén Martínez Marín
Nianfang Shi
Takeru Shibayama
Ulrich Leth
Miguel Marchamalo Sacristán
225
0
0
19 Mar 2025
Visual Position Prompt for MLLM based Visual Grounding
Wei Tang
Yanpeng Sun
Qinying Gu
Zechao Li
VLM
110
0
0
19 Mar 2025
SUM Parts: Benchmarking Part-Level Semantic Segmentation of Urban Meshes
Weixiao Gao
Liangliang Nan
H. Ledoux
3DV
3DPC
82
0
0
19 Mar 2025
Toward task-driven satellite image super-resolution
Maciej Ziaja
Pawel Kowaleczko
Daniel Kostrzewa
Nicolas Longépé
M. Kawulok
SupR
146
0
0
19 Mar 2025
Mapping Urban Villages in China: Progress and Challenges
Rui Cao
Wei Tu
Dongsheng Chen
Wenyu Zhang
AI4TS
99
0
0
18 Mar 2025
EIAD: Explainable Industrial Anomaly Detection Via Multi-Modal Large Language Models
Zongyun Zhang
Jiacheng Ruan
Xian Gao
Ting Liu
Yuzhuo Fu
138
2
0
18 Mar 2025
Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives
Sara Sarto
Marcella Cornia
Rita Cucchiara
92
1
0
18 Mar 2025
E-Values Expand the Scope of Conformal Prediction
Etienne Gauthier
Francis Bach
Michael I. Jordan
107
0
0
17 Mar 2025
SAM2 for Image and Video Segmentation: A Comprehensive Survey
Zhang Jiaxing
Tang Hao
VLM
117
1
0
17 Mar 2025
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
Tao Wang
Changxu Cheng
Lingfeng Wang
Senda Chen
Wuyue Zhao
VLM
112
1
0
17 Mar 2025
Integrating AI for Human-Centric Breast Cancer Diagnostics: A Multi-Scale and Multi-View Swin Transformer Framework
Farnoush Bayatmakou
Reza Taleei
Milad Amir Toutounchian
Arash Mohammadi
90
0
0
17 Mar 2025
DeGauss: Dynamic-Static Decomposition with Gaussian Splatting for Distractor-free 3D Reconstruction
Rui Wang
Q. Lohmeyer
Mirko Meboldt
Siyu Tang
3DGS
120
1
0
17 Mar 2025
Learning-based 3D Reconstruction in Autonomous Driving: A Comprehensive Survey
Liewen Liao
Weihao Yan
Ming Yang
Songan Zhang
3DV
205
0
0
17 Mar 2025
Segment Any-Quality Images with Generative Latent Space Enhancement
Guangqian Guo
Yoong Guo
Xuehui Yu
Wenbo Li
Yaoxing Wang
Shan Gao
VLM
225
0
0
16 Mar 2025
SPOC: Spatially-Progressing Object State Change Segmentation in Video
Priyanka Mandikal
Tushar Nagarajan
Alex Stoken
Zihui Xue
Kristen Grauman
84
0
0
15 Mar 2025
Open3DVQA: A Benchmark for Comprehensive Spatial Reasoning with Multimodal Large Language Model in Open Space
Weichen Zhang
Zile Zhou
Zhiheng Zheng
Chen Gao
Jinqiang Cui
Yongqian Li
Xinlei Chen
Xiao-Ping Zhang
LRM
141
5
0
14 Mar 2025
EmoAgent: A Multi-Agent Framework for Diverse Affective Image Manipulation
Qi Mao
Haobo Hu
Yujie He
Difei Gao
Haokun Chen
Libiao Jin
DiffM
91
0
0
14 Mar 2025
EgoSplat: Open-Vocabulary Egocentric Scene Understanding with Language Embedded 3D Gaussian Splatting
Di Li
Jie Feng
Jiahao Chen
Weisheng Dong
Guanbin Li
G. Shi
Licheng Jiao
3DGS
VLM
441
0
0
14 Mar 2025
COIN: Confidence Score-Guided Distillation for Annotation-Free Cell Segmentation
Sanghyun Jo
Seo Jin Lee
Seungwoo Lee
Seohyung Hong
Hyungseok Seo
Kyungsu Kim
88
0
0
14 Mar 2025
Bayesian Prompt Flow Learning for Zero-Shot Anomaly Detection
Zhen Qu
Xian Tao
Xinyi Gong
Shichen Qu
Qiyu Chen
Zhengtao Zhang
Xingang Wang
Guiguang Ding
VLM
186
1
0
13 Mar 2025
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
Wanhua Li
Renping Zhou
Jiawei Zhou
Yingwei Song
Johannes Herter
Minghan Qin
Gao Huang
Hanspeter Pfister
3DGS
VLM
163
3
0
13 Mar 2025
Unveiling the Invisible: Reasoning Complex Occlusions Amodally with AURA
Zhixuan Li
Hyunse Yoon
Sanghoon Lee
Weisi Lin
102
1
0
13 Mar 2025