ResearchTrend.AI
Segment Anything

5 April 2023
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
Laura Gustafson
Tete Xiao
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
    MLLM, VLM

Papers citing "Segment Anything"

50 / 1,376 papers shown
Title
Multi-Modal Artificial Intelligence of Embryo Grading and Pregnancy Prediction in Assisted Reproductive Technology: A Review
Xueqiang Ouyang
Jia Wei
129
0
0
19 May 2025
SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning
Yang Liu
Ming Ma
Xiaomin Yu
Pengxiang Ding
Han Zhao
Mingyang Sun
Siteng Huang
Donglin Wang
LRM
227
0
0
18 May 2025
Experimental Study on Automatically Assembling Custom Catering Packages With a 3-DOF Delta Robot Using Deep Learning Methods
Reihaneh Yourdkhani
Arash Tavoosian
Navid Asadi Khomami
Mehdi Tale Masouleh
80
1
0
17 May 2025
PRS-Med: Position Reasoning Segmentation with Vision-Language Model in Medical Imaging
Quoc-Huy Trinh
Minh-Van Nguyen
Jung Peng
Ulas Bagci
Debesh Jha
228
0
0
17 May 2025
Search-TTA: A Multimodal Test-Time Adaptation Framework for Visual Search in the Wild
Derek Ming Siang Tan
Shailesh
Boyang Liu
Alok Raj
Qi Xuan Ang
...
Tanishq Duhan
Jimmy Chiun
Yuhong Cao
Florian Shkurti
Guillaume Sartoretti
69
0
0
16 May 2025
SurgPose: Generalisable Surgical Instrument Pose Estimation using Zero-Shot Learning and Stereo Vision
Utsav Rai
Haozheng Xu
Stamatia Giannarou
MedIm
86
0
0
16 May 2025
Visual Fidelity Index for Generative Semantic Communications with Critical Information Embedding
Jianhao Huang
Qunsong Zeng
Kaibin Huang
DiffM
100
0
0
15 May 2025
Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis
Bingda Tang
Boyang Zheng
Xichen Pan
Sayak Paul
Saining Xie
90
0
0
15 May 2025
A Unified and Scalable Membership Inference Method for Visual Self-supervised Encoder via Part-aware Capability
Jie Zhu
Jirong Zha
Ding Li
Leye Wang
171
1
0
15 May 2025
Advances in Radiance Field for Dynamic Scene: From Neural Field to Gaussian Field
Jinlong Fan
Xuepu Zeng
Jing Zhang
Mingming Gong
Yuxiang Yang
Dacheng Tao
3DGS, AI4CE
161
0
0
15 May 2025
Mini Diffuser: Fast Multi-task Diffusion Policy Training Using Two-level Mini-batches
Yutong Hu
Pinhao Song
Kehan Wen
Renaud Detry
VLM
97
0
0
14 May 2025
Beyond General Prompts: Automated Prompt Refinement using Contrastive Class Alignment Scores for Disambiguating Objects in Vision-Language Models
Lucas Choi
Ross Greer
VLM
156
0
0
14 May 2025
Leveraging Multi-Modal Information to Enhance Dataset Distillation
Zhe Li
Hadrien Reynaud
Bernhard Kainz
DD
111
0
0
13 May 2025
Leveraging Segment Anything Model for Source-Free Domain Adaptation via Dual Feature Guided Auto-Prompting
Zheang Huai
Hui Tang
Yi Li
Zhe Chen
Xiaomeng Li
VLM
198
0
0
13 May 2025
ReSurgSAM2: Referring Segment Anything in Surgical Video via Credible Long-term Tracking
Haofeng Liu
Mingqi Gao
Xuxiao Luo
Ziyue Wang
Guanyi Qin
Jinlin Wu
Yueming Jin
93
1
0
13 May 2025
OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning
Zhaochen Su
Linjie Li
Mingyang Song
Yunzhuo Hao
Zhengyuan Yang
...
Guanjie Chen
Jiawei Gu
Juntao Li
Xiaoye Qu
Yu Cheng
OffRL, LRM
101
11
0
13 May 2025
SLAG: Scalable Language-Augmented Gaussian Splatting
Laszlo Szilagyi
Francis Engelmann
Jeannette Bohg
3DGS
116
0
0
12 May 2025
CHD: Coupled Hierarchical Diffusion for Long-Horizon Tasks
Ce Hao
Anxing Xiao
Zhiwei Xue
Harold Soh
189
1
0
12 May 2025
MarkMatch: Same-Hand Stuffing Detection
Fei Zhao
Runlin Zhang
Chenyi Zhang
Nitesh Saxena
57
0
0
11 May 2025
CMD: Controllable Multiview Diffusion for 3D Editing and Progressive Generation
Peng Li
Suizhi Ma
Jialiang Chen
Yuan Liu
Chen Zhang
Wei Xue
Wenhan Luo
Alla Sheffer
Wenping Wang
Yu Guo
DiffM
129
0
0
11 May 2025
X-Sim: Cross-Embodiment Learning via Real-to-Sim-to-Real
Prithwish Dan
Kushal Kedia
Angela Chao
Edward Weiyi Duan
Maximus Adrian Pace
Wei-Chiu Ma
Sanjiban Choudhury
175
0
0
11 May 2025
UniDiffGrasp: A Unified Framework Integrating VLM Reasoning and VLM-Guided Part Diffusion for Open-Vocabulary Constrained Grasping with Dual Arms
Xueyang Guo
Hongwei Hu
Chengye Song
Jingshu Chen
Zilin Zhao
Yu Fu
Bowen Guan
Zhenze Liu
117
0
0
11 May 2025
Automating Infrastructure Surveying: A Framework for Geometric Measurements and Compliance Assessment Using Point Cloud Data
A. Ghafourian
Andrew Lee
Dechen Gao
Tyler Beer
Kin Yen
Iman Soltani
80
0
0
09 May 2025
RefRef: A Synthetic Dataset and Benchmark for Reconstructing Refractive and Reflective Objects
Yue Yin
Enze Tao
Weijian Deng
Dylan Campbell
99
0
0
09 May 2025
BrainSegDMlF: A Dynamic Fusion-enhanced SAM for Brain Lesion Segmentation
Haobo Wang
Yifeng Wu
Huimin Huang
Hongtao Wu
Jia-Xuan Jiang
...
Hao Zheng
Xian Wu
Yefeng Zheng
Jinping Xu
Jing Cheng
MedIm
116
0
0
09 May 2025
PromptIQ: Who Cares About Prompts? Let System Handle It -- A Component-Aware Framework for T2I Generation
Nisan Chhetri
Arpan Sainju
56
0
0
09 May 2025
Adaptive Contextual Embedding for Robust Far-View Borehole Detection
Xuesong Liu
Tianyu Hao
Emmett J. Ientilucci
80
0
0
08 May 2025
UncertainSAM: Fast and Efficient Uncertainty Quantification of the Segment Anything Model
Timo Kaiser
Thomas Norrenbrock
Bodo Rosenhahn
186
1
0
08 May 2025
CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory
Weichen Zhang
Chen Gao
Shiquan Yu
Ruiying Peng
Baining Zhao
Qian Zhang
Jinqiang Cui
Xinlei Chen
Yongqian Li
LLMAG, LM&Ro
151
0
0
08 May 2025
Learning to Drive Anywhere with Model-Based Reannotation
Noriaki Hirose
Lydia Ignatova
Kyle Stachowicz
Catherine Glossop
Sergey Levine
Dhruv Shah
85
1
0
08 May 2025
Joint Super-Resolution and Segmentation for 1-m Impervious Surface Area Mapping in China's Yangtze River Economic Belt
Jie Deng
Danfeng Hong
Chenyu Li
Naoto Yokoya
163
0
0
08 May 2025
InstanceGen: Image Generation with Instance-level Instructions
Etai Sella
Yanir Kleiman
Hadar Averbuch-Elor
110
0
0
08 May 2025
MDE-Edit: Masked Dual-Editing for Multi-Object Image Editing via Diffusion Models
Hongyang Zhu
Haipeng Liu
Bo Fu
Yang Wang
DiffM
137
0
0
08 May 2025
SOAP: Style-Omniscient Animatable Portraits
Tingting Liao
Yujian Zheng
Adilbek Karmanov
Liwen Hu
Leyang Jin
Yuliang Xiu
Hao Li
DiffM
504
0
0
08 May 2025
Mix-QSAM: Mixed-Precision Quantization of the Segment Anything Model
Navin Ranjan
Andreas E. Savakis
MQ, VLM
157
0
0
08 May 2025
FLAM: Frame-Wise Language-Audio Modeling
Yusong Wu
Christos Tsirigotis
Ke Chen
Cheng-Zhi Anna Huang
Rameswar Panda
Oriol Nieto
Prem Seetharaman
Justin Salamon
93
1
0
08 May 2025
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
Junjie Wang
Bin Chen
Yulin Li
Bin Kang
Yulin Chen
Zhuotao Tian
VLM
108
1
0
07 May 2025
MAISY: Motion-Aware Image SYnthesis for Medical Image Motion Correction
Andrew Zhang
Hao Wang
Shuchang Ye
M. Fulham
Jinman Kim
MedIm
121
0
0
07 May 2025
Show or Tell? A Benchmark To Evaluate Visual and Textual Prompts in Semantic Segmentation
Gabriele Rosi
Fabio Cermelli
VLM
175
0
0
06 May 2025
Corner Cases: How Size and Position of Objects Challenge ImageNet-Trained Models
Mishal Fatima
Steffen Jung
Margret Keuper
89
0
0
06 May 2025
CaRaFFusion: Improving 2D Semantic Segmentation with Camera-Radar Point Cloud Fusion and Zero-Shot Image Inpainting
Huawei Sun
Bora Kunter Sahin
Georg Stettinger
Maximilian Bernhard
Matthias Schubert
Robert Wille
157
0
0
06 May 2025
Importance Analysis for Dynamic Control of Balancing Parameter in a Simple Knowledge Distillation Setting
Seongmin Kim
Kwanho Kim
Minseung Kim
Kanghyun Jo
53
0
0
06 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Wei Wei
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
...
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
363
1
0
05 May 2025
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
Dengyang Jiang
Mengmeng Wang
Liuzhuozheng Li
Lei Zhang
Haoyu Wang
Wei Wei
Guang Dai
Yanning Zhang
Jingdong Wang
DiffM
130
0
0
05 May 2025
Segment Any RGB-Thermal Model with Language-aided Distillation
Dong Xing
Xianxun Zhu
Wei Zhou
Qika Lin
Hang Yang
Yuqing Wang
VLM
201
0
0
04 May 2025
RNBF: Real-Time RGB-D Based Neural Barrier Functions for Safe Robotic Navigation
Satyajeet Das
Yifan Xue
Haoming Li
Nadia Figueroa
136
0
0
04 May 2025
Prompt-responsive Object Retrieval with Memory-augmented Student-Teacher Learning
Malte Mosbach
Sven Behnke
93
0
0
04 May 2025
RESAnything: Attribute Prompting for Arbitrary Referring Segmentation
Ruiqi Wang
Hao Zhang
VLM
129
1
0
03 May 2025
ReLI: A Language-Agnostic Approach to Human-Robot Interaction
Linus Nwankwo
Bjoern Ellensohn
Ozan Özdenizci
Elmar Rueckert
LM&Ro
256
0
0
03 May 2025
Can Foundation Models Really Segment Tumors? A Benchmarking Odyssey in Lung CT Imaging
Elena Mulero Ayllón
Massimiliano Mantegna
Linlin Shen
Paolo Soda
V. Guarrasi
M. Tortora
87
0
0
02 May 2025