ResearchTrend.AI
Segment Anything (arXiv: 2304.02643)

5 April 2023
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
Laura Gustafson
Tete Xiao
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
    MLLM
    VLM

Papers citing "Segment Anything"

50 / 4,194 papers shown
Adaptive Contextual Embedding for Robust Far-View Borehole Detection
Xuesong Liu
Tianyu Hao
Emmett J. Ientilucci
46
0
0
08 May 2025
FLAM: Frame-Wise Language-Audio Modeling
Yusong Wu
Christos Tsirigotis
Ke Chen
Cheng-Zhi Anna Huang
Rameswar Panda
Oriol Nieto
Prem Seetharaman
Justin Salamon
55
0
0
08 May 2025
Pro2SAM: Mask Prompt to SAM with Grid Points for Weakly Supervised Object Localization
Xi Yang
Songsong Duan
Nannan Wang
Xinbo Gao
WSOL
78
0
0
08 May 2025
MDE-Edit: Masked Dual-Editing for Multi-Object Image Editing via Diffusion Models
Hongyang Zhu
Haipeng Liu
Bo Fu
Yang Wang
DiffM
47
0
0
08 May 2025
Learning to Drive Anywhere with Model-Based Reannotation
Noriaki Hirose
Lydia Ignatova
Kyle Stachowicz
Catherine Glossop
Sergey Levine
Dhruv Shah
31
0
0
08 May 2025
InstanceGen: Image Generation with Instance-level Instructions
Etai Sella
Yanir Kleiman
Hadar Averbuch-Elor
36
0
0
08 May 2025
Joint Super-Resolution and Segmentation for 1-m Impervious Surface Area Mapping in China's Yangtze River Economic Belt
Jie Deng
Danfeng Hong
Chenyu Li
Naoto Yokoya
42
0
0
08 May 2025
SVAD: From Single Image to 3D Avatar via Synthetic Data Generation with Video Diffusion and Data Augmentation
Yonwoo Choi
3DGS
VGen
72
0
0
08 May 2025
CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory
Weichen Zhang
Chen Gao
Shiquan Yu
Ruiying Peng
Baining Zhao
Qian Zhang
Jinqiang Cui
Xinlei Chen
Yong Li
LLMAG
LM&Ro
49
0
0
08 May 2025
SOAP: Style-Omniscient Animatable Portraits
Tingting Liao
Yujian Zheng
Adilbek Karmanov
Liwen Hu
Leyang Jin
Yuliang Xiu
Hao Li
DiffM
241
0
0
08 May 2025
CottonSim: Development of an autonomous visual-guided robotic cotton-picking system in the Gazebo
Thevathayarajh Thayananthan
Xin Zhang
Yanbo Huang
Jianfei Chen
Nuwan K. Wijewardane
Vitor S. Martins
Gary D. Chesser
C. Goodin
53
0
0
08 May 2025
UncertainSAM: Fast and Efficient Uncertainty Quantification of the Segment Anything Model
Timo Kaiser
Thomas Norrenbrock
Bodo Rosenhahn
53
0
0
08 May 2025
Visual Affordances: Enabling Robots to Understand Object Functionality
Tommaso Apicella
Alessio Xompero
Andrea Cavallaro
46
0
0
08 May 2025
Mix-QSAM: Mixed-Precision Quantization of the Segment Anything Model
Navin Ranjan
Andreas E. Savakis
MQ
VLM
68
0
0
08 May 2025
One2Any: One-Reference 6D Pose Estimation for Any Object
Mengya Liu
Siyuan Li
Ajad Chhatkuli
Prune Truong
Luc Van Gool
Federico Tombari
44
0
0
07 May 2025
Balancing Accuracy, Calibration, and Efficiency in Active Learning with Vision Transformers Under Label Noise
Moseli Motsóehli
Hope Mogale
Kyungim Baek
43
0
0
07 May 2025
Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation
Yiming Qin
Zhu Xu
Yang Liu
31
0
0
07 May 2025
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
Junjie Wang
Bin Chen
Yulin Li
Bin Kang
Yulin Chen
Zhuotao Tian
VLM
40
0
0
07 May 2025
GSsplat: Generalizable Semantic Gaussian Splatting for Novel-view Synthesis in 3D Scenes
Feng Xiao
Hongbin Xu
Wanlin Liang
Wenxiong Kang
3DGS
51
0
0
07 May 2025
MAISY: Motion-Aware Image SYnthesis for Medical Image Motion Correction
Andrew Zhang
Hao Wang
Shuchang Ye
M. Fulham
Jinman Kim
MedIm
35
0
0
07 May 2025
Show or Tell? A Benchmark To Evaluate Visual and Textual Prompts in Semantic Segmentation
Gabriele Rosi
Fabio Cermelli
VLM
42
0
0
06 May 2025
Importance Analysis for Dynamic Control of Balancing Parameter in a Simple Knowledge Distillation Setting
Seongmin Kim
Kwanho Kim
Minseung Kim
Kanghyun Jo
26
0
0
06 May 2025
CaRaFFusion: Improving 2D Semantic Segmentation with Camera-Radar Point Cloud Fusion and Zero-Shot Image Inpainting
Huawei Sun
Bora Kunter Sahin
Georg Stettinger
Maximilian Bernhard
Matthias Schubert
Robert Wille
56
0
0
06 May 2025
Corner Cases: How Size and Position of Objects Challenge ImageNet-Trained Models
Mishal Fatima
Steffen Jung
M. Keuper
45
0
0
06 May 2025
Reinforced Correlation Between Vision and Language for Precise Medical AI Assistant
Haonan Wang
Jiaji Mao
Lehan Wang
Qixiang Zhang
Marawan Elbatel
...
Weifeng Qin
Yiming Li
Jialin Liang
Jun Shen
Xiaomeng Li
MedIm
38
0
0
06 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Jiahui Geng
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
74
0
0
05 May 2025
LISAT: Language-Instructed Segmentation Assistant for Satellite Imagery
Jerome Quenum
Wen-Han Hsieh
Tsung-Han Wu
Ritwik Gupta
Trevor Darrell
David M. Chan
MLLM
VLM
56
0
0
05 May 2025
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
Dengyang Jiang
Mengmeng Wang
Liuzhuozheng Li
Lei Zhang
Haoyu Wang
Wei Wei
Guang Dai
Yanning Zhang
Jingdong Wang
DiffM
51
0
0
05 May 2025
Sim2Real in endoscopy segmentation with a novel structure aware image translation
Clara Tomasini
L. Riazuelo
Ana C. Murillo
MedIm
39
0
0
05 May 2025
Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models
Yankai Jiang
Peng Zhang
Ke Wang
Yuan Tian
Hai Lin
Xinyu Wang
MedIm
208
0
0
05 May 2025
Segment Any RGB-Thermal Model with Language-aided Distillation
Dong Xing
Xianxun Zhu
Wei Zhou
Qika Lin
Hang Yang
Yuqing Wang
VLM
64
0
0
04 May 2025
RNBF: Real-Time RGB-D Based Neural Barrier Functions for Safe Robotic Navigation
Satyajeet Das
Yifan Xue
Haoming Li
Nadia Figueroa
35
0
0
04 May 2025
Prompt-responsive Object Retrieval with Memory-augmented Student-Teacher Learning
Malte Mosbach
Sven Behnke
36
0
0
04 May 2025
Seeing Heat with Color -- RGB-Only Wildfire Temperature Inference from SAM-Guided Multimodal Distillation using Radiometric Ground Truth
Michael Marinaccio
Fatemeh Afghah
42
0
0
03 May 2025
RESAnything: Attribute Prompting for Arbitrary Referring Segmentation
Ruiqi Wang
Hao Zhang
VLM
70
0
0
03 May 2025
ReLI: A Language-Agnostic Approach to Human-Robot Interaction
Linus Nwankwo
Bjoern Ellensohn
Ozan Özdenizci
Elmar Rueckert
LM&Ro
71
0
0
03 May 2025
Accelerating Volumetric Medical Image Annotation via Short-Long Memory SAM 2
Yuwen Chen
Zafer Yildiz
Qihang Li
Yaqian Chen
Haoyu Dong
Hanxue Gu
Nicholas Konz
Maciej A. Mazurowski
MedIm
VLM
44
0
0
03 May 2025
MVHumanNet++: A Large-scale Dataset of Multi-view Daily Dressing Human Captures with Richer Annotations for 3D Human Digitization
Chenghong Li
Hongjie Liao
Yihao Zhi
Xihe Yang
Zhengwentai Sun
Jiahao Chang
Shuguang Cui
Xiaoguang Han
3DH
63
0
0
03 May 2025
Can Foundation Models Really Segment Tumors? A Benchmarking Odyssey in Lung CT Imaging
Elena Mulero Ayllón
Massimiliano Mantegna
Linlin Shen
Paolo Soda
V. Guarrasi
M. Tortora
54
0
0
02 May 2025
Rethinking RGB-Event Semantic Segmentation with a Novel Bidirectional Motion-enhanced Event Representation
Zhen Yao
Xiaowen Ying
Mooi Choo Choo Chuah
47
0
0
02 May 2025
Grounding Task Assistance with Multimodal Cues from a Single Demonstration
Gabriel Sarch
Balasaravanan Thoravi Kumaravel
Sahithya Ravi
Vibhav Vineet
A. D. Wilson
233
0
0
02 May 2025
ViSA-Flow: Accelerating Robot Skill Learning via Large-Scale Video Semantic Action Flow
Changhe Chen
Quantao Yang
Xiaohao Xu
Nima Fazeli
Olov Andersson
31
0
0
02 May 2025
Brain Foundation Models with Hypergraph Dynamic Adapter for Brain Disease Analysis
Zhongying Deng
Haoyu Wang
Ziyan Huang
Li Zhang
Angelica I Aviles-Rivero
Chaoyu Liu
Junjun He
Zoe Kourtzi
Carola-Bibiane Schönlieb
MedIm
AI4CE
42
0
0
01 May 2025
Cues3D: Unleashing the Power of Sole NeRF for Consistent and Unique Instances in Open-Vocabulary 3D Panoptic Segmentation
Feng Xue
Wenzhuang Xu
Guofeng Zhong
Anlong Minga
N. Sebe
65
0
0
01 May 2025
SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models
Wufei Ma
Luoxin Ye
Nessa McWeeney
Celso M de Melo
A. Yuille
Jieneng Chen
LRM
65
1
0
01 May 2025
Visual Test-time Scaling for GUI Agent Grounding
Tiange Luo
Lajanugen Logeswaran
Justin Johnson
Honglak Lee
51
0
0
01 May 2025
InstructAttribute: Fine-grained Object Attributes editing with Instruction
Xingxi Yin
Jingfeng Zhang
Zhi Li
You Li
Wenjie Qu
DiffM
237
0
0
01 May 2025
A Survey on 3D Reconstruction Techniques in Plant Phenotyping: From Classical Methods to Neural Radiance Fields (NeRF), 3D Gaussian Splatting (3DGS), and Beyond
Jiyang Li
Xinda Qi
Seyed Hamidreza Nabaei
M. Liu
Dong Chen
Xin Zhang
Xunyuan Yin
Zehan Li
56
0
0
30 Apr 2025
Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models
Guanghao Zhou
Panjia Qiu
Chong Chen
Rongxiang Weng
Zheming Yang
Jian Xu
Minghui Qiu
OffRL
LRM
58
1
0
30 Apr 2025
RoboGround: Robotic Manipulation with Grounded Vision-Language Priors
Haifeng Huang
Xinyi Chen
Yuxiao Chen
Yiming Li
Xiaoshen Han
Zihao Wang
Tai Wang
Jiangmiao Pang
Zhou Zhao
LM&Ro
80
0
0
30 Apr 2025