ResearchTrend.AI
  • Papers
  • Communities
  • Organizations
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2304.02643
  4. Cited By
Segment Anything

Segment Anything

5 April 2023
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
Laura Gustafson
Tete Xiao
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
    MLLMVLM
ArXiv (abs)PDFHTML

Papers citing "Segment Anything"

50 / 1,376 papers shown
Title
Do computer vision foundation models learn the low-level characteristics of the human visual system?
Do computer vision foundation models learn the low-level characteristics of the human visual system?
Yancheng Cai
Fei Yin
Dounia Hammou
Rafal Mantiuk
VLM
Presented at ResearchTrend Connect | VLM on 14 Mar 2025
239
3
0
13 Mar 2025
Bayesian Prompt Flow Learning for Zero-Shot Anomaly Detection
Bayesian Prompt Flow Learning for Zero-Shot Anomaly Detection
Zhen Qu
Xian Tao
Xinyi Gong
Shichen Qu
Qiyu Chen
Zhengtao Zhang
Xingang Wang
Guiguang Ding
VLM
186
1
0
13 Mar 2025
PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models
PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models
Runze He
Bo Cheng
Yuhang Ma
Qingxiang Jia
Shanyuan Liu
Ao Ma
Xiaoyu Wu
Liebucha Wu
Dawei Leng
Yuhui Yin
DiffMVLM
187
0
0
13 Mar 2025
GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding
GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding
R. Hu
Lianghui Zhu
Yuxuan Zhang
Tianheng Cheng
Lei Liu
Heng Liu
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
ObjD
179
0
0
13 Mar 2025
Unveiling the Invisible: Reasoning Complex Occlusions Amodally with AURA
Unveiling the Invisible: Reasoning Complex Occlusions Amodally with AURA
Zhixuan Li
Hyunse Yoon
Sanghoon Lee
Weisi Lin
102
1
0
13 Mar 2025
Large-scale Pre-training for Grounded Video Caption Generation
Large-scale Pre-training for Grounded Video Caption Generation
Evangelos Kazakos
Cordelia Schmid
Josef Sivic
95
0
0
13 Mar 2025
NVP-HRI: Zero Shot Natural Voice and Posture-based Human-Robot Interaction via Large Language Model
Yuzhi Lai
Shenghai Yuan
Youssef Nassar
Mingyu Fan
T. Weber
Matthias Rätsch
LM&Ro
138
3
0
12 Mar 2025
Efficient Alignment of Unconditioned Action Prior for Language-conditioned Pick and Place in Clutter
Efficient Alignment of Unconditioned Action Prior for Language-conditioned Pick and Place in Clutter
Kechun Xu
Xunlong Xia
Kaixuan Wang
Yifei Yang
Yunxuan Mao
Bing Deng
R. Xiong
Yansen Wang
OffRL
193
0
0
12 Mar 2025
Polygonizing Roof Segments from High-Resolution Aerial Images Using Yolov8-Based Edge Detection
Qipeng Mei
Dimitri Bulatov
Dorota Iwaszczuk
173
0
0
12 Mar 2025
GarmentPile: Point-Level Visual Affordance Guided Retrieval and Adaptation for Cluttered Garments Manipulation
GarmentPile: Point-Level Visual Affordance Guided Retrieval and Adaptation for Cluttered Garments Manipulation
Ruihai Wu
Ziyu Zhu
Yuran Wang
Yue Chen
Jiarui Wang
Hao Dong
108
0
0
12 Mar 2025
VRMDiff: Text-Guided Video Referring Matting Generation of Diffusion
Lehan Yang
Jincen Song
Tianlong Wang
Daiqing Qi
Weili Shi
Yuheng Liu
Sheng Li
DiffMVOSVGen
150
0
0
11 Mar 2025
WildSeg3D: Segment Any 3D Objects in the Wild from 2D Images
WildSeg3D: Segment Any 3D Objects in the Wild from 2D Images
Yansong Guo
Jie Hu
Yansong Qu
Liujuan Cao
3DGS
487
1
0
11 Mar 2025
DexGrasp Anything: Towards Universal Robotic Dexterous Grasping with Physics Awareness
DexGrasp Anything: Towards Universal Robotic Dexterous Grasping with Physics Awareness
Yiming Zhong
Qi Jiang
Jingyi Yu
Yuexin Ma
207
5
0
11 Mar 2025
SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
Muzhi Zhu
Yuzhuo Tian
Hao Chen
Chunluan Zhou
Qingpei Guo
Yongxu Liu
M. Yang
Chunhua Shen
MLLMVLM
152
1
0
11 Mar 2025
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning
Xin Wen
Bingchen Zhao
Yilun Chen
Jiangmiao Pang
Xiaojuan Qi
LM&Ro
238
1
0
10 Mar 2025
RS2-SAM2: Customized SAM2 for Referring Remote Sensing Image Segmentation
RS2-SAM2: Customized SAM2 for Referring Remote Sensing Image Segmentation
Fu Rong
Meng Lan
Qian Zhang
Lefei Zhang
128
0
0
10 Mar 2025
A Review on Geometry and Surface Inspection in 3D Concrete Printing
K. Mawas
M. Maboudi
M. Gerke
97
0
0
10 Mar 2025
HumanMM: Global Human Motion Recovery from Multi-shot Videos
Yize Zhang
Guanlin Wu
Ling-Hao Chen
Zhuokai Zhao
Jing Lin
...
Jiamin Wu
Tianying Wang
Hao Frank Yang
Haoqian Wang
Lei Zhang
3DH
123
0
0
10 Mar 2025
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Siyuan Mu
Sen Lin
MoE
510
10
0
10 Mar 2025
When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning
When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning
Junwei Luo
Yingying Zhang
Xiaoyu Yang
Kang Wu
Qi Zhu
Lei Liang
Jingdong Chen
Yansheng Li
177
2
0
10 Mar 2025
VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation
VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation
Hanzhi Chen
Boyang Sun
Anran Zhang
Marc Pollefeys
Stefan Leutenegger
LM&Ro
178
1
0
10 Mar 2025
Conceptrol: Concept Control of Zero-shot Personalized Image Generation
Qiyuan He
Angela Yao
DiffM
81
0
0
09 Mar 2025
MemorySAM: Memorize Modalities and Semantics with Segment Anything Model 2 for Multi-modal Semantic Segmentation
MemorySAM: Memorize Modalities and Semantics with Segment Anything Model 2 for Multi-modal Semantic Segmentation
Chenfei Liao
Xu Zheng
Yuanhuiyi Lyu
Haiwei Xue
Yihong Cao
Jiawen Wang
Kailun Yang
Xuming Hu
VLM
170
9
0
09 Mar 2025
Consistent Image Layout Editing with Diffusion Models
Tao Xia
Yudi Zhang
Ting Liu Lei Zhang
DiffM
144
1
0
09 Mar 2025
SAQ-SAM: Semantically-Aligned Quantization for Segment Anything Model
Jing Zhang
Zhiyu Li
Qingyi Gu
MQVLM
79
0
0
09 Mar 2025
From Dataset to Real-world: General 3D Object Detection via Generalized Cross-domain Few-shot Learning
Shuangzhi Li
Junlong Shen
Lei Ma
Xingyu Li
3DPC
113
0
0
08 Mar 2025
GAT-Grasp: Gesture-Driven Affordance Transfer for Task-Aware Robotic Grasping
Ruixiang Wang
Huayi Zhou
Xinyue Yao
Guiliang Liu
Kui Jia
117
0
0
08 Mar 2025
Do Fairness Interventions Come at the Cost of Privacy: Evaluations for Binary Classifiers
Huan Tian
Guangsheng Zhang
Bo Liu
Tianqing Zhu
Ming Ding
Wanlei Zhou
115
1
0
08 Mar 2025
Improving SAM for Camouflaged Object Detection via Dual Stream Adapters
Improving SAM for Camouflaged Object Detection via Dual Stream Adapters
Jiaming Liu
Linghe Kong
Guihai Chen
153
0
0
08 Mar 2025
OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images
OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images
Ziyue Huang
Yongchao Feng
Shuai Yang
Ziqiang Liu
Qingjie Liu
Yansen Wang
ObjD
467
2
0
08 Mar 2025
Stereo Any Video: Temporally Consistent Stereo Matching
Stereo Any Video: Temporally Consistent Stereo Matching
Junpeng Jing
Weixun Luo
Ye Mao
K. Mikolajczyk
98
0
0
07 Mar 2025
GaussianCAD: Robust Self-Supervised CAD Reconstruction from Three Orthographic Views Using 3D Gaussian Splatting
Zheng Zhou
Zhe Li
Bo Yu
Lina Hu
Liang Dong
...
Xiaoli Liu
N. Xu
Zehao Wang
Yonghao Dang
Jianqin Yin
3DGS3DV
73
1
0
07 Mar 2025
S4M: Segment Anything with 4 Extreme Points
A. Meyer
Lorenzo Arboit
Giuseppe Massimiani
Francesco Brucchi
Luca Emanuele Amodio
Didier Mutter
N. Padoy
77
0
0
07 Mar 2025
OPG-Policy: Occluded Push-Grasp Policy Learning with Amodal Segmentation
Hao Ding
Yiming Zeng
Zhaoliang Wan
Hui Cheng
97
1
0
06 Mar 2025
Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation
Aishik Konwer
Zhijian Yang
Erhan Bas
Cao Xiao
Prateek Prasanna
Parminder Bhatia
Taha A. Kass-Hout
MedImVLM
126
2
0
06 Mar 2025
Is Pre-training Applicable to the Decoder for Dense Prediction?
Is Pre-training Applicable to the Decoder for Dense Prediction?
Chao Ning
Wanshui Gan
Weihao Xuan
Naoto Yokoya
301
0
0
05 Mar 2025
Interactive Segmentation and Report Generation for CT Images
Yannian Gu
Wenhui Lei
Hanyu Chen
Xiaofan Zhang
Shanghang Zhang
105
0
0
05 Mar 2025
From Infants to AI: Incorporating Infant-like Learning in Models Boosts Efficiency and Generalization in Learning Social Prediction Tasks
From Infants to AI: Incorporating Infant-like Learning in Models Boosts Efficiency and Generalization in Learning Social Prediction Tasks
Shify Treger
Shimon Ullman
101
0
0
05 Mar 2025
AirExo-2: Scaling up Generalizable Robotic Imitation Learning with Low-Cost Exoskeletons
AirExo-2: Scaling up Generalizable Robotic Imitation Learning with Low-Cost Exoskeletons
Hongjie Fang
Chenxi Wang
Yiming Wang
J. Chen
Shangning Xia
...
Xinyu Zhan
Lixin Yang
Weiming Wang
Cewu Lu
Hao-Shu Fang
194
3
0
05 Mar 2025
Generative Artificial Intelligence in Robotic Manipulation: A Survey
Kun Zhang
Peng Yun
Jun Cen
Junhao Cai
DiDi Zhu
...
Qifeng Chen
Jia Pan
Wei Zhang
Bo Yang
Hua Chen
185
2
0
05 Mar 2025
AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model
AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model
Wenlun Zhang
Yunshan Zhong
Shimpei Ando
Kentaro Yoshioka
VLMMQ
137
0
0
05 Mar 2025
TopoMortar: A dataset to evaluate image segmentation methods focused on topology accuracy
Juan Miguel Valverde
Motoya Koga
Nijihiko Otsuka
Anders Bjorholm Dahl
90
0
0
05 Mar 2025
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation
Suhwan Cho
Seunghoon Lee
Minhyeok Lee
Jungho Lee
Sangyoun Lee
VOS
188
0
0
05 Mar 2025
SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning
SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning
Borong Zhang
Yuhao Zhang
Yalan Qin
Yingshan Lei
Josef Dai
Yuanpei Chen
Yaodong Yang
135
4
0
05 Mar 2025
WMNav: Integrating Vision-Language Models into World Models for Object Goal Navigation
WMNav: Integrating Vision-Language Models into World Models for Object Goal Navigation
Dujun Nie
Xianda Guo
Yiqun Duan
Ruijun Zhang
Long Chen
LM&Ro
382
5
0
04 Mar 2025
Out-of-Distribution Segmentation in Autonomous Driving: Problems and State of the Art
Out-of-Distribution Segmentation in Autonomous Driving: Problems and State of the Art
Youssef Shoeb
Azarm Nowzad
Hanno Gottschalk
UQCV
288
2
0
04 Mar 2025
Label-Efficient LiDAR Panoptic Segmentation
Label-Efficient LiDAR Panoptic Segmentation
Ahmet Selim Çanakçı
Niclas Vodisch
Kürsat Petek
Wolfram Burgard
Abhinav Valada
3DPC
204
0
0
04 Mar 2025
Boltzmann Attention Sampling for Image Analysis with Small Objects
Boltzmann Attention Sampling for Image Analysis with Small Objects
Theodore Zhao
Sid Kiblawi
Naoto Usuyama
Ho Hin Lee
Sam Preston
Hoifung Poon
Mu-Hsin Wei
MedIm
212
0
0
04 Mar 2025
FlowPlan: Zero-Shot Task Planning with LLM Flow Engineering for Robotic Instruction Following
Zijun Lin
Chao Tang
Hanjing Ye
Kuanqi Cai
119
0
0
04 Mar 2025
A Token-level Text Image Foundation Model for Document Understanding
A Token-level Text Image Foundation Model for Document Understanding
Tongkun Guan
Zining Wang
Pei Fu
Zhengtao Guo
Wei Shen
...
Chen Duan
Hao Sun
Qianyi Jiang
Junfeng Luo
Xiaokang Yang
VLM
196
2
0
04 Mar 2025
Previous
123...8910...262728
Next