arXiv: 2408.00714
SAM 2: Segment Anything in Images and Videos


1 August 2024
Nikhila Ravi
Valentin Gabeur
Yuan-Ting Hu
Ronghang Hu
Chaitanya K. Ryali
Tengyu Ma
Haitham Khedr
Roman Rädle
Chloe Rolland
Laura Gustafson
Eric Mintun
Junting Pan
Kalyan Vasudev Alwala
Nicolas Carion
Chao-Yuan Wu
Ross B. Girshick
Piotr Dollár
Christoph Feichtenhofer
    VLM
    MLLM

Papers citing "SAM 2: Segment Anything in Images and Videos"

39 / 189 papers shown
Next Best Sense: Guiding Vision and Touch with FisherRF for 3D Gaussian Splatting
Matthew Strong
Boshu Lei
Aiden Swann
Wen Jiang
Kostas Daniilidis
Monroe Kennedy III
3DGS
48
3
0
07 Oct 2024
ECHOPulse: ECG controlled echocardio-grams video generation
Yiwei Li
Sekeun Kim
Zihao Wu
Hanqi Jiang
Yi Pan
...
Sifan Song
Yucheng Shi
Tianming Liu
Quanzheng Li
Xiang Li
VGen
29
1
0
04 Oct 2024
DiffKillR: Killing and Recreating Diffeomorphisms for Cell Annotation in Dense Microscopy Images
Chen Liu
Danqi Liao
Alejandro Parada-Mayorga
Alejandro Ribeiro
Marcello DiStasio
Smita Krishnaswamy
50
4
0
04 Oct 2024
SinkSAM: A Monocular Depth-Guided SAM Framework for Automatic Sinkhole Segmentation
Osher Rafaeli
T. Svoray
Ariel Nahlieli
33
0
0
02 Oct 2024
Learning Physics From Video: Unsupervised Physical Parameter Estimation for Continuous Dynamical Systems
Alejandro Castañeda Garcia
Jan van Gemert
Daan Brinks
Nergis Tömen
38
0
0
02 Oct 2024
iTeach: Interactive Teaching for Robot Perception using Mixed Reality
Jishnu Jaykumar P
Cole Salvato
Vinaya Bomnale
Jikai Wang
Yu Xiang
49
0
0
01 Oct 2024
When SAM2 Meets Video Camouflaged Object Segmentation: A Comprehensive Evaluation and Adaptation
Yuli Zhou
Guolei Sun
Yawei Li
Guo-Sen Xie
Luca Benini
Ender Konukoglu
30
4
0
27 Sep 2024
MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling
Yifang Men
Yuan Yao
Miaomiao Cui
Liefeng Bo
DiffM
29
18
0
24 Sep 2024
Articulated Object Manipulation using Online Axis Estimation with SAM2-Based Tracking
Xi Wang
Tianxing Chen
Qiaojun Yu
Tianling Xu
Zanxin Chen
Yiting Fu
Cewu Lu
Ping Luo
51
4
0
24 Sep 2024
DROP: Dexterous Reorientation via Online Planning
Albert H. Li
Preston Culbertson
Vince Kurtz
Aaron D. Ames
54
7
0
22 Sep 2024
PointSAM: Pointly-Supervised Segment Anything Model for Remote Sensing Images
Nanqing Liu
Xun Xu
Yongyi Su
Haojie Zhang
Heng-Chao Li
VLM
43
14
0
20 Sep 2024
GroundingBooth: Grounding Text-to-Image Customization
Zhexiao Xiong
Wei Xiong
Jing Shi
He Zhang
Yizhi Song
Nathan Jacobs
DiffM
62
6
0
13 Sep 2024
LSVOS Challenge 3rd Place Report: SAM2 and Cutie based VOS
Xinyu Liu
Jing Zhang
Kexin Zhang
Xu Liu
Lingling Li
28
1
0
20 Aug 2024
Retrieval-augmented Few-shot Medical Image Segmentation with Foundation Models
Lin Zhao
Xiao Chen
Eric Z. Chen
Yikang Liu
Terrence Chen
Shanhui Sun
VLM
54
5
0
16 Aug 2024
Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning
Haofeng Liu
Erli Zhang
Junde Wu
Mingxuan Hong
Yueming Jin
MedIm
53
14
0
15 Aug 2024
Prompt-Based Segmentation at Multiple Resolutions and Lighting Conditions using Segment Anything Model 2
Osher Rafaeli
T. Svoray
Roni Blushtein-Livnon
Ariel Nahlieli
VLM
58
10
0
13 Aug 2024
Zero-Shot Surgical Tool Segmentation in Monocular Video Using Segment Anything Model 2
Ange Lou
Yamin Li
Ji-Eun Han
Wonjin Yang
Zhi-Qi Cheng
VLM
32
8
0
03 Aug 2024
SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge
Hao Ding
Tuxun Lu
Yuqian Zhang
Ruixing Liang
Hongchao Shu
...
Bo Wang
Marcos Fernández-Rodríguez
Estevao Lima
João L. Vilaça
Mathias Unberath
63
4
0
16 Jul 2024
Affordance-Guided Reinforcement Learning via Visual Prompting
Olivia Y. Lee
Annie Xie
Kuan Fang
Karl Pertsch
Chelsea Finn
OffRL
LM&Ro
76
7
0
14 Jul 2024
Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images
Kazi Sajeed Mehrab
M. Maruf
Arka Daw
Harish Babu Manogaran
Abhilash Neog
...
Paula Mabee
Wasila Dahdul
Anuj Karpatne
41
4
0
10 Jul 2024
Learning Spatial-Semantic Features for Robust Video Object Segmentation
Xin Li
Deshui Miao
Zhenyu He
Yansen Wang
Huchuan Lu
Ming Yang
VOS
59
4
0
10 Jul 2024
EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model
Yuxuan Zhang
Tianheng Cheng
Lianghui Zhu
Lei Liu
Heng Liu
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
VLM
61
25
0
28 Jun 2024
DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning
Zeyu Gao
Yao Mu
Jinye Qu
Mengkang Hu
Lingyue Guo
Ping Luo
Shanghang Zhang
Yanfeng Lu
54
10
0
14 Jun 2024
HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction
Jikai Wang
Qifan Zhang
Yu-Wei Chao
Bowen Wen
Xiaohu Guo
Yu Xiang
3DH
53
2
0
10 Jun 2024
CromSS: Cross-modal pre-training with noisy labels for remote sensing image segmentation
Chenying Liu
C. Albrecht
Yi Wang
Xiao Xiang Zhu
65
2
0
02 May 2024
Moving Object Segmentation: All You Need Is SAM (and Flow)
Junyu Xie
Charig Yang
Weidi Xie
Andrew Zisserman
43
11
0
18 Apr 2024
Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V
Peiyuan Zhi
Zhiyuan Zhang
Muzhi Han
Zeyu Zhang
Zhitian Li
Ziyuan Jiao
Siyuan Huang
LRM
LM&Ro
49
30
0
16 Apr 2024
Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models
Yutao Ouyang
Jinhan Li
Yunfei Li
Zhongyu Li
Chao Yu
K. Sreenath
Yi Wu
54
15
0
08 Apr 2024
Renovating Names in Open-Vocabulary Segmentation Benchmarks
Haiwen Huang
Songyou Peng
Dan Zhang
Andreas Geiger
VLM
37
3
0
14 Mar 2024
Promoting Segment Anything Model towards Highly Accurate Dichotomous Image Segmentation
Xianjie Liu
Keren Fu
Qijun Zhao
VLM
32
1
0
30 Dec 2023
Audio-Visual Instance Segmentation
Ruohao Guo
Yaru Chen
Yanyu Qi
Wenzhen Yue
Dantong Niu
...
Ji Shi
Qixun Wang
Peiliang Zhang
Buwen Liang
VLM
VOS
34
2
0
28 Oct 2023
EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations
Ahmad Darkhalil
Dandan Shan
Bin Zhu
Jian Ma
Amlan Kar
Richard E. L. Higgins
Sanja Fidler
David Fouhey
Dima Damen
VOS
50
98
0
26 Sep 2022
SOCRATES: A Stereo Camera Trap for Monitoring of Biodiversity
T. Haucke
H. Kühl
Volker Steinhage
39
11
0
19 Sep 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
311
7,457
0
11 Nov 2021
Carbon Emissions and Large Neural Network Training
David A. Patterson
Joseph E. Gonzalez
Quoc V. Le
Chen Liang
Lluís-Miquel Munguía
D. Rothchild
David R. So
Maud Texier
J. Dean
AI4CE
253
645
0
21 Apr 2021
TrackFormer: Multi-Object Tracking with Transformers
Tim Meinhardt
A. Kirillov
Laura Leal-Taixe
Christoph Feichtenhofer
VOT
232
743
0
07 Jan 2021
Learning Fast and Robust Target Models for Video Object Segmentation
Andreas Robinson
Felix Järemo Lawin
Martin Danelljan
Fahad Shahbaz Khan
M. Felsberg
VOS
51
140
0
27 Feb 2020
Feature Pyramid Networks for Object Detection
Tsung-Yi Lin
Piotr Dollár
Ross B. Girshick
Kaiming He
Bharath Hariharan
Serge J. Belongie
ObjD
186
21,819
0
09 Dec 2016
Semantic Understanding of Scenes through the ADE20K Dataset
Bolei Zhou
Hang Zhao
Xavier Puig
Tete Xiao
Sanja Fidler
Adela Barriuso
Antonio Torralba
SSeg
253
1,829
0
18 Aug 2016