YouTube-VOS: Sequence-to-Sequence Video Object Segmentation

3 September 2018

Yuchen Fan

Thomas Huang

Papers citing "YouTube-VOS: Sequence-to-Sequence Video Object Segmentation"

50 / 238 papers shown

Title
StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams Zike Wu Qi Yan Xuanyu Yi Lele Wang Renjie Liao 3DGS 28 0 0 10 Jun 2025
StableMTL: Repurposing Latent Diffusion Models for Multi-Task Learning from Partially Annotated Synthetic Datasets Anh-Quan Cao Ivan Lopes Raoul de Charette 30 0 0 09 Jun 2025
SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost Haiyang Mei Pengyu Zhang Mike Zheng Shou VLM 51 0 0 02 Jun 2025
Reasoning Segmentation for Images and Videos: A Survey Yiqing Shen Chenjia Li Fei Xiong Jeong-O Jeong Tianpeng Wang Michael Latman Mathias Unberath VOS 252 0 0 24 May 2025
Enhancing Self-Supervised Fine-Grained Video Object Tracking with Dynamic Memory Prediction Zihan Zhou Changrui Dai Aibo Song Xiaolin Fang VOS 94 0 0 30 Apr 2025
RGB-D Video Object Segmentation via Enhanced Multi-store Feature Memory Boyue Xu Ruichao Hou Tongwei Ren Gangshan Wu VOS 157 1 0 23 Apr 2025
PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in the Wild Henghui Ding Chang Liu Nikhila Ravi Shuting He Y. Wei ... Haobo Yuan Xuelong Li Tao Zhang Lu Qi Ming-Hsuan Yang 92 1 0 15 Apr 2025
Saliency-Motion Guided Trunk-Collateral Network for Unsupervised Video Object Segmentation Xiangyu Zheng Wanyun Li Songcheng He Jianping Fan Xiaoqiang Li Wei Zhang VOS 93 0 0 08 Apr 2025
Studying Image Diffusion Features for Zero-Shot Video Object Segmentation Thanos Delatolas Vicky S. Kalogeiton Dim P. Papadopoulos DiffM VOS 130 2 0 07 Apr 2025
LV-MAE: Learning Long Video Representations through Masked-Embedding Autoencoders Ilan Naiman Emanuel Ben-Baruch Oron Anschel Alon Shoshan Igor Kviatkovsky Manoj Aggarwal Gérard Medioni 89 0 0 04 Apr 2025
Zero-Shot 4D Lidar Panoptic Segmentation Yushan Zhang Aljosa Osep Laura Leal-Taixé Tim Meinhardt 3DPC 98 1 0 01 Apr 2025
AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos Felix Wimbauer Weirong Chen Dominik Muhle Christian Rupprecht Daniel Cremers VGen 166 0 0 30 Mar 2025
Scoring, Remember, and Reference: Catching Camouflaged Objects in Videos Yuang Feng Shuyong Gao Fuzhen Yan Yicheng Song Lingyi Hong J. Hu Wenqiang Zhang VOS 85 0 0 21 Mar 2025
SAM2 for Image and Video Segmentation: A Comprehensive Survey Zhang Jiaxing Tang Hao VLM 112 0 0 17 Mar 2025
Leveraging Motion Information for Better Self-Supervised Video Correspondence Learning Zihan Zhou Changrui Dai Aibo Song Xiaolin Fang VOS 154 0 0 15 Mar 2025
SPOC: Spatially-Progressing Object State Change Segmentation in Video Priyanka Mandikal Tushar Nagarajan Alex Stoken Zihui Xue Kristen Grauman 79 0 0 15 Mar 2025
VideoPainter: Any-length Video Inpainting and Editing with Plug-and-Play Context Control Hao Wang Zhaoyang Zhang Xuan Ju Mingdeng Cao Liangbin Xie Ying Shan Qiang Xu VGen DiffM 104 1 0 07 Mar 2025
Object-Aware Video Matting with Cross-Frame Guidance Han Zhang Dongyue Wu Yuanjie Shao Nong Sang Changxin Gao VOS 114 0 0 03 Mar 2025
SMITE: Segment Me In TimE Amirhossein Alimohammadi Sauradip Nag Saeid Asgari Taghanaki Andrea Tagliasacchi Ghassan Hamarneh Ali Mahdavi-Amiri VLM VOS 539 3 0 20 Feb 2025
VipDiff: Towards Coherent and Diverse Video Inpainting via Training-free Denoising Diffusion Models Chaohao Xie Kai Han Kwan-Yee K. Wong VGen DiffM 470 0 0 21 Jan 2025
Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation Yunzhi Zhuge Hongyu Gu Lu Zhang Jinqing Qi Huchuan Lu VOS 244 3 0 14 Jan 2025
EdgeTAM: On-Device Track Anything Model Chong Zhou Chenchen Zhu Yunyang Xiong Saksham Suri Fanyi Xiao ... Raghuraman Krishnamoorthi Bo Dai Chen Change Loy Vikas Chandra Bilge Soran VLM 108 1 0 13 Jan 2025
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level Andong Deng Tongjia Chen Shoubin Yu Taojiannan Yang Lincoln Spencer Yapeng Tian Ajmal Mian Joey Tianyi Zhou Chen Chen LRM 111 3 0 15 Nov 2024
ReferEverything: Towards Segmenting Everything We Can Speak of in Videos Anurag Bagchi Zhipeng Bao Yu-Xiong Wang P. Tokmakov Martial Hebert VOS 73 2 0 30 Oct 2024
BIFRÖST: 3D-Aware Image compositing with Language Instructions Lingxiao Li Kaixiong Gong Weihong Li Xili Dai Tao Chen Xiaojun Yuan Xiangyu Yue 103 2 0 24 Oct 2024
Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond Shanshan Han 173 1 0 09 Oct 2024
Memory Matching is not Enough: Jointly Improving Memory Matching and Decoding for Video Object Segmentation Jintu Zheng Yun Liang Yuqing Zhang Wanchao Su VOS 76 0 0 22 Sep 2024
LSVOS Challenge Report: Large-scale Complex and Long Video Object Segmentation Henghui Ding Lingyi Hong Chang Liu Ning Xu L. Yang ... Bin Cao Yisi Zhang Hanyi Wang Xingjian He Jing Liu VOS 94 2 0 09 Sep 2024
Thinking Outside the BBox: Unconstrained Generative Object Compositing Gemma Canet Tarrés Zhe Lin Zhifei Zhang Jianming Zhang Yizhi Song Dan Ruta Andrew Gilbert John Collomosse Soo Ye Kim DiffM 91 10 0 06 Sep 2024
Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation Shaofei Huang Rui Ling Hongyu Li Tianrui Hui Zongheng Tang Xiaoming Wei Jizhong Han Si Liu VOS 90 8 0 28 Aug 2024
CHOTA: A Higher Order Accuracy Metric for Cell Tracking Timo Kaiser Vladimir Ulman Bodo Rosenhahn 84 3 0 21 Aug 2024
Video Diffusion Models are Strong Video Inpainter Minhyeok Lee Suhwan Cho Chajin Shin Jungho Lee Sunghun Yang Sangyoun Lee VGen DiffM 76 9 0 21 Aug 2024
Contextual Cross-Modal Attention for Audio-Visual Deepfake Detection and Localization Vinaya Sree Katamneni A. Rattani 76 4 0 02 Aug 2024
SAM 2: Segment Anything in Images and Videos Nikhila Ravi Valentin Gabeur Yuan-Ting Hu Ronghang Hu Chaitanya K. Ryali ... Nicolas Carion Chao-Yuan Wu Ross B. Girshick Piotr Dollár Christoph Feichtenhofer VLM MLLM 172 949 0 01 Aug 2024
Segment Anything for Videos: A Systematic Survey Chunhui Zhang Yawen Cui Weilin Lin Guanjie Huang Yan Rong Li Liu Shiguang Shan VLM 86 8 0 31 Jul 2024
ViLLa: Video Reasoning Segmentation with Large Language Model Rongkun Zheng Lu Qi Xi Chen Yi Wang Kun Wang Yu Qiao Hengshuang Zhao VOS LRM 178 5 0 18 Jul 2024
Diff-Tracker: Text-to-Image Diffusion Models are Unsupervised Trackers Zhengbo Zhang Li Xu Duo Peng Hossein Rahmani Jun Liu 117 10 0 11 Jul 2024
Fusion of Short-term and Long-term Attention for Video Mirror Detection Mingchen Xu Jing Wu Yukun Lai Ze Ji 68 1 0 10 Jul 2024
ActionVOS: Actions as Prompts for Video Object Segmentation Liangyang Ouyang Ruicong Liu Yifei Huang Ryosuke Furuta Yoichi Sato VOS 79 2 0 10 Jul 2024
Dense Monocular Motion Segmentation Using Optical Flow and Pseudo Depth Map: A Zero-Shot Approach Yuxiang Huang Yuhao Chen John S. Zelek MDE 76 2 0 27 Jun 2024
Video Inpainting Localization with Contrastive Learning Zijie Lou Gang Cao Man Lin 100 1 0 25 Jun 2024
Trusted Video Inpainting Localization via Deep Attentive Noise Learning Zijie Lou Gang Cao Man Lin AAML 87 3 0 19 Jun 2024
RMem: Restricted Memory Banks Improve Video Object Segmentation Junbao Zhou Ziqi Pang Yu-Xiong Wang VOS 135 7 0 12 Jun 2024
A Labelled Dataset for Sentiment Analysis of Videos on YouTube, TikTok, and Other Sources about the 2024 Outbreak of Measles Nirmalya Thakur Vanessa Su Mingchen Shao Kesha A. Patel Hongseok Jeong Victoria Knieling Andrew Bian 69 1 0 11 Jun 2024
Temporally Consistent Object Editing in Videos using Extended Attention AmirHossein Zamani Amir G. Aghdam Tiberiu Popa Eugene Belilovsky DiffM 107 1 0 01 Jun 2024
Beyond Traditional Single Object Tracking: A Survey Omar Abdelaziz Mohamed Shehata Mohamed Mohamed 123 1 0 16 May 2024
Global Motion Understanding in Large-Scale Video Object Segmentation Volodymyr Fedynyak Yaroslav Romanus Oles Dobosevych Igor Babin Roman Riazantsev VOS 112 1 0 11 May 2024
DeVOS: Flow-Guided Deformable Transformer for Video Object Segmentation Volodymyr Fedynyak Yaroslav Romanus Bohdan Hlovatskyi Bohdan Sydor Oles Dobosevych Igor Babin Roman Riazantsev VOS 83 3 0 11 May 2024
DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos Wen-Hsuan Chu Lei Ke Katerina Fragkiadaki 3DGS VGen 103 33 0 03 May 2024
Zero-Shot Monocular Motion Segmentation in the Wild by Combining Deep Learning with Geometric Motion Model Fusion Yuxiang Huang Yuhao Chen John S. Zelek 70 1 0 02 May 2024