2111.14821
End-to-End Referring Video Object Segmentation with Multimodal Transformers
29 November 2021
Adam Botach
Evgenii Zheltonozhskii
Chaim Baskin
VOS
Papers citing
"End-to-End Referring Video Object Segmentation with Multimodal Transformers"
50 / 88 papers shown
Title
RefComp: A Reference-guided Unified Framework for Unpaired Point Cloud Completion
Yixuan Yang
Jinyu Yang
Zixiang Zhao
Victor Sanchez
Feng Zheng
34
0
0
18 Apr 2025
Few-Shot Referring Video Single- and Multi-Object Segmentation via Cross-Modal Affinity with Instance Sequence Matching
Heng Liu
Guanghui Li
Mingqi Gao
Xiantong Zhen
Feng Zheng
Yixuan Wang
VOS
50
0
0
18 Apr 2025
The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation
Hao Fang
Runmin Cong
Xiankai Lu
Z. Chen
Wei Zhang
29
0
0
07 Apr 2025
ReferDINO-Plus: 2nd Solution for 4th PVUW MeViS Challenge at CVPR 2025
Tianming Liang
Haichao Jiang
Wei-Shi Zheng
Jian-Fang Hu
44
0
0
30 Mar 2025
VRMDiff: Text-Guided Video Referring Matting Generation of Diffusion
Lehan Yang
Jincen Song
Tianlong Wang
Daiqing Qi
Weili Shi
Yuheng Liu
Sheng Li
DiffM
VOS
VGen
74
0
0
11 Mar 2025
Just Functioning as a Hook for Two-Stage Referring Multi-Object Tracking
Weize Li
Yunhao Du
Qixiang Yin
Zhicheng Zhao
Fei Su
Daqi Liu
64
0
0
10 Mar 2025
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation
Suhwan Cho
Seunghoon Lee
Minhyeok Lee
Jungho Lee
Sangyoun Lee
VOS
77
0
0
05 Mar 2025
MEX: Memory-efficient Approach to Referring Multi-Object Tracking
Huu-Thien Tran
Phuoc-Sang Pham
Thai-Son Tran
Khoa Luu
VOT
81
1
0
20 Feb 2025
MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation
Fu Rong
Meng Lan
Q. Zhang
L. Zhang
VOS
VGen
73
1
0
23 Jan 2025
The Devil is in Temporal Token: High Quality Video Reasoning Segmentation
Sitong Gong
Yunzhi Zhuge
Lu Zhang
Z. Yang
Pingping Zhang
Huchuan Lu
41
0
0
15 Jan 2025
InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models
Cong Wei
Yujie Zhong
Haoxian Tan
Yingsen Zeng
Y. Liu
Zheng Zhao
Yujiu Yang
MLLM
VLM
VOS
101
2
0
18 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
S. Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
101
15
0
03 Dec 2024
Referring Video Object Segmentation via Language-aligned Track Selection
Seongchan Kim
Woojeong Jin
Sangbeom Lim
Heeji Yoon
Hyunwook Choi
Seungryong Kim
VOS
94
0
0
02 Dec 2024
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation
Claudia Cuttano
Gabriele Trivigno
Gabriele Rosi
Carlo Masone
Giuseppe Averta
VOS
106
2
0
26 Nov 2024
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos
Zechen Bai
Tong He
Haiyang Mei
Pichao Wang
Ziteng Gao
Joya Chen
Lei Liu
Zheng Zhang
Mike Zheng Shou
VLM
VOS
MLLM
45
17
0
29 Sep 2024
Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation
Shaofei Huang
Rui Ling
Hongyu Li
Tianrui Hui
Zongheng Tang
Xiaoming Wei
Jizhong Han
Si Liu
VOS
37
4
0
28 Aug 2024
UNINEXT-Cutie: The 1st Solution for LSVOS Challenge RVOS Track
Hao Fang
Feiyu Pan
Xiankai Lu
Wei Zhang
Runmin Cong
35
3
0
19 Aug 2024
ViLLa: Video Reasoning Segmentation with Large Language Model
Rongkun Zheng
Lu Qi
Xi Chen
Yi Wang
Kun Wang
Yu Qiao
Hengshuang Zhao
VOS
LRM
72
2
0
18 Jul 2024
VISA: Reasoning Video Object Segmentation via Large Language Models
Cilin Yan
Haochen Wang
Shilin Yan
Xiaolong Jiang
Yao Hu
Guoliang Kang
Weidi Xie
E. Gavves
LRM
VLM
VOS
45
28
0
16 Jul 2024
Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes
Yaoting Wang
Peiwen Sun
Dongzhan Zhou
Guangyao Li
Honggang Zhang
Di Hu
VOS
40
5
0
15 Jul 2024
ActionVOS: Actions as Prompts for Video Object Segmentation
Liangyang Ouyang
Ruicong Liu
Yifei Huang
Ryosuke Furuta
Yoichi Sato
VOS
42
2
0
10 Jul 2024
SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation
Sayan Nag
Koustava Goswami
Srikrishna Karanam
44
2
0
02 Jul 2024
OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding
Tao Zhang
Xiangtai Li
Hao Fei
Haobo Yuan
Shengqiong Wu
Shunping Ji
Chen Change Loy
Shuicheng Yan
LRM
MLLM
VLM
49
48
0
27 Jun 2024
2nd Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation
Bin Cao
Yisi Zhang
Xuanxu Lin
Xingjian He
Bo-Lu Zhao
Jing Liu
61
2
0
20 Jun 2024
GroPrompt: Efficient Grounded Prompting and Adaptation for Referring Video Object Segmentation
Ci-Siang Lin
I-Jieh Liu
Min-Hung Chen
Chien-Yi Wang
Sifei Liu
Yu-Chiang Frank Wang
VOS
53
0
0
18 Jun 2024
1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation
Mingqi Gao
Jingnan Luo
Jinyu Yang
Jungong Han
Feng Zheng
42
2
0
11 Jun 2024
3rd Place Solution for MeViS Track in CVPR 2024 PVUW workshop: Motion Expression guided Video Segmentation
Feiyu Pan
Hao Fang
Xiankai Lu
34
3
0
07 Jun 2024
Deep video representation learning: a survey
Elham Ravanbakhsh
Yongqing Liang
J. Ramanujam
Xin Li
49
3
0
10 May 2024
All in One Framework for Multimodal Re-identification in the Wild
He Li
Mang Ye
Ming Zhang
Bo Du
35
9
0
08 May 2024
LVOS: A Benchmark for Large-scale Long-term Video Object Segmentation
Lingyi Hong
Zhongying Liu
Wenchao Chen
Chenzhi Tan
Yuang Feng
...
Jinglun Li
Zhaoyu Chen
Shuyong Gao
Wei Zhang
Wenqiang Zhang
VLM
VOS
42
12
0
30 Apr 2024
SCOUT+: Towards Practical Task-Driven Drivers' Gaze Prediction
Iuliia Kotseruba
John K. Tsotsos
33
1
0
12 Apr 2024
Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation
Shuting He
Henghui Ding
VOS
35
23
0
04 Apr 2024
Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation
Zixin Zhu
Xuelu Feng
Dongdong Chen
Junsong Yuan
Chunming Qiao
Gang Hua
DiffM
42
7
0
18 Mar 2024
R²-Bench: Benchmarking the Robustness of Referring Perception Models under Perturbations
Xiang Li
Kai Qiu
Jinglu Wang
Xiaohao Xu
Rita Singh
Kashu Yamazaki
Hao Chen
Xiaonan Huang
Bhiksha Raj
VOS
40
1
0
07 Mar 2024
Dr²Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning
Chen Zhao
Shuming Liu
K. Mangalam
Guocheng Qian
Fatimah Zohra
Abdulmohsen Alghannam
Jitendra Malik
Guohao Li
51
3
0
08 Jan 2024
1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation
Zhuoyan Luo
Yicheng Xiao
Yong Liu
Yitong Wang
Yansong Tang
Xiu Li
Yujiu Yang
VOS
33
2
0
01 Jan 2024
Tracking with Human-Intent Reasoning
Jiawen Zhu
Zhi-Qi Cheng
Jun-Yan He
Chenyang Li
Bin Luo
Huchuan Lu
Yifeng Geng
Xuansong Xie
LRM
VOS
37
7
0
29 Dec 2023
UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces
Jiannan Wu
Yi-Xin Jiang
Bin Yan
Huchuan Lu
Zehuan Yuan
Ping Luo
VOS
37
17
0
25 Dec 2023
iKUN: Speak to Trackers without Retraining
Yunhao Du
Cheng Lei
Zhicheng Zhao
Fei Su
VOT
29
12
0
25 Dec 2023
Universal Segmentation at Arbitrary Granularity with Language Instruction
Yong Liu
Cairong Zhang
Yitong Wang
Jiahao Wang
Yujiu Yang
Yansong Tang
VLM
VOS
55
15
0
04 Dec 2023
Sketch-based Video Object Segmentation: Benchmark and Analysis
Ruolin Yang
Da Li
Conghui Hu
Timothy M. Hospedales
Honggang Zhang
Yi-Zhe Song
VOS
35
1
0
13 Nov 2023
ISAR: A Benchmark for Single- and Few-Shot Object Instance Segmentation and Re-Identification
Nicolas Gorlo
Kenneth Blomqvist
Francesco Milano
Roland Siegwart
VLM
26
2
0
05 Nov 2023
Cross-modal Cognitive Consensus guided Audio-Visual Segmentation
Zhaofeng Shi
Qingbo Wu
Fanman Meng
Linfeng Xu
Hongliang Li
VOS
30
3
0
10 Oct 2023
CoralVOS: Dataset and Benchmark for Coral Video Segmentation
Ziqiang Zheng
Yaofeng Xie
Haixin Liang
Zhibin Yu
Sai-Kit Yeung
VOS
36
7
0
03 Oct 2023
QDFormer: Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition
Xiang Li
Jinglu Wang
Xiaohao Xu
Xiulian Peng
Rita Singh
Yan Lu
Bhiksha Raj
VOS
39
10
0
29 Sep 2023
Adversarial Attacks on Video Object Segmentation with Hard Region Discovery
P. Li
Yu Zhang
L. Yuan
Jian Zhao
Xianghua Xu
Xiaoqing Zhang
AAML
VOS
40
11
0
25 Sep 2023
Fully Transformer-Equipped Architecture for End-to-End Referring Video Object Segmentation
P. Li
Yu Zhang
L. Yuan
Xianghua Xu
VOS
26
6
0
21 Sep 2023
Discovering Sounding Objects by Audio Queries for Audio Visual Segmentation
Shaofei Huang
Han Li
Yuqing Wang
Hongji Zhu
Jiao Dai
Jizhong Han
Wenge Rong
Si Liu
VOS
25
16
0
18 Sep 2023
Temporal Collection and Distribution for Referring Video Object Segmentation
Jiajin Tang
Ge Zheng
Sibei Yang
VOS
36
14
0
07 Sep 2023
Learning Cross-Modal Affinity for Referring Video Object Segmentation Targeting Limited Samples
Guanghui Li
Mingqi Gao
Heng Liu
Xiantong Zhen
Feng Zheng
VOS
28
3
0
05 Sep 2023