End-to-End Referring Video Object Segmentation with Multimodal Transformers

29 November 2021
Adam Botach
Evgenii Zheltonozhskii
Chaim Baskin
    VOS
arXiv: 2111.14821

Papers citing "End-to-End Referring Video Object Segmentation with Multimodal Transformers"

38 / 88 papers shown
Title
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following
Ziyu Guo
Renrui Zhang
Xiangyang Zhu
Yiwen Tang
Xianzheng Ma
...
Ke Chen
Peng Gao
Xianzhi Li
Hongsheng Li
Pheng-Ann Heng
MLLM
32
125
0
01 Sep 2023
Video-Instrument Synergistic Network for Referring Video Instrument Segmentation in Robotic Surgery
Hongqiu Wang
Lei Zhu
Guang Yang
Yi-Ting Guo
Shenmin Zhang
Bo Xu
Yueming Jin
VOS
28
0
0
18 Aug 2023
MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
Henghui Ding
Chang Liu
Shuting He
Xudong Jiang
Chen Change Loy
VOS
44
101
0
16 Aug 2023
Visual and Textual Prior Guided Mask Assemble for Few-Shot Segmentation and Beyond
Chen Shuai
Meng Fanman
Runtong Zhang
Heqian Qiu
Hongliang Li
Wu Qingbo
Xu Linfeng
VLM
30
12
0
15 Aug 2023
ICAFusion: Iterative Cross-Attention Guided Feature Fusion for Multispectral Object Detection
Jifeng Shen
Yifei Chen
Yue Liu
Xin Zuo
Heng Fan
Wankou Yang
ViT
29
89
0
15 Aug 2023
Learning Referring Video Object Segmentation from Weak Annotation
Wangbo Zhao
Ke Nan
Songyang Zhang
Kai-xiang Chen
Dahua Lin
Yang You
VOS
30
2
0
04 Aug 2023
Spectrum-guided Multi-granularity Referring Video Object Segmentation
Bo Miao
Bennamoun
Yongsheng Gao
Ajmal Saeed Mian
VOS
42
34
0
25 Jul 2023
OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation
Dongming Wu
Tiancai Wang
Yuang Zhang
Xiangyu Zhang
Jianbing Shen
VOS
35
33
0
18 Jul 2023
RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation
Yonglin Li
Jing Zhang
Xiao Teng
Long Lan
VOS
VLM
23
17
0
03 Jul 2023
Bidirectional Correlation-Driven Inter-Frame Interaction Transformer for Referring Video Object Segmentation
Meng Lan
Fu Rong
Zuchao Li
Wei Yu
L. Zhang
VOS
31
5
0
02 Jul 2023
LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation
Linfeng Yuan
Miaojing Shi
Zijie Yue
Qijun Chen
VOS
29
8
0
14 Jun 2023
MarineVRS: Marine Video Retrieval System with Explainability via Semantic Understanding
Tan-Sang Ha
Hai Nguyen-Truong
Tuan-Anh Vu
Sai-Kit Yeung
31
0
0
07 Jun 2023
LRVS-Fashion: Extending Visual Search with Referring Instructions
Simon Lepage
Jérémie Mary
David Picard
25
1
0
05 Jun 2023
SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation
Zhuoyan Luo
Yicheng Xiao
Yong-Jin Liu
Shuyan Li
Yitong Wang
Yansong Tang
Xiu Li
Yujiu Yang
VOS
28
32
0
26 May 2023
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation
Shilin Yan
Renrui Zhang
Ziyu Guo
Wenchao Chen
Wei Zhang
Hongyang Li
Yu Qiao
Hao Dong
Zhongjiang He
Peng Gao
VOS
22
30
0
25 May 2023
Annotation-free Audio-Visual Segmentation
Jinxian Liu
Yu Wang
Chen Ju
Chaofan Ma
Ya-Qin Zhang
Weidi Xie
VOS
VLM
36
28
0
18 May 2023
AdaptiveClick: Clicks-aware Transformer with Adaptive Focal Loss for Interactive Image Segmentation
Jiacheng Lin
Jiajun Chen
Kailun Yang
Alina Roitberg
Siyu Li
Zhiyong Li
Shutao Li
34
16
0
07 May 2023
Transformer-Based Visual Segmentation: A Survey
Xiangtai Li
Henghui Ding
Haobo Yuan
Wenwei Zhang
Jiangmiao Pang
Guangliang Cheng
Kai-xiang Chen
Ziwei Liu
Chen Change Loy
ViT
MedIm
42
132
0
19 Apr 2023
Sketch-based Video Object Localization
Sangmin Woo
So-Yeong Jeon
Jinyoung Park
Minji Son
Sumin Lee
Changick Kim
16
0
0
02 Apr 2023
Universal Instance Perception as Object Discovery and Retrieval
B. Yan
Yi-Xin Jiang
Jiannan Wu
D. Wang
Ping Luo
Zehuan Yuan
Huchuan Lu
VOS
VLM
LRM
35
161
0
12 Mar 2023
Multimodal Prompting with Missing Modalities for Visual Recognition
Yi-Lun Lee
Yi-Hsuan Tsai
Wei-Chen Chiu
Chen-Yu Lee
VPVLM
27
94
0
06 Mar 2023
Referring Multi-Object Tracking
Dongming Wu
Wencheng Han
Tiancai Wang
Xingping Dong
Xiangyu Zhang
Jianbing Shen
34
71
0
06 Mar 2023
MOSE: A New Dataset for Video Object Segmentation in Complex Scenes
Henghui Ding
Chang Liu
Shuting He
Xudong Jiang
Philip H. S. Torr
S. Bai
VOS
27
132
0
03 Feb 2023
Audio-Visual Segmentation with Semantics
Jinxing Zhou
Xuyang Shen
Jianyuan Wang
Jiayi Zhang
Weixuan Sun
...
Stan Birchfield
Dan Guo
Lingpeng Kong
Meng Wang
Yiran Zhong
VOS
46
37
0
30 Jan 2023
Hybrid Transformer Based Feature Fusion for Self-Supervised Monocular Depth Estimation
S. Tomar
Maitreya Suin
A. N. Rajagopalan
ViT
MDE
21
4
0
20 Nov 2022
Visual Semantic Segmentation Based on Few/Zero-Shot Learning: An Overview
Wenqi Ren
Yang Tang
Qiyu Sun
Chaoqiang Zhao
Qing‐Long Han
VLM
18
41
0
13 Nov 2022
Monocular Dynamic View Synthesis: A Reality Check
Han Gao
Ruilong Li
Shubham Tulsiani
Bryan C. Russell
Angjoo Kanazawa
29
111
0
24 Oct 2022
Towards Robust Referring Image Segmentation
Jianzong Wu
Xiangtai Li
Xia Li
Henghui Ding
Yu Tong
Dacheng Tao
3DV
34
40
0
20 Sep 2022
Fusion of Satellite Images and Weather Data with Transformer Networks for Downy Mildew Disease Detection
William Maillet
Maryam Ouhami
A. Hafiane
ViT
MedIm
19
6
0
06 Sep 2022
Multi-Attention Network for Compressed Video Referring Object Segmentation
Weidong Chen
Dexiang Hong
Yuankai Qi
Zhenjun Han
Shuhui Wang
Laiyun Qing
Qingming Huang
Guorong Li
VOS
20
35
0
26 Jul 2022
Online Video Instance Segmentation via Robust Context Fusion
Xiang Li
Jinglu Wang
Xiaohao Xu
Bhiksha Raj
Yan Lu
35
5
0
12 Jul 2022
CRFormer: A Cross-Region Transformer for Shadow Removal
J. Wan
Hui Yin
Zhenyao Wu
Xinyi Wu
Zhihao Liu
Song Wang
ViT
42
15
0
04 Jul 2022
Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus
Xiang Li
Jinglu Wang
Xiaohao Xu
Xiao Li
Bhiksha Raj
Yan Lu
VOS
53
29
0
04 Jul 2022
The Second Place Solution for The 4th Large-scale Video Object Segmentation Challenge--Track 3: Referring Video Object Segmentation
Leilei Cao
Zhuang Li
Bo Yan
Feng Zhang
Fengliang Qi
Yucheng Hu
Hongbin Wang
VOS
11
1
0
24 Jun 2022
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
60
527
0
13 Jun 2022
Local-Global Context Aware Transformer for Language-Guided Video Segmentation
Chen Liang
Wenguan Wang
Tianfei Zhou
Jiaxu Miao
Yawei Luo
Yi Yang
VOS
29
74
0
18 Mar 2022
Language as Queries for Referring Video Object Segmentation
Jiannan Wu
Yi-Xin Jiang
Pei Sun
Zehuan Yuan
Ping Luo
23
141
0
03 Jan 2022
Conditional Convolutions for Instance Segmentation
Zhi Tian
Chunhua Shen
Hao Chen
ISeg
182
597
0
12 Mar 2020