ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2003.08813
  4. Cited By
Multi-task Collaborative Network for Joint Referring Expression
  Comprehension and Segmentation

Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation

19 March 2020
Gen Luo
Yiyi Zhou
Xiaoshuai Sun
Liujuan Cao
Chenglin Wu
Cheng Deng
Rongrong Ji
    ObjD
ArXivPDFHTML

Papers citing "Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation"

50 / 51 papers shown
Title
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Ziqi Pang
Xin Xu
Yu-Xiong Wang
DiffM
65
0
0
15 Apr 2025
Referring to Any Person
Referring to Any Person
Qing Jiang
Lin Wu
Zhaoyang Zeng
Tianhe Ren
Yuda Xiong
Yihao Chen
Qin Liu
Lei Zhang
145
0
0
11 Mar 2025
Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts
Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts
Shiu-hong Kao
Yu-Wing Tai
Chi-Keung Tang
LRM
MLLM
54
0
0
10 Mar 2025
Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints
Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints
Ming Dai
Jian Li
Jiedong Zhuang
Xian Zhang
Wankou Yang
ObjD
42
1
0
12 Jan 2025
NanoMVG: USV-Centric Low-Power Multi-Task Visual Grounding based on Prompt-Guided Camera and 4D mmWave Radar
NanoMVG: USV-Centric Low-Power Multi-Task Visual Grounding based on Prompt-Guided Camera and 4D mmWave Radar
Runwei Guan
Jianan Liu
Liye Jia
Haocheng Zhao
Shanliang Yao
Xiaohui Zhu
Ka Lok Man
Eng Gee Lim
Jeremy S. Smith
Yutao Yue
49
5
0
30 Aug 2024
Learning Visual Grounding from Generative Vision and Language Model
Learning Visual Grounding from Generative Vision and Language Model
Shijie Wang
Dahun Kim
A. Taalimi
Chen Sun
Weicheng Kuo
ObjD
34
5
0
18 Jul 2024
Object Segmentation from Open-Vocabulary Manipulation Instructions Based
  on Optimal Transport Polygon Matching with Multimodal Foundation Models
Object Segmentation from Open-Vocabulary Manipulation Instructions Based on Optimal Transport Polygon Matching with Multimodal Foundation Models
Takayuki Nishimura
Katsuyuki Kuyo
Motonari Kambara
Komei Sugiura
DiffM
30
0
0
01 Jul 2024
F-LMM: Grounding Frozen Large Multimodal Models
F-LMM: Grounding Frozen Large Multimodal Models
Size Wu
Sheng Jin
Wenwei Zhang
Lumin Xu
Wentao Liu
Wei Li
Chen Change Loy
MLLM
73
12
0
09 Jun 2024
Rethinking Referring Object Removal
Rethinking Referring Object Removal
Xiangtian Xue
Jiasong Wu
Youyong Kong
L. Senhadji
Huazhong Shu
DiffM
32
0
0
14 Mar 2024
EchoTrack: Auditory Referring Multi-Object Tracking for Autonomous
  Driving
EchoTrack: Auditory Referring Multi-Object Tracking for Autonomous Driving
Jiacheng Lin
Jiajun Chen
Kunyu Peng
Xuan He
Zhiyong Li
Rainer Stiefelhagen
Kailun Yang
48
6
0
28 Feb 2024
Collaborative Position Reasoning Network for Referring Image
  Segmentation
Collaborative Position Reasoning Network for Referring Image Segmentation
Jianjian Cao
Beiya Dai
Yulin Li
Xiameng Qin
Jingdong Wang
25
0
0
22 Jan 2024
Jack of All Tasks, Master of Many: Designing General-purpose
  Coarse-to-Fine Vision-Language Model
Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model
Shraman Pramanick
Guangxing Han
Rui Hou
Sayan Nag
Ser-Nam Lim
Nicolas Ballas
Qifan Wang
Rama Chellappa
Amjad Almahairi
VLM
MLLM
42
29
0
19 Dec 2023
See, Say, and Segment: Teaching LMMs to Overcome False Premises
See, Say, and Segment: Teaching LMMs to Overcome False Premises
Tsung-Han Wu
Giscard Biamby
David M. Chan
Lisa Dunlap
Ritwik Gupta
Xudong Wang
Joseph E. Gonzalez
Trevor Darrell
VLM
MLLM
32
18
0
13 Dec 2023
Temporal Collection and Distribution for Referring Video Object
  Segmentation
Temporal Collection and Distribution for Referring Video Object Segmentation
Jiajin Tang
Ge Zheng
Sibei Yang
VOS
36
14
0
07 Sep 2023
A Unified Framework for 3D Point Cloud Visual Grounding
A Unified Framework for 3D Point Cloud Visual Grounding
Haojia Lin
Yongdong Luo
Xiawu Zheng
Lijiang Li
Fei Chao
Taisong Jin
Donghao Luo
Yan Wang
Liujuan Cao
Rongrong Ji
21
2
0
23 Aug 2023
RefSAM: Efficiently Adapting Segmenting Anything Model for Referring
  Video Object Segmentation
RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation
Yonglin Li
Jing Zhang
Xiao Teng
Long Lan
VOS
VLM
23
17
0
03 Jul 2023
Extending CLIP's Image-Text Alignment to Referring Image Segmentation
Extending CLIP's Image-Text Alignment to Referring Image Segmentation
Seoyeon Kim
Minguk Kang
Dongwon Kim
Jaesik Park
Suha Kwak
VLM
25
10
0
14 Jun 2023
Referring Camouflaged Object Detection
Referring Camouflaged Object Detection
Xuying Zhang
Bo Yin
Zheng Lin
Qibin Hou
Deng-Ping Fan
Ming-Ming Cheng
38
17
0
13 Jun 2023
LRVS-Fashion: Extending Visual Search with Referring Instructions
LRVS-Fashion: Extending Visual Search with Referring Instructions
Simon Lepage
Jérémie Mary
David Picard
23
1
0
05 Jun 2023
Referred by Multi-Modality: A Unified Temporal Transformer for Video
  Object Segmentation
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation
Shilin Yan
Renrui Zhang
Ziyu Guo
Wenchao Chen
Wei Zhang
Hongyang Li
Yu Qiao
Hao Dong
Zhongjiang He
Peng Gao
VOS
20
30
0
25 May 2023
Multi-Modal Mutual Attention and Iterative Interaction for Referring
  Image Segmentation
Multi-Modal Mutual Attention and Iterative Interaction for Referring Image Segmentation
Chang Liu
Henghui Ding
Yulun Zhang
Xudong Jiang
21
47
0
24 May 2023
Advancing Referring Expression Segmentation Beyond Single Image
Advancing Referring Expression Segmentation Beyond Single Image
YiXuan Wu
Zhao Zhang
Xie Chi
Feng Zhu
Rui Zhao
VLM
34
18
0
21 May 2023
TreePrompt: Learning to Compose Tree Prompts for Explainable Visual
  Grounding
TreePrompt: Learning to Compose Tree Prompts for Explainable Visual Grounding
Chenchi Zhang
Jun Xiao
Lei Chen
Jian Shao
Long Chen
VLM
LRM
22
2
0
19 May 2023
Semantics-Aware Dynamic Localization and Refinement for Referring Image
  Segmentation
Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation
Zhao Yang
Jiaqi Wang
Yansong Tang
Kai-xiang Chen
Hengshuang Zhao
Philip H. S. Torr
36
23
0
11 Mar 2023
Referring Multi-Object Tracking
Referring Multi-Object Tracking
Dongming Wu
Wencheng Han
Tiancai Wang
Xingping Dong
Xiangyu Zhang
Jianbing Shen
24
71
0
06 Mar 2023
Unleashing Text-to-Image Diffusion Models for Visual Perception
Unleashing Text-to-Image Diffusion Models for Visual Perception
Wenliang Zhao
Yongming Rao
Zuyan Liu
Benlin Liu
Jie Zhou
Jiwen Lu
ObjD
VLM
MDE
160
214
0
03 Mar 2023
Semantic Image Segmentation: Two Decades of Research
Semantic Image Segmentation: Two Decades of Research
G. Csurka
Riccardo Volpi
Boris Chidlovskii
3DV
24
50
0
13 Feb 2023
HierVL: Learning Hierarchical Video-Language Embeddings
HierVL: Learning Hierarchical Video-Language Embeddings
Kumar Ashutosh
Rohit Girdhar
Lorenzo Torresani
Kristen Grauman
VLM
AI4TS
20
51
0
05 Jan 2023
What You Say Is What You Show: Visual Narration Detection in
  Instructional Videos
What You Say Is What You Show: Visual Narration Detection in Instructional Videos
Kumar Ashutosh
Rohit Girdhar
Lorenzo Torresani
Kristen Grauman
16
4
0
05 Jan 2023
A Unified Mutual Supervision Framework for Referring Expression
  Segmentation and Generation
A Unified Mutual Supervision Framework for Referring Expression Segmentation and Generation
Shijia Huang
Feng Li
Hao Zhang
Siyi Liu
Lei Zhang
Liwei Wang
22
5
0
15 Nov 2022
YORO -- Lightweight End to End Visual Grounding
YORO -- Lightweight End to End Visual Grounding
Chih-Hui Ho
Srikar Appalaraju
Bhavan A. Jasani
R. Manmatha
Nuno Vasconcelos
ObjD
21
21
0
15 Nov 2022
UniColor: A Unified Framework for Multi-Modal Colorization with
  Transformer
UniColor: A Unified Framework for Multi-Modal Colorization with Transformer
Zhitong Huang
Nanxuan Zhao
Jing Liao
ViT
15
16
0
22 Sep 2022
PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative
  Grounding
PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding
Zihan Ding
Zixiang Ding
Tianrui Hui
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
Si Liu
12
12
0
11 Aug 2022
Referring Image Matting
Referring Image Matting
Jizhizi Li
Jing Zhang
Dacheng Tao
ObjD
VLM
18
22
0
10 Jun 2022
Modeling Motion with Multi-Modal Features for Text-Based Video
  Segmentation
Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation
Wangbo Zhao
Kai Wang
Xiangxiang Chu
Fuzhao Xue
Xinchao Wang
Yang You
29
21
0
06 Apr 2022
TubeDETR: Spatio-Temporal Video Grounding with Transformers
TubeDETR: Spatio-Temporal Video Grounding with Transformers
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
ViT
28
94
0
30 Mar 2022
Emergence of hierarchical reference systems in multi-agent communication
Emergence of hierarchical reference systems in multi-agent communication
Xenia Ohmer
M. Duda
Elia Bruni
17
8
0
24 Mar 2022
Local-Global Context Aware Transformer for Language-Guided Video
  Segmentation
Local-Global Context Aware Transformer for Language-Guided Video Segmentation
Chen Liang
Wenguan Wang
Tianfei Zhou
Jiaxu Miao
Yawei Luo
Yi Yang
VOS
24
74
0
18 Mar 2022
Unpaired Referring Expression Grounding via Bidirectional Cross-Modal
  Matching
Unpaired Referring Expression Grounding via Bidirectional Cross-Modal Matching
Hengcan Shi
Munawar Hayat
Jianfei Cai
ObjD
18
10
0
18 Jan 2022
Language as Queries for Referring Video Object Segmentation
Language as Queries for Referring Video Object Segmentation
Jiannan Wu
Yi-Xin Jiang
Pei Sun
Zehuan Yuan
Ping Luo
23
141
0
03 Jan 2022
CRIS: CLIP-Driven Referring Image Segmentation
CRIS: CLIP-Driven Referring Image Segmentation
Zhaoqing Wang
Yu Lu
Qiang Li
Xunqiang Tao
Yan Guo
Ming Gong
Tongliang Liu
VLM
38
359
0
30 Nov 2021
MaIL: A Unified Mask-Image-Language Trimodal Network for Referring Image
  Segmentation
MaIL: A Unified Mask-Image-Language Trimodal Network for Referring Image Segmentation
Zizhang Li
Mengmeng Wang
Jianbiao Mei
Yong Liu
18
18
0
21 Nov 2021
Audio-Visual Grounding Referring Expression for Robotic Manipulation
Audio-Visual Grounding Referring Expression for Robotic Manipulation
Yefei Wang
Kaili Wang
Yi Wang
Di Guo
Huaping Liu
F. Sun
30
12
0
22 Sep 2021
Panoptic Narrative Grounding
Panoptic Narrative Grounding
Cristina González
Nicolás Ayobi
Isabela Hernández
José Hernández
Jordi Pont-Tuset
Pablo Arbeláez
74
22
0
10 Sep 2021
YouRefIt: Embodied Reference Understanding with Language and Gesture
YouRefIt: Embodied Reference Understanding with Language and Gesture
Yixin Chen
Qing Li
Deqian Kong
Yik Lun Kei
Song-Chun Zhu
Tao Gao
Yixin Zhu
Siyuan Huang
LM&Ro
35
41
0
08 Sep 2021
Ultra-high Resolution Image Segmentation via Locality-aware Context
  Fusion and Alternating Local Enhancement
Ultra-high Resolution Image Segmentation via Locality-aware Context Fusion and Alternating Local Enhancement
Wenxi Liu
Qi Li
Xin Lin
Weixiang Yang
Shengfeng He
Yuanlong Yu
24
7
0
06 Sep 2021
Vision-Language Transformer and Query Generation for Referring
  Segmentation
Vision-Language Transformer and Query Generation for Referring Segmentation
Henghui Ding
Chang-rui Liu
Suchen Wang
Xudong Jiang
40
251
0
12 Aug 2021
I3CL:Intra- and Inter-Instance Collaborative Learning for
  Arbitrary-shaped Scene Text Detection
I3CL:Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection
Bo Du
Jian Ye
Jing Zhang
Juhua Liu
Dacheng Tao
VLM
26
29
0
03 Aug 2021
Referring Transformer: A One-step Approach to Multi-task Visual
  Grounding
Referring Transformer: A One-step Approach to Multi-task Visual Grounding
Muchen Li
Leonid Sigal
ObjD
10
187
0
06 Jun 2021
Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD
  Images
Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images
Haolin Liu
Anran Lin
Xiaoguang Han
Lei Yang
Yizhou Yu
Shuguang Cui
21
39
0
14 Mar 2021
12
Next