ResearchTrend.AI

Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension

1 March 2020
Zhenfang Chen
Peng Wang
Lin Ma
Kwan-Yee K. Wong
Qi Wu
    ObjD

Papers citing "Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension"

43 / 43 papers shown
RoboOS: A Hierarchical Embodied Framework for Cross-Embodiment and Multi-Agent Collaboration
Huajie Tan
Xiaoshuai Hao
Minglan Lin
Pengwei Wang
Yaoxu Lyu
Mingyu Cao
Zhongyuan Wang
Shanghang Zhang
LM&Ro
50
0
0
06 May 2025
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
Tao Wang
Changxu Cheng
Lingfeng Wang
Senda Chen
Wuyue Zhao
VLM
72
0
0
17 Mar 2025
DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding
Xinyu Ma
Ziyang Ding
Zhicong Luo
Cen Chen
Zonghao Guo
Derek F. Wong
Xiaoyi Feng
Maosong Sun
VLM
LRM
76
1
0
17 Mar 2025
Referring to Any Person
Qing Jiang
Lin Wu
Zhaoyang Zeng
Tianhe Ren
Yuda Xiong
Yihao Chen
Qin Liu
Lei Zhang
229
0
0
11 Mar 2025
New Dataset and Methods for Fine-Grained Compositional Referring Expression Comprehension via Specialist-MLLM Collaboration
X. J. Yang
Jun Liu
Peng Wang
Guoqing Wang
Yuqing Yang
H. Shen
ObjD
94
0
0
27 Feb 2025
Accounting for Focus Ambiguity in Visual Questions
Chongyan Chen
Yu-Yun Tseng
Zhuoheng Li
Anush Venkatesh
Danna Gurari
46
0
0
04 Jan 2025
Towards Visual Grounding: A Survey
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
67
4
0
31 Dec 2024
FineCops-Ref: A new Dataset and Task for Fine-Grained Compositional Referring Expression Comprehension
Junzhuo Liu
Xiaohu Yang
Weiwei Li
Peng Wang
ObjD
56
3
0
23 Sep 2024
Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models
Jierun Chen
Fangyun Wei
Jinjing Zhao
Sizhe Song
Bohuai Wu
Zhuoxuan Peng
S.-H. Gary Chan
Hongyang R. Zhang
47
8
0
24 Jun 2024
WaterVG: Waterway Visual Grounding based on Text-Guided Vision and mmWave Radar
Runwei Guan
Liye Jia
Fengyufan Yang
Shanliang Yao
Erick Purwanto
...
Eng Gee Lim
Jeremy S. Smith
Ka Lok Man
Xuming Hu
Yutao Yue
40
9
0
19 Mar 2024
GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection
Haozhan Shen
Tiancheng Zhao
Mingwei Zhu
Jianwei Yin
VLM
ObjD
99
11
0
22 Dec 2023
Language-guided Robot Grasping: CLIP-based Referring Grasp Synthesis in Clutter
Georgios Tziafas
Yucheng Xu
Arushi Goel
M. Kasaei
Zhibin Li
H. Kasaei
35
24
0
09 Nov 2023
GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs
Zhenfang Chen
Rui Sun
Wenjun Liu
Yining Hong
Chuang Gan
LRM
28
14
0
08 Nov 2023
Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining
U. Sahin
Hang Li
Qadeer Ahmad Khan
Daniel Cremers
Volker Tresp
VLM
CoGe
28
12
0
07 Nov 2023
TextPSG: Panoptic Scene Graph Generation from Textual Descriptions
Chengyang Zhao
Songlin Yang
Zhenfang Chen
Mingyu Ding
Chuang Gan
54
15
0
10 Oct 2023
InstructDET: Diversifying Referring Object Detection with Generalized Instructions
Ronghao Dang
Jiangyan Feng
Haodong Zhang
Chongjian Ge
Lin Song
...
Chengju Liu
Qi Chen
Feng Zhu
Rui Zhao
Yibing Song
ObjD
32
11
0
08 Oct 2023
Dense Object Grounding in 3D Scenes
Wencan Huang
Daizong Liu
Wei Hu
13
17
0
05 Sep 2023
VL-Grasp: a 6-Dof Interactive Grasp Policy for Language-Oriented Objects in Cluttered Indoor Scenes
Yuhao Lu
Yixuan Fan
Beixing Deng
F. Liu
Yali Li
Shengjin Wang
38
29
0
01 Aug 2023
Described Object Detection: Liberating Object Detection with Flexible Expressions
Chi Xie
Zhao Zhang
YiXuan Wu
Feng Zhu
Rui Zhao
Shuang Liang
ObjD
39
30
0
24 Jul 2023
Advancing Visual Grounding with Scene Knowledge: Benchmark and Method
Zhihong Chen
Ruifei Zhang
Yibing Song
Xiang Wan
Guanbin Li
24
15
0
21 Jul 2023
ICSVR: Investigating Compositional and Syntactic Understanding in Video Retrieval Models
Avinash Madasu
Vasudev Lal
CoGe
44
3
0
28 Jun 2023
Large Language Models as Commonsense Knowledge for Large-Scale Task Planning
Zirui Zhao
W. Lee
David Hsu
LRM
LLMAG
LM&Ro
41
200
0
23 May 2023
Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following
Mingyu Ding
Yan Xu
Zhenfang Chen
David D. Cox
Ping Luo
J. Tenenbaum
Chuang Gan
LM&Ro
64
21
0
07 Apr 2023
3D Concept Learning and Reasoning from Multi-View Images
Yining Hong
Chun-Tse Lin
Yilun Du
Zhenfang Chen
J. Tenenbaum
Chuang Gan
3DV
30
52
0
20 Mar 2023
PACO: Parts and Attributes of Common Objects
Vignesh Ramanathan
Anmol Kalia
Vladan Petrovic
Yiqian Wen
Baixue Zheng
...
Abhishek Kadian
Amir Mousavi
Yi-Zhe Song
Abhimanyu Dubey
D. Mahajan
VLM
30
95
0
04 Jan 2023
CREPE: Can Vision-Language Foundation Models Reason Compositionally?
Zixian Ma
Jerry Hong
Mustafa Omer Gul
Mona Gandhi
Irena Gao
Ranjay Krishna
CoGe
34
125
0
13 Dec 2022
Language Conditioned Spatial Relation Reasoning for 3D Object Grounding
Shizhe Chen
Pierre-Louis Guhur
Makarand Tapaswi
Cordelia Schmid
Ivan Laptev
51
76
0
17 Nov 2022
YORO -- Lightweight End to End Visual Grounding
Chih-Hui Ho
Srikar Appalaraju
Bhavan A. Jasani
R. Manmatha
Nuno Vasconcelos
ObjD
21
21
0
15 Nov 2022
RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data
Yangfan Zhan
Zhitong Xiong
Yuan Yuan
78
107
0
23 Oct 2022
Do Vision-and-Language Transformers Learn Grounded Predicate-Noun Dependencies?
Mitja Nikolaus
Emmanuelle Salin
Stéphane Ayache
Abdellah Fourtassi
Benoit Favre
19
14
0
21 Oct 2022
RefCrowd: Grounding the Target in Crowd with Referring Expressions
Heqian Qiu
Hongliang Li
Taijin Zhao
Lanxiao Wang
Qingbo Wu
Fanman Meng
ObjD
27
6
0
16 Jun 2022
Referring Image Matting
Jizhizi Li
Jing Zhang
Dacheng Tao
ObjD
VLM
29
23
0
10 Jun 2022
Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction
Yining Hong
Kaichun Mo
L. Yi
Leonidas J. Guibas
Antonio Torralba
J. Tenenbaum
Chuang Gan
42
5
0
05 May 2022
FindIt: Generalized Localization with Natural Language Queries
Weicheng Kuo
Fred Bertsch
Wei Li
A. Piergiovanni
M. Saffar
A. Angelova
ObjD
19
17
0
31 Mar 2022
Differentiated Relevances Embedding for Group-based Referring Expression Comprehension
Fuhai Chen
Xuri Ge
Xiaoshuai Sun
Yue Gao
Jianzhuang Liu
Feiyue Huang
Rongrong Ji
27
0
0
12 Mar 2022
COVR: A test-bed for Visually Grounded Compositional Generalization with real images
Ben Bogin
Shivanshu Gupta
Matt Gardner
Jonathan Berant
CoGe
39
29
0
22 Sep 2021
YouRefIt: Embodied Reference Understanding with Language and Gesture
Yixin Chen
Qing Li
Deqian Kong
Yik Lun Kei
Song-Chun Zhu
Tao Gao
Yixin Zhu
Siyuan Huang
LM&Ro
37
41
0
08 Sep 2021
A Better Loss for Visual-Textual Grounding
Davide Rigoni
Luciano Serafini
A. Sperduti
ObjD
33
3
0
11 Aug 2021
Exploring Data Pipelines through the Process Lens: a Reference Model for Computer Vision
Agathe Balayn
B. Kulynych
S. Guerses
21
4
0
05 Jul 2021
Understanding Synonymous Referring Expressions via Contrastive Features
Yi-Wen Chen
Yi-Hsuan Tsai
Ming-Hsuan Yang
ObjD
27
4
0
20 Apr 2021
OCID-Ref: A 3D Robotic Dataset with Embodied Language for Clutter Scene Grounding
Ke-Jyun Wang
Yun-Hsuan Liu
Hung-Ting Su
Jen-Wei Wang
Yu-Siang Wang
Winston H. Hsu
Wen-Chin Chen
48
19
0
13 Mar 2021
Referring Expression Comprehension: A Survey of Methods and Datasets
Yanyuan Qiao
Chaorui Deng
Qi Wu
ObjD
50
93
0
19 Jul 2020
A Multi-View Embedding Space for Modeling Internet Images, Tags, and their Semantics
Yunchao Gong
Qifa Ke
Michael Isard
Svetlana Lazebnik
3DV
76
584
0
18 Dec 2012