2505.18816
Cited By
Reasoning Segmentation for Images and Videos: A Survey
24 May 2025
Yiqing Shen
Chenjia Li
Fei Xiong
Jeong-O Jeong
Tianpeng Wang
Michael Latman
Mathias Unberath
VOS
Papers citing
"Reasoning Segmentation for Images and Videos: A Survey"
50 / 67 papers shown
RVTBench: A Benchmark for Visual Reasoning Tasks
Yiqing Shen
Chenjia Li
Chenxiao Fan
Mathias Unberath
CoGe
VLM
LRM
44
1
0
17 May 2025
Position: Foundation Models Need Digital Twin Representations
Yiqing Shen
Hao Ding
Lalithkumar Seenivasan
Tianmin Shu
Mathias Unberath
AI4CE
130
2
0
01 May 2025
SegEarth-R1: Geospatial Pixel Reasoning via Large Language Model
Kaiyu Li
Zepeng Xin
Li Pang
Chao Pang
Yupeng Deng
Jing Yao
Guisong Xia
Deyu Meng
Zhi Wang
Xiangyong Cao
VLM
LRM
86
4
0
13 Apr 2025
Online Reasoning Video Segmentation with Just-in-Time Digital Twins
Yiqing Shen
Bohan Liu
Chenjia Li
Lalithkumar Seenivasan
Mathias Unberath
VOS
166
4
0
27 Mar 2025
Operating Room Workflow Analysis via Reasoning Segmentation over Digital Twins
Yiqing Shen
Chenjia Li
Bohan Liu
Cheng-Yi Li
Tito Porras
Mathias Unberath
76
4
0
26 Mar 2025
Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement
Yuqi Liu
Bohao Peng
Zhisheng Zhong
Zihao Yue
Fanbin Lu
Bei Yu
Jiaya Jia
LRM
VLM
121
46
0
09 Mar 2025
Pixel-Level Reasoning Segmentation via Multi-turn Conversations
Dexian Cai
Xiaocui Yang
Yongkang Liu
Daling Wang
Shi Feng
Yifei Zhang
Soujanya Poria
LRM
104
1
0
13 Feb 2025
The Devil is in Temporal Token: High Quality Video Reasoning Segmentation
Sitong Gong
Yunzhi Zhuge
Lu Zhang
Zhiyong Yang
Pingping Zhang
Huchuan Lu
85
3
0
15 Jan 2025
PRIMA: Multi-Image Vision-Language Models for Reasoning Segmentation
Muntasir Wahed
Kiet A. Nguyen
Adheesh Sunil Juvekar
Xinzhuo Li
Xiaona Zhou
Vedant Shah
Tianjiao Yu
Pinar Yanardag
Ismini Lourentzou
VLM
LRM
92
2
0
19 Dec 2024
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level
Andong Deng
Tongjia Chen
Shoubin Yu
Taojiannan Yang
Lincoln Spencer
Yapeng Tian
Ajmal Mian
Joey Tianyi Zhou
Chen Chen
LRM
100
3
0
15 Nov 2024
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos
Zechen Bai
Tong He
Haiyang Mei
Pichao Wang
Ziteng Gao
Joya Chen
Lei Liu
Zheng Zhang
Mike Zheng Shou
VLM
VOS
MLLM
91
27
0
29 Sep 2024
ViLLa: Video Reasoning Segmentation with Large Language Model
Rongkun Zheng
Lu Qi
Xi Chen
Yi Wang
Kun Wang
Yu Qiao
Hengshuang Zhao
VOS
LRM
153
5
0
18 Jul 2024
LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning
Junchi Wang
Lei Ke
MLLM
LRM
VLM
73
29
0
12 Apr 2024
LaSagnA: Language-based Segmentation Assistant for Complex Queries
Cong Wei
Haoxian Tan
Yujie Zhong
Yujiu Yang
Lin Ma
102
17
0
12 Apr 2024
CoReS: Orchestrating the Dance of Reasoning and Segmentation
Xiaoyi Bao
Siyang Sun
Shuailei Ma
Kecheng Zheng
Yuxin Guo
Guosheng Zhao
Yun Zheng
Xingang Wang
LRM
103
10
0
08 Apr 2024
Empowering Segmentation Ability to Multi-modal Large Language Models
Yuqi Yang
Peng-Tao Jiang
Jing Wang
Hao Zhang
Kai Zhao
Jinwei Chen
Yue Liu
LRM
VLM
78
4
0
21 Mar 2024
A Survey for Foundation Models in Autonomous Driving
Haoxiang Gao
Yaqian Li
Kaiwen Long
Ming Yang
Yiqing Shen
VLM
LRM
100
31
0
02 Feb 2024
Video Anomaly Detection and Explanation via Large Language Models
Hui Lv
Qianru Sun
70
30
0
11 Jan 2024
Tracking with Human-Intent Reasoning
Jiawen Zhu
Zhi-Qi Cheng
Jun-Yan He
Chenyang Li
Bin Luo
Huchuan Lu
Yifeng Geng
Xuansong Xie
LRM
VOS
83
11
0
29 Dec 2023
LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model
Senqiao Yang
Tianyuan Qu
Xin Lai
Zhuotao Tian
Bohao Peng
Shu Liu
Jiaya Jia
VLM
96
32
0
28 Dec 2023
See, Say, and Segment: Teaching LMMs to Overcome False Premises
Tsung-Han Wu
Giscard Biamby
David M. Chan
Lisa Dunlap
Ritwik Gupta
Xudong Wang
Joseph E. Gonzalez
Trevor Darrell
VLM
MLLM
108
21
0
13 Dec 2023
PixelLM: Pixel Reasoning with Large Multimodal Model
Zhongwei Ren
Zhicheng Huang
Yunchao Wei
Yao-Min Zhao
Dongmei Fu
Jiashi Feng
Xiaojie Jin
VLM
MLLM
LRM
101
108
0
04 Dec 2023
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Qidong Huang
Xiao-wen Dong
Pan Zhang
Bin Wang
Conghui He
Jiaqi Wang
Dahua Lin
Weiming Zhang
Neng H. Yu
MLLM
130
206
0
29 Nov 2023
GLaMM: Pixel Grounding Large Multimodal Model
H. Rasheed
Muhammad Maaz
Sahal Shaji Mullappilly
Abdelrahman M. Shaker
Salman Khan
Hisham Cholakkal
Rao M. Anwer
Eric Xing
Ming-Hsuan Yang
Fahad S. Khan
MLLM
VLM
138
238
0
06 Nov 2023
OV-PARTS: Towards Open-Vocabulary Part Segmentation
Meng Wei
Xiaoyu Yue
Wenwei Zhang
Shu Kong
Xihui Liu
Jiangmiao Pang
VLM
51
25
0
08 Oct 2023
EgoObjects: A Large-Scale Egocentric Dataset for Fine-Grained Object Understanding
Chenchen Zhu
Fanyi Xiao
Andres Alvarado
Yasmine Babaei
Jiabo Hu
Hichem El-Mohri
Sean Culatana
Roshan Sumbaly
Zhicheng Yan
EgoV
79
22
0
15 Sep 2023
MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
Henghui Ding
Chang Liu
Shuting He
Xudong Jiang
Chen Change Loy
VOS
123
116
0
16 Aug 2023
XMem++: Production-level Video Segmentation From Few Annotated Frames
Maksym Bekuzarov
Ariana Bermúdez
Joon-Young Lee
Haohe Li
VLM
VOS
70
39
0
29 Jul 2023
Vocabulary-free Image Classification
Alessandro Conti
Enrico Fini
Massimiliano Mancini
Paolo Rota
Yiming Wang
Elisa Ricci
VLM
109
27
0
01 Jun 2023
Towards Open-Vocabulary Video Instance Segmentation
Haochen Wang
Cilin Yan
Shuailong Wang
Xiaolong Jiang
Xu Tang
Yao Hu
Weidi Xie
E. Gavves
VOS
VLM
79
34
0
04 Apr 2023
Vision-Language Models for Vision Tasks: A Survey
Jingyi Zhang
Jiaxing Huang
Sheng Jin
Shijian Lu
VLM
158
550
0
03 Apr 2023
EVA-CLIP: Improved Training Techniques for CLIP at Scale
Quan-Sen Sun
Yuxin Fang
Ledell Yu Wu
Xinlong Wang
Yue Cao
CLIP
VLM
149
513
0
27 Mar 2023
MOSE: A New Dataset for Video Object Segmentation in Complex Scenes
Henghui Ding
Chang Liu
Shuting He
Xudong Jiang
Philip Torr
S. Bai
VOS
103
145
0
03 Feb 2023
PACO: Parts and Attributes of Common Objects
Vignesh Ramanathan
Anmol Kalia
Vladan Petrovic
Yiqian Wen
Baixue Zheng
...
Abhishek Kadian
Amir Mousavi
Yi-Zhe Song
Abhimanyu Dubey
D. Mahajan
VLM
89
105
0
04 Jan 2023
XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
Ho Kei Cheng
Alex Schwing
VLM
VOS
107
410
0
14 Jul 2022
Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus
Xiang Li
Jinglu Wang
Xiaohao Xu
Xiao Li
Bhiksha Raj
Yan Lu
VOS
103
37
0
04 Jul 2022
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
176
224
0
18 Feb 2022
Masked-attention Mask Transformer for Universal Image Segmentation
Bowen Cheng
Ishan Misra
Alex Schwing
Alexander Kirillov
Rohit Girdhar
ISeg
274
2,385
0
02 Dec 2021
Panoptic Segmentation: A Review
O. Elharrouss
S. Al-Maadeed
Nandhini Subramanian
Najmath Ottakath
Noor Almaadeed
Yassine Himeur
75
41
0
19 Nov 2021
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
504
10,526
0
17 Jun 2021
Part-aware Panoptic Segmentation
Daan de Geus
Panagiotis Meletis
Chenyang Lu
Xiaoxiao Wen
Gijs Dubbelman
95
62
0
11 Jun 2021
Unidentified Video Objects: A Benchmark for Dense, Open-World Segmentation
Weiyao Wang
Matt Feiszli
Heng Wang
Du Tran
VOS
78
127
0
10 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
1.0K
29,926
0
26 Feb 2021
Occluded Video Instance Segmentation: A Benchmark
Jiyang Qi
Yan Gao
Yao Hu
Xinggang Wang
Xiaoyu Liu
Xiang Bai
Serge Belongie
Alan Yuille
Philip Torr
S. Bai
VOS
VLM
91
140
0
02 Feb 2021
Reducing the Annotation Effort for Video Object Segmentation Datasets
P. Voigtlaender
Lishu Luo
C. Yuan
Yong Jiang
Bastian Leibe
VOS
123
20
0
02 Nov 2020
PhraseCut: Language-based Image Segmentation in the Wild
Chenyun Wu
Zhe Lin
Scott D. Cohen
Trung Bui
Subhransu Maji
VLM
69
115
0
03 Aug 2020
A Survey on Instance Segmentation: State of the art
A. M. Hafiz
G. M. Bhat
SSeg
ISeg
84
436
0
28 Jun 2020
End-to-End Object Detection with Transformers
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT
3DV
PINN
456
13,130
0
26 May 2020
TAO: A Large-Scale Benchmark for Tracking Any Object
Achal Dave
Tarasha Khurana
P. Tokmakov
Cordelia Schmid
Deva Ramanan
75
180
0
20 May 2020
TextCaps: a Dataset for Image Captioning with Reading Comprehension
Oleksii Sidorov
Ronghang Hu
Marcus Rohrbach
Amanpreet Singh
92
418
0
24 Mar 2020