ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2301.01795
  4. Cited By
PACO: Parts and Attributes of Common Objects

PACO: Parts and Attributes of Common Objects

4 January 2023
Vignesh Ramanathan
Anmol Kalia
Vladan Petrovic
Yiqian Wen
Baixue Zheng
Baishan Guo
Rui Wang
Aaron Marquez
Rama Kovvuri
Abhishek Kadian
Amir Mousavi
Yi-Zhe Song
Abhimanyu Dubey
D. Mahajan
    VLM
ArXivPDFHTML

Papers citing "PACO: Parts and Attributes of Common Objects"

50 / 82 papers shown
Title
RoboOS: A Hierarchical Embodied Framework for Cross-Embodiment and Multi-Agent Collaboration
RoboOS: A Hierarchical Embodied Framework for Cross-Embodiment and Multi-Agent Collaboration
Huajie Tan
Xiaoshuai Hao
Minglan Lin
Pengwei Wang
Yaoxu Lyu
Mingyu Cao
Zhongyuan Wang
S. Zhang
LM&Ro
48
0
0
06 May 2025
Visual Intention Grounding for Egocentric Assistants
Visual Intention Grounding for Egocentric Assistants
Pengzhan Sun
Junbin Xiao
Tze Ho Elden Tse
Yicong Li
Arjun Akula
Angela Yao
EgoV
52
0
0
18 Apr 2025
Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation
Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation
Yongchao Feng
Yajie Liu
Shuai Yang
Wenrui Cai
Jingyang Zhang
...
Jiahui Lv
Ziqiang Liu
Tengyuan Shi
Qingjie Liu
Yixuan Wang
MLLM
VLM
63
1
0
13 Apr 2025
Learning Object Focused Attention
Learning Object Focused Attention
Vivek Trivedy
A. Almalki
Longin Jan Latecki
36
0
0
10 Apr 2025
D-Feat Occlusions: Diffusion Features for Robustness to Partial Visual Occlusions in Object Recognition
D-Feat Occlusions: Diffusion Features for Robustness to Partial Visual Occlusions in Object Recognition
Rupayan Mallick
Sibo Dong
Nataniel Ruiz
Sarah Adel Bargal
DiffM
49
0
0
08 Apr 2025
URECA: Unique Region Caption Anything
URECA: Unique Region Caption Anything
Sangbeom Lim
J. Kim
Heeji Yoon
Jaewoo Jung
Seungryong Kim
31
0
0
07 Apr 2025
Towards Unified Referring Expression Segmentation Across Omni-Level Visual Target Granularities
Towards Unified Referring Expression Segmentation Across Omni-Level Visual Target Granularities
Jing Liu
Wenxuan Wang
Yisi Zhang
Yepeng Tang
Xingjian He
Longteng Guo
Tongtian Yue
Xinlong Wang
ObjD
53
0
0
02 Apr 2025
POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation
POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation
Lanyun Zhu
Tianrun Chen
Qianxiong Xu
Xuanyi Liu
Deyi Ji
Haiyang Wu
De Wen Soh
Xiaozhong Liu
VLM
LRM
50
0
0
01 Apr 2025
MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation
MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation
Donggon Jang
Yucheol Cho
Suin Lee
Taehyeon Kim
Dae-Shik Kim
VLM
65
1
0
18 Mar 2025
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
Tao Wang
Changxu Cheng
Lingfeng Wang
Senda Chen
Wuyue Zhao
VLM
72
0
0
17 Mar 2025
GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding
GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding
R. Hu
Lianghui Zhu
Yuxuan Zhang
Tianheng Cheng
Lei Liu
Heng Liu
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
ObjD
61
0
0
13 Mar 2025
UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface
Hao Tang
Chenwei Xie
Haiyang Wang
Xiaoyi Bao
Tingyu Weng
Pandeng Li
Yun Zheng
Liwei Wang
ObjD
VLM
62
0
0
03 Mar 2025
QueryAdapter: Rapid Adaptation of Vision-Language Models in Response to Natural Language Queries
QueryAdapter: Rapid Adaptation of Vision-Language Models in Response to Natural Language Queries
N. H. Chapman
Feras Dayoub
Will N. Browne
Christopher F. Lehnert
VLM
77
0
0
26 Feb 2025
Pixel-Level Reasoning Segmentation via Multi-turn Conversations
Pixel-Level Reasoning Segmentation via Multi-turn Conversations
Dexian Cai
Xiaocui Yang
Yongkang Liu
Daling Wang
Shi Feng
Yifei Zhang
Soujanya Poria
LRM
87
0
0
13 Feb 2025
The Devil is in Temporal Token: High Quality Video Reasoning Segmentation
The Devil is in Temporal Token: High Quality Video Reasoning Segmentation
Sitong Gong
Yunzhi Zhuge
Lu Zhang
Z. Yang
Pingping Zhang
Huchuan Lu
41
0
0
15 Jan 2025
Guided SAM: Label-Efficient Part Segmentation
Guided SAM: Label-Efficient Part Segmentation
S.B. van Rooij
G.J. Burghouts
VLM
43
0
0
13 Jan 2025
Accounting for Focus Ambiguity in Visual Questions
Chongyan Chen
Yu-Yun Tseng
Zhuoheng Li
Anush Venkatesh
Danna Gurari
41
0
0
04 Jan 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
Jiannan Wu
Muyan Zhong
Sen Xing
Zeqiang Lai
Zhaoyang Liu
...
Lewei Lu
Tong Lu
Ping Luo
Yu Qiao
Jifeng Dai
MLLM
VLM
LRM
99
48
0
03 Jan 2025
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
Qing Jiang
Gen Luo
Yuqin Yang
Yuda Xiong
Yihao Chen
Zhaoyang Zeng
Tianhe Ren
Lei Zhang
VLM
LRM
109
6
0
27 Nov 2024
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
Shehan Munasinghe
Hanan Gani
Wenqi Zhu
Jiale Cao
Eric P. Xing
F. Khan
Salman Khan
MLLM
VGen
VLM
44
6
0
07 Nov 2024
SegLLM: Multi-round Reasoning Segmentation
SegLLM: Multi-round Reasoning Segmentation
XuDong Wang
Shaolun Zhang
Shufan Li
Konstantinos Kallidromitis
Kehan Li
Yusuke Kato
Kazuki Kozuka
Trevor Darrell
VLM
LRM
50
1
0
24 Oct 2024
Bridge the Points: Graph-based Few-shot Segment Anything Semantically
Bridge the Points: Graph-based Few-shot Segment Anything Semantically
Anqi Zhang
Guangyu Gao
Jianbo Jiao
C. Liu
Yunchao Wei
VLM
31
3
0
09 Oct 2024
TROPE: TRaining-Free Object-Part Enhancement for Seamlessly Improving
  Fine-Grained Zero-Shot Image Captioning
TROPE: TRaining-Free Object-Part Enhancement for Seamlessly Improving Fine-Grained Zero-Shot Image Captioning
Joshua Forster Feinglass
Yezhou Yang
31
0
0
30 Sep 2024
One Token to Seg Them All: Language Instructed Reasoning Segmentation in
  Videos
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos
Zechen Bai
Tong He
Haiyang Mei
Pichao Wang
Ziteng Gao
Joya Chen
Lei Liu
Zheng Zhang
Mike Zheng Shou
VLM
VOS
MLLM
45
17
0
29 Sep 2024
Instruction-guided Multi-Granularity Segmentation and Captioning with
  Large Multimodal Model
Instruction-guided Multi-Granularity Segmentation and Captioning with Large Multimodal Model
Li Zhou
Xu Yuan
Zenghui Sun
Zikun Zhou
Jingsong Lan
VLM
MLLM
127
3
0
20 Sep 2024
SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring
  Expression Segmentation
SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation
Yi-Chia Chen
Wei-Hua Li
Cheng Sun
Yu-Chiang Frank Wang
Chu-Song Chen
VLM
39
11
0
01 Sep 2024
OpenScan: A Benchmark for Generalized Open-Vocabulary 3D Scene Understanding
OpenScan: A Benchmark for Generalized Open-Vocabulary 3D Scene Understanding
Youjun Zhao
Jiaying Lin
Shuquan Ye
Qianshi Pang
Rynson W. H. Lau
64
1
0
20 Aug 2024
Visual Agents as Fast and Slow Thinkers
Visual Agents as Fast and Slow Thinkers
Guangyan Sun
Mingyu Jin
Zhenting Wang
Cheng-Long Wang
Siqi Ma
Qifan Wang
Ying Nian Wu
Ying Nian Wu
Dongfang Liu
Dongfang Liu
LLMAG
LRM
79
13
0
16 Aug 2024
PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects
PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects
Junyi Li
Junfeng Wu
Weizhi Zhao
Song Bai
Xiang Bai
41
1
0
23 Jul 2024
PartImageNet++ Dataset: Scaling up Part-based Models for Robust
  Recognition
PartImageNet++ Dataset: Scaling up Part-based Models for Robust Recognition
Xiao-Li Li
Yining Liu
Na Dong
Sitian Qin
Xiaolin Hu
41
3
0
15 Jul 2024
WPS-SAM: Towards Weakly-Supervised Part Segmentation with Foundation
  Models
WPS-SAM: Towards Weakly-Supervised Part Segmentation with Foundation Models
Xin-Jian Wu
Rui-Song Zhang
Jie Qin
Shijie Ma
Cheng-Lin Liu
VLM
32
1
0
14 Jul 2024
SPIN: Hierarchical Segmentation with Subpart Granularity in Natural
  Images
SPIN: Hierarchical Segmentation with Subpart Granularity in Natural Images
Josh Myers-Dean
Jarek Reynolds
Brian Price
Yifei Fan
Danna Gurari
46
2
0
12 Jul 2024
3x2: 3D Object Part Segmentation by 2D Semantic Correspondences
3x2: 3D Object Part Segmentation by 2D Semantic Correspondences
Anh Thai
Weiyao Wang
Hao Tang
Stefan Stojanov
Matt Feiszli
James M. Rehg
3DPC
47
3
0
12 Jul 2024
Segment Anything without Supervision
Segment Anything without Supervision
Xudong Wang
Jingfeng Yang
Trevor Darrell
VLM
43
10
0
28 Jun 2024
EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model
EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model
Yuxuan Zhang
Tianheng Cheng
Lianghui Zhu
Lei Liu
Heng Liu
Longjin Ran
Xiaoxin Chen
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
VLM
61
25
0
28 Jun 2024
MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning
MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning
Xiangyu Zhao
Xiangtai Li
Haodong Duan
Haian Huang
Yining Li
Kai Chen
Hua Yang
VLM
MLLM
45
10
0
25 Jun 2024
MAC: A Benchmark for Multiple Attributes Compositional Zero-Shot Learning
MAC: A Benchmark for Multiple Attributes Compositional Zero-Shot Learning
Shuo Xu
Sai Wang
Xinyue Hu
Yutian Lin
Bo Du
Yu Wu
CoGe
59
0
0
18 Jun 2024
Inpainting the Gaps: A Novel Framework for Evaluating Explanation
  Methods in Vision Transformers
Inpainting the Gaps: A Novel Framework for Evaluating Explanation Methods in Vision Transformers
Lokesh Badisa
Sumohana S. Channappayya
45
0
0
17 Jun 2024
Open-Vocabulary Part-Based Grasping
Open-Vocabulary Part-Based Grasping
Tjeard van Oort
Dimity Miller
Will N. Browne
Nicolas Marticorena
Jesse Haviland
Niko Suenderhauf
3DPC
32
2
0
10 Jun 2024
F-LMM: Grounding Frozen Large Multimodal Models
F-LMM: Grounding Frozen Large Multimodal Models
Size Wu
Sheng Jin
Wenwei Zhang
Lumin Xu
Wentao Liu
Wei Li
Chen Change Loy
MLLM
80
12
0
09 Jun 2024
SOHES: Self-supervised Open-world Hierarchical Entity Segmentation
SOHES: Self-supervised Open-world Hierarchical Entity Segmentation
Shengcao Cao
Jiuxiang Gu
Jason Kuen
Hao Tan
Ruiyi Zhang
Handong Zhao
A. Nenkova
Liangyan Gui
Tong Sun
Yu-Xiong Wang
VLM
OCL
46
3
0
18 Apr 2024
Resilience through Scene Context in Visual Referring Expression
  Generation
Resilience through Scene Context in Visual Referring Expression Generation
Simeon Junker
Sina Zarrieß
36
0
0
18 Apr 2024
CoReS: Orchestrating the Dance of Reasoning and Segmentation
CoReS: Orchestrating the Dance of Reasoning and Segmentation
Xiaoyi Bao
Siyang Sun
Shuailei Ma
Kecheng Zheng
Yuxin Guo
Guosheng Zhao
Yun Zheng
Xingang Wang
LRM
36
7
0
08 Apr 2024
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
Weifeng Lin
Xinyu Wei
Ruichuan An
Peng Gao
Bocheng Zou
Yulin Luo
Siyuan Huang
Shanghang Zhang
Hongsheng Li
VLM
69
33
0
29 Mar 2024
PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model
PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model
Zheng-Wei Zhang
Yeyao Ma
Enming Zhang
Xiang Bai
VLM
MLLM
39
30
0
21 Mar 2024
Empowering Segmentation Ability to Multi-modal Large Language Models
Empowering Segmentation Ability to Multi-modal Large Language Models
Yuqi Yang
Peng-Tao Jiang
Jing Wang
Hao Zhang
Kai Zhao
Jinwei Chen
Bo-wen Li
LRM
VLM
27
3
0
21 Mar 2024
ManipVQA: Injecting Robotic Affordance and Physically Grounded
  Information into Multi-Modal Large Language Models
ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models
Siyuan Huang
Iaroslav Ponomarenko
Zhengkai Jiang
Xiaoqi Li
Xiaobin Hu
Peng Gao
Hongsheng Li
Hao Dong
LM&Ro
37
16
0
17 Mar 2024
Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception
Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception
Jun-Yan He
Yifan Wang
Lijun Wang
Huchuan Lu
Jun-Yan He
Jinpeng Lan
Bin Luo
Xuansong Xie
MLLM
VLM
37
19
0
05 Mar 2024
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation
Yichi Zhang
Ziqiao Ma
Xiaofeng Gao
Suhaila Shakiah
Qiaozi Gao
Joyce Chai
MLLM
VLM
42
39
0
26 Feb 2024
LLMBind: A Unified Modality-Task Integration Framework
LLMBind: A Unified Modality-Task Integration Framework
Bin Zhu
Munan Ning
Peng Jin
Bin Lin
Jinfa Huang
...
Junwu Zhang
Zhenyu Tang
Mingjun Pan
Xing Zhou
Li-ming Yuan
MLLM
40
6
0
22 Feb 2024
12
Next