ResearchTrend.AI
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection

9 March 2023
Shilong Liu
Zhaoyang Zeng
Tianhe Ren
Feng Li
Hao Zhang
Jie-jin Yang
Chun-yue Li
Jianwei Yang
Hang Su
Jun Zhu
Lei Zhang
    ObjD

Papers citing "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

50 / 1,339 papers shown
Title
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
Chaoyang Zhu
Long Chen
ObjD
VLM
31
33
0
18 Jul 2023
BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs
Yang Zhao
Zhijie Lin
Daquan Zhou
Zilong Huang
Jiashi Feng
Bingyi Kang
MLLM
44
108
0
17 Jul 2023
Sim2Plan: Robot Motion Planning via Message Passing between Simulation and Reality
Yizhou Zhao
Yuanhong Zeng
Qiang Long
Ying Nian Wu
Song-Chun Zhu
35
0
0
15 Jul 2023
Open Scene Understanding: Grounded Situation Recognition Meets Segment Anything for Helping People with Visual Impairments
R. Liu
Jiaming Zhang
Kunyu Peng
Junwei Zheng
Ke Cao
Yufan Chen
Kailun Yang
Rainer Stiefelhagen
27
15
0
15 Jul 2023
OG: Equip vision occupancy with instance segmentation and visual grounding
Zichao Dong
Hang Ji
Weikun Zhang
Xufeng Huang
Junbo Chen
ISeg
VLM
34
0
0
12 Jul 2023
AutoDecoding Latent 3D Diffusion Models
Evangelos Ntavelis
Aliaksandr Siarohin
Kyle Olszewski
Chao-Yuan Wang
Luc Van Gool
Sergey Tulyakov
DiffM
41
43
0
07 Jul 2023
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
Shilong Zhang
Pei Sun
Shoufa Chen
Min Xiao
Wenqi Shao
Wenwei Zhang
Yu Liu
Kai-xiang Chen
Ping Luo
VLM
MLLM
85
225
0
07 Jul 2023
RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation
Yonglin Li
Jing Zhang
Xiao Teng
Long Lan
VOS
VLM
23
18
0
03 Jul 2023
Hierarchical Open-vocabulary Universal Image Segmentation
Xudong Wang
Shufang Li
Konstantinos Kallidromitis
Yu Kato
Kazuki Kozuka
Trevor Darrell
VLM
OCL
51
37
0
03 Jul 2023
DisCo: Disentangled Control for Realistic Human Dance Generation
Tan Wang
Linjie Li
Kevin Qinghong Lin
Yuanhao Zhai
Chung-Ching Lin
Zhengyuan Yang
Hanwang Zhang
Zicheng Liu
Lijuan Wang
VGen
32
74
0
30 Jun 2023
Counting Guidance for High Fidelity Text-to-Image Synthesis
Wonjune Kang
Kevin Galim
H. Koo
Nam Ik Cho
DiffM
36
8
0
30 Jun 2023
The Segment Anything Model (SAM) for Remote Sensing Applications: From Zero to One Shot
L. Osco
Qiusheng Wu
Eduardo Lopes de Lemos
W. Gonçalves
A. P. Ramos
Jonathan Li
J. M. Junior
VLM
16
181
0
29 Jun 2023
KITE: Keypoint-Conditioned Policies for Semantic Manipulation
Priya Sundaresan
Suneel Belkhale
Dorsa Sadigh
Jeannette Bohg
LM&Ro
33
24
0
29 Jun 2023
RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation based on Visual Foundation Model
Keyan Chen
Chenyang Liu
Hao Chen
Haotian Zhang
Wenyuan Li
Zhengxia Zou
Z. Shi
VLM
21
203
0
28 Jun 2023
Towards Open Vocabulary Learning: A Survey
Jianzong Wu
Xiangtai Li
Shilin Xu
Haobo Yuan
Henghui Ding
...
Jiangning Zhang
Yu Tong
Xudong Jiang
Guohao Li
Dacheng Tao
ObjD
VLM
45
136
0
28 Jun 2023
What a MESS: Multi-Domain Evaluation of Zero-Shot Semantic Segmentation
Benedikt Blumenstiel
Johannes Jakubik
Hilde Kuhne
Michael Vossing
VLM
32
15
0
27 Jun 2023
Shikra: Unleashing Multimodal LLM's Referential Dialogue Magic
Ke Chen
Zhao Zhang
Weili Zeng
Richong Zhang
Feng Zhu
Rui Zhao
ObjD
44
598
0
27 Jun 2023
When Foundation Model Meets Federated Learning: Motivations, Challenges, and Future Directions
Weiming Zhuang
Chen Chen
Lingjuan Lyu
Chong Chen
Yaochu Jin
Lingjuan Lyu
AIFin
AI4CE
99
86
0
27 Jun 2023
Kosmos-2: Grounding Multimodal Large Language Models to the World
Zhiliang Peng
Wenhui Wang
Li Dong
Y. Hao
Shaohan Huang
Shuming Ma
Furu Wei
MLLM
ObjD
VLM
50
702
0
26 Jun 2023
Faster Segment Anything: Towards Lightweight SAM for Mobile Applications
Chaoning Zhang
Dongshen Han
Yu Qiao
Jung Uk Kim
Sung-Ho Bae
Seungkyu Lee
Choong Seon Hong
VLM
41
329
0
25 Jun 2023
DesCo: Learning Object Recognition with Rich Language Descriptions
Liunian Harold Li
Zi-Yi Dou
Nanyun Peng
Kai-Wei Chang
ObjD
VLM
34
20
0
24 Jun 2023
Robustness of Segment Anything Model (SAM) for Autonomous Driving in Adverse Weather Conditions
Xinru Shan
Chaoning Zhang
VLM
33
12
0
23 Jun 2023
CorNav: Autonomous Agent with Self-Corrected Planning for Zero-Shot Vision-and-Language Navigation
Xiwen Liang
Liang Ma
Shanshan Guo
Jianhua Han
Hang Xu
Shikui Ma
Xiaodan Liang
LM&Ro
LLMAG
90
4
0
17 Jun 2023
Seeing the World through Your Eyes
Hadi Alzayer
Kevin Zhang
Brandon Yushan Feng
Christopher A. Metzler
Jia-Bin Huang
CVBM
30
16
0
15 Jun 2023
Robustness Analysis on Foundational Segmentation Models
Madeline Chantry Schiappa
Shehreen Azad
V. Sachidanand
Yunhao Ge
O. Mikšík
Yogesh S Rawat
Vibhav Vineet
OOD
VLM
AAML
30
6
0
15 Jun 2023
2nd Place Winning Solution for the CVPR2023 Visual Anomaly and Novelty Detection Challenge: Multimodal Prompting for Data-centric Anomaly Detection
Yunkang Cao
Xiaohao Xu
Chen Sun
Y. Cheng
Liang Gao
Weiming Shen
44
1
0
15 Jun 2023
Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models
Lingxi Xie
Longhui Wei
Xiaopeng Zhang
Kaifeng Bi
Xiaotao Gu
Jianlong Chang
Qi Tian
41
7
0
14 Jun 2023
AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn
Difei Gao
Lei Ji
Luowei Zhou
Kevin Lin
Joya Chen
Zihan Fan
Mike Zheng Shou
MLLM
35
72
0
14 Jun 2023
Robustness of SAM: Segment Anything Under Corruptions and Beyond
Yu Qiao
Chaoning Zhang
Taegoo Kang
Donghun Kim
Chenshuang Zhang
Choong Seon Hong
AAML
25
33
0
13 Jun 2023
detrex: Benchmarking Detection Transformers
Tianhe Ren
Siyi Liu
Feng Li
Hao Zhang
Ailing Zeng
...
Zhaoyang Zeng
Xianbiao Qi
Yuhui Yuan
Jianwei Yang
Lei Zhang
40
13
0
12 Jun 2023
Transferring Foundation Models for Generalizable Robotic Manipulation
Jiange Yang
Wenhui Tan
Chuhao Jin
Keling Yao
Bei Liu
Jianlong Fu
Ruihua Song
Gangshan Wu
Limin Wang
LM&Ro
57
6
0
09 Jun 2023
Matting Anything
Jiacheng Li
Jitesh Jain
Humphrey Shi
VLM
36
16
0
08 Jun 2023
Modular Visual Question Answering via Code Generation
Sanjay Subramanian
Medhini Narasimhan
Kushal Khangaonkar
Kevin Kaichuang Yang
Arsha Nagrani
Cordelia Schmid
Andy Zeng
Trevor Darrell
Dan Klein
29
47
0
08 Jun 2023
Fine-Grained Visual Prompting
Lingfeng Yang
Yueze Wang
Xiang Li
Xinlong Wang
Jian Yang
ObjD
VLM
37
61
0
07 Jun 2023
Matte Anything: Interactive Natural Image Matting with Segment Anything Models
J. Yao
Xinggang Wang
Lang Ye
Wenyu Liu
28
38
0
07 Jun 2023
Recognize Anything: A Strong Image Tagging Model
Youcai Zhang
Xinyu Huang
Jinyu Ma
Zhaoyang Li
Zhaochuan Luo
...
Tong Luo
Yaqian Li
Siyi Liu
Yandong Guo
Lei Zhang
VLM
35
225
0
06 Jun 2023
Zero-Shot 3D Shape Correspondence
Ahmed Abdelreheem
Abdelrahman Eldesokey
M. Ovsjanikov
Peter Wonka
33
24
0
05 Jun 2023
LRVS-Fashion: Extending Visual Search with Referring Instructions
Simon Lepage
Jérémie Mary
David Picard
25
1
0
05 Jun 2023
Understanding Segment Anything Model: SAM is Biased Towards Texture Rather than Shape
Chaoning Zhang
Yu Qiao
Shehbaz Tariq
Sheng Zheng
Chenshuang Zhang
Chenghao Li
Hyundong Shin
Choong Seon Hong
VLM
42
10
0
03 Jun 2023
Segment Anything in High Quality
Lei Ke
Mingqiao Ye
Martin Danelljan
Yifan Liu
Yu-Wing Tai
Chi-Keung Tang
Feng Yu
VLM
37
311
0
02 Jun 2023
LOWA: Localize Objects in the Wild with Attributes
Xiaoyuan Guo
Kezhen Chen
Jinmeng Rao
Yawen Zhang
Baochen Sun
Jie Yang
ObjD
46
2
0
31 May 2023
RealignDiff: Boosting Text-to-Image Diffusion Model with Coarse-to-fine Semantic Re-alignment
Guian Fang
Zutao Jiang
Jianhua Han
Guangsong Lu
Hang Xu
Shengcai Liao
Xiaodan Liang
EGVM
29
1
0
31 May 2023
Multi-modal Queried Object Detection in the Wild
Yifan Xu
Mengdan Zhang
Chaoyou Fu
Peixian Chen
Xiaoshan Yang
Ke Li
Changsheng Xu
ObjD
VLM
38
30
0
30 May 2023
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction
Rui Yang
Lin Song
Yanwei Li
Sijie Zhao
Yixiao Ge
Xiu Li
Ying Shan
SyDa
MLLM
36
209
0
30 May 2023
Contextual Object Detection with Multimodal Large Language Models
Yuhang Zang
Wei Li
Jun Han
Kaiyang Zhou
Chen Change Loy
ObjD
VLM
MLLM
41
78
0
29 May 2023
Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising
Fu Lee Wang
Wenshuo Chen
Guanglu Song
Han-Jia Ye
Yu Liu
Hongsheng Li
VGen
DiffM
50
89
0
29 May 2023
InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions
Qian Wang
Biao Zhang
Michael Birsak
Peter Wonka
DiffM
30
31
0
29 May 2023
Z-GMOT: Zero-shot Generic Multiple Object Tracking
Kim Hoang Tran
Anh Duy Le Dinh
Tien-Phat Nguyen
Thinh Phan
Pha Nguyen
Khoa Luu
Don Adjeroh
Gianfranco Doretto
Ngan Hoang Le
VOT
36
5
0
28 May 2023
Text-to-image Editing by Image Information Removal
Zhongping Zhang
Jian Zheng
Jacob Zhiyuan Fang
Bryan A. Plummer
DiffM
31
12
0
27 May 2023
Building One-class Detector for Anything: Open-vocabulary Zero-shot OOD Detection Using Text-image Models
Yunhao Ge
Jie Jessie Ren
Jiaping Zhao
Kaifeng Chen
Andrew Gallagher
Laurent Itti
Balaji Lakshminarayanan
VLM
ObjD
26
1
0
26 May 2023