ResearchTrend.AI

Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions

28 November 2023
Zeyu Han
Fangrui Zhu
Qianru Lao
Huaizu Jiang
    ObjD
arXiv: 2311.17048

Papers citing "Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions"

13 / 13 papers shown
Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding
Seil Kang
Jinyeong Kim
Junhyeok Kim
Seong Jae Hwang
VLM
96
2
0
08 Mar 2025
IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Primary Word Emphasis
Yishuo Wang
Jingchen Ni
Yong-Jin Liu
Chun Yuan
Yansong Tang
58
1
0
02 Mar 2025
Towards Visual Grounding: A Survey
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
67
4
0
31 Dec 2024
Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding
Zilin Du
Haoxin Li
Jianfei Yu
Boyang Li
227
0
0
01 Dec 2024
MC-Bench: A Benchmark for Multi-Context Visual Grounding in the Era of MLLMs
Yunqiu Xu
Linchao Zhu
Yi Yang
32
3
0
16 Oct 2024
Towards Flexible Visual Relationship Segmentation
Fangrui Zhu
Jianwei Yang
Huaizu Jiang
VOS
34
1
0
15 Aug 2024
Learning Visual Grounding from Generative Vision and Language Model
Shijie Wang
Dahun Kim
A. Taalimi
Chen Sun
Weicheng Kuo
ObjD
36
6
0
18 Jul 2024
The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report
Bin Ren
Yawei Li
Nancy Mehta
Radu Timofte
Hongyuan Yu
...
P. Yashaswini
Chaitra Desai
R. Tabib
Ujwala Patil
U. Mudenagudi
SupR
52
35
0
16 Apr 2024
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
Zeyu Han
Chao Gao
Jinyang Liu
Jeff Zhang
Sai Qian Zhang
150
319
0
21 Mar 2024
DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM
YiXuan Wu
Yizhou Wang
Shixiang Tang
Wenhao Wu
Tong He
Wanli Ouyang
Jian Wu
Philip Torr
ObjD
VLM
32
19
0
19 Mar 2024
CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models
Yuan Yao
Ao Zhang
Zhengyan Zhang
Zhiyuan Liu
Tat-Seng Chua
Maosong Sun
MLLM
VPVLM
VLM
211
221
0
24 Sep 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
67
255
0
14 Jul 2021
Improving Distantly-Supervised Relation Extraction through BERT-based Label & Instance Embeddings
Despina Christou
Grigorios Tsoumakas
29
39
0
01 Feb 2021